Your software/script uses threads and works locally, but crashes when run under the SGE queue

From IBERS Bioinformatics and HPC Wiki

By default, SGE sets the stack size limit to the same value as h_vmem. With a high value, threaded software crashes: each thread has its own stack, sized from that limit, so the threads together ask for more memory than the job is allowed to use.

The solution is to set the "h_stack" parameter. On the IBERS HPC nodes the value differs from node to node. Example:

#$ -l h_stack=10M
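
For context, here is a minimal sketch of a complete job script using the h_stack line above. The h_vmem value and the program name are assumptions chosen for illustration, not IBERS defaults; adjust them to your own job. The script prints the stack limit the job actually received, which helps confirm that the request was applied.

```shell
#!/bin/bash
# Sketch of an SGE job script. The resource values are example
# assumptions, not site defaults.
#$ -S /bin/bash
#$ -cwd
#$ -l h_vmem=4G
#$ -l h_stack=10M

# Report the stack limit seen inside the job
# (bash prints it in KiB, or "unlimited").
echo "stack limit: $(ulimit -s)"

# Replace the line below with your own threaded program
# (hypothetical name, left commented out in this sketch):
# ./my_threaded_program
```

Submitted with qsub, the job's output file should show a stack limit close to the value requested with h_stack rather than the h_vmem value.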

Linux uses a stack size of 8M by default, and 10M works for most programs (some programs, such as Matlab, Blast2GO, or programs that use Java, may require higher values). Try a few values for your job: start with 10M and increase the value until your program no longer crashes. A high number of threads may not allow a high value of h_stack; it depends on the amount of memory your job has available.
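
The trade-off between thread count and h_stack can be estimated with simple arithmetic. The sketch below assumes that every thread reserves one stack of h_stack size and that this reservation counts against h_vmem; the function names and example numbers are hypothetical, and real programs also need memory for everything else they do.

```shell
#!/bin/bash
# Rough estimate of the memory reserved for thread stacks.
# Assumption: each thread reserves one stack of h_stack size.

# Print the total stack reservation in MB for a thread count and stack size.
stack_total_mb() {
    local threads=$1 stack_mb=$2
    echo $(( threads * stack_mb ))
}

# Succeed (exit status 0) if the thread stacks fit below the memory limit.
stacks_fit() {
    local threads=$1 stack_mb=$2 vmem_mb=$3
    [ "$(stack_total_mb "$threads" "$stack_mb")" -lt "$vmem_mb" ]
}

# Example values (assumptions): 16 threads, h_stack=10M, h_vmem=4G (4096 MB).
echo "stacks reserve $(stack_total_mb 16 10) MB"
if stacks_fit 16 10 4096; then
    echo "fits"
else
    echo "too large"
fi
```

With the same 16 threads but a stack limit equal to a 4G h_vmem, the estimate is 16 x 4096 MB, far above the limit, which matches the crash described at the top of this page.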

More information:

http://www.softpanorama.org/HPC/Grid_engine/Troubleshooting/index.shtml

https://stackoverflow.com/questions/12911841/kernel-stack-and-user-space-stack