What is a segmentation fault (segfault) error?

I submitted a job to Cheaha and it made it through the queue. During processing it failed unexpectedly with a “segfault”, “segmentation fault” or “SIGSEGV” error. What does this mean, why does it happen, and what can be done about it?

A segmentation fault, or segfault, occurs when software attempts to access memory that it isn’t allowed to access, and it usually causes the software to terminate unexpectedly. The error message typically includes a hexadecimal memory address such as 0x000055ea4064c135 (the exact value will differ) along with one or more of the terms “segmentation fault”, “segfault”, or “SIGSEGV”.

These issues are most commonly caused by programming errors, but can also be related to out-of-memory errors. If you encounter a segmentation fault, first try increasing the memory requested for the job. It may help to learn more about SLURM at our SLURM docs page and job efficiency at our job efficiency docs page to better guide future job requests.
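To choose a memory request, it can help to measure how much memory the workload actually uses at its peak. The following is a minimal Python sketch, not an official Cheaha tool. It assumes a Linux node, where the peak value is reported in kilobytes. The function name peak_memory_mib and the sample allocation are hypothetical and for illustration only.

```python
import resource


def peak_memory_mib() -> float:
    """Return this process's peak resident memory in MiB (assumes Linux units)."""
    # Assumption: on Linux, ru_maxrss is reported in kilobytes (macOS reports bytes).
    peak_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return peak_kb / 1024


if __name__ == "__main__":
    # Hypothetical stand-in for a real workload: allocate about 50 MiB.
    data = bytearray(50 * 1024 * 1024)
    print(f"Peak memory so far: {peak_memory_mib():.1f} MiB")
```

Calling a function like this at the end of a Python workload gives a rough figure to compare against the memory requested for the job. Requesting somewhat more than the measured peak leaves headroom for variation between runs.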

If increasing memory does not help, then there may be a programming error in one of the software components used in your workflow. The most common programming causes of segfaults are out-of-bounds memory accesses, such as buffer overflows (reading or writing past the end of an array), and dereferencing null pointers or pointers to objects that no longer exist. When the operating system detects the illegal memory access, it terminates the software that attempted it. The only real fix is to correct the programming error, for which we cannot provide direct support. Please do reach out anyway, as we may be able to offer alternatives, workarounds, or advice that could point you in the right direction.
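To make the failure mode concrete, here is a minimal Python sketch that triggers a segfault on purpose in a child process and inspects how it ended. It assumes a Linux or other POSIX system. The names CRASHING_CODE and run_crashing_child are hypothetical, and the crash is produced by deliberately reading from address 0 through the standard ctypes module. The -X faulthandler option is a standard Python feature that prints a traceback when a fatal signal arrives, which may help locate where a crash happens in Python-based workflows.

```python
import signal
import subprocess
import sys

# Child program that deliberately reads from address 0 (a null pointer).
# This is an intentional illegal memory access, for illustration only.
CRASHING_CODE = "import ctypes; ctypes.string_at(0)"


def run_crashing_child():
    """Run the crashing code in a child process; return (returncode, stderr)."""
    result = subprocess.run(
        # -X faulthandler asks Python to dump a traceback on fatal signals.
        [sys.executable, "-X", "faulthandler", "-c", CRASHING_CODE],
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stderr


if __name__ == "__main__":
    returncode, stderr = run_crashing_child()
    # On POSIX, a negative return code means the child was killed by that signal.
    if returncode == -signal.SIGSEGV:
        print("Child was terminated by SIGSEGV")
    print(stderr)
```

Running the crash in a child process keeps the parent script alive so that it can report what happened. The same return-code check can be used to tell a segfault apart from an ordinary non-zero exit.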