CPU binding: What are some appropriate uses?

Some colleagues were debating the merits of scheduler binding options, and we realized we lacked clarity as to when binding tasks to resources might be most beneficial. Below is a specific topic:

On most clusters, one can request a specific processor core binding (processor affinity). For example, with SLURM, one can pass the --cpu-bind option to the srun command to control task binding. On SGE-type clusters, the option is -binding <binding_strategy>, where the binding strategy can be linear, striding, or explicit. (Based on the documentation, this latter option allows one to request a specific core binding.)
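To see what a given binding option actually did to a task, one approach is to have each task report the CPUs it is allowed to run on. The following is a minimal sketch, assuming a Linux compute node and Python 3; the function name allowed_cpus is ours, not part of any scheduler, and SLURM_PROCID is only present when the script is launched through srun.

```python
import os


def allowed_cpus():
    """Return the sorted list of CPU ids this process may run on."""
    if hasattr(os, "sched_getaffinity"):        # Linux-only call
        return sorted(os.sched_getaffinity(0))  # 0 means "the caller"
    # Fallback for platforms without affinity support: assume all CPUs.
    return list(range(os.cpu_count() or 1))


if __name__ == "__main__":
    # srun sets SLURM_PROCID for each task; it is absent elsewhere.
    rank = os.environ.get("SLURM_PROCID", "n/a")
    print(f"task {rank}: allowed CPUs = {allowed_cpus()}")
```

Saved under a hypothetical name such as show_affinity.py, this could be launched with something like srun --ntasks=4 --cpu-bind=cores python show_affinity.py and again without the --cpu-bind option, to compare the CPU sets each task reports.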

When would implementing CPU binding benefit a job? Are there specific software packages or examples of scripts for which this option might improve a job's performance? If so, can anyone describe these scenarios? Any explanations of the mechanisms behind the increased performance would also be welcome!

Thank you!

Katia and Torey
BU/Mines

The only time I’ve seen binding matter was on hardware configurations that tightly coupled memory and CPU (e.g., a large SMP system). That particular system used a slower-than-memory-speed bus to interconnect the “blades” that made up the system, so if you created a task that needed more memory than was on a single blade, you would see a major speed improvement from splitting it into threads, each bound to a CPU and only referencing on-blade memory. Without binding, cross-blade memory references would slow your process to a snail’s pace. On that same system, doing I/O from an on-blade controller or network card made a huge difference as well.
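For anyone who wants to experiment with this kind of binding from inside an application rather than through the scheduler, here is a minimal sketch, assuming Linux and Python 3. The helper name run_pinned is hypothetical. It restricts the caller to a single CPU while a function runs and then restores the previous affinity. On Linux the default memory policy generally places newly touched pages on the NUMA node of the CPU doing the touching, so memory first used inside the pinned region tends to stay local; that is an assumption about the node's configuration, not something this code enforces.

```python
import os


def run_pinned(cpu, func, *args):
    """Run func(*args) with the caller restricted to a single CPU.

    With pid 0, Linux applies the affinity to the calling thread,
    which for a single-threaded program is the whole process.
    """
    previous = os.sched_getaffinity(0)   # remember the current CPU set
    os.sched_setaffinity(0, {cpu})       # bind to exactly one CPU
    try:
        return func(*args)
    finally:
        os.sched_setaffinity(0, previous)  # always restore the old set


if __name__ == "__main__":
    # Pick a CPU we are already allowed to use, so the call cannot fail
    # under a scheduler that has restricted this job's CPU set.
    target = min(os.sched_getaffinity(0))
    inside = run_pinned(target, lambda: sorted(os.sched_getaffinity(0)))
    print(f"inside pinned region: {inside}")
    print(f"after restore: {sorted(os.sched_getaffinity(0))}")
```

Timing the same memory-heavy function with and without run_pinned is one rough way to check whether a particular machine shows the effect described above.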

So, in short, my experience is you want binding when the hardware design makes it necessary for your application.

Cheers,
Ric