I am running some software (xgboost), but not seeing any performance improvement by using multithreading. Can you suggest some ways for me to improve performance?
As a first step, please make sure you are requesting the compute node with the correct options. For multithreaded software such as xgboost, the threads run inside a single task, so the number of CPUs per task (-c) should equal the number of cores you expect each task to use. For example, here is an srun command (sbatch accepts similar options):
srun -p short -N 1 -n 2 -c 16 --pty --export=ALL --mem=1Gb --time=01:00:00 /bin/bash
This requests 1 node with 32 CPUs in total, allocated as 2 tasks with 16 CPUs per task.
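Requesting the CPUs is only half of the picture: the software also has to be told how many threads to start. The following is a minimal sketch in Python, assuming the job was launched with -c so that Slurm sets the SLURM_CPUS_PER_TASK environment variable. The helper name allocated_threads is hypothetical, and the params dictionary only illustrates where the value would go (nthread is the thread-count parameter in xgboost's native API; the scikit-learn wrapper calls it n_jobs). xgboost itself is not imported here.

```python
import os


def allocated_threads(default=1):
    """Return the CPUs per task granted by Slurm, or `default` if unknown.

    Assumption: the job was started with -c / --cpus-per-task, which is
    when Slurm exports SLURM_CPUS_PER_TASK to the job environment.
    """
    value = os.environ.get("SLURM_CPUS_PER_TASK", "")
    if value.isdigit() and int(value) > 0:
        return int(value)
    return default


n_threads = allocated_threads()

# Illustrative only: these parameters would be passed to xgboost,
# e.g. as the params argument of xgboost.train (not called here).
params = {"nthread": n_threads, "tree_method": "hist"}
print(params)
```

If the reported thread count does not match what you requested with -c, the job was probably started without that option, and the software may be running on a single core.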