I’m using TensorFlow on Cheaha, and my code isn’t using the GPU, I can’t locate the GPU, or I’m receiving an error about missing GPUs. Why is this occurring and what can I do about it?
When using TensorFlow on Cheaha, if you see an error like `Not creating XLA devices, tf_xla_enable_xla_devices not set`, or are unable to find the GPU via the TensorFlow API, first confirm what TensorFlow can actually see.
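A quick way to do this (assuming `python` points at the environment where TensorFlow is installed) is a one-line check from a terminal session on the GPU node:

```bash
# Run inside your SLURM job or interactive session on a GPU node.
# An empty list ([]) means TensorFlow cannot see any GPU.
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
```

If the list is empty, check the following.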
- Ensure you are correctly requesting GPUs with the SLURM flag `--gres=gpu:1`. Change `1` to the appropriate number of GPUs needed. A sample job script is sketched after this list.
- Ensure you are loading the appropriate CUDA toolkit, e.g. `module load cuda11.2/toolkit/11.2.2`. You can check which modules are available using `module avail toolkit` at the terminal. Be sure you are loading the correct module for your version of TensorFlow; see the toolkit requirements chart at https://www.tensorflow.org/install/source#gpu.