Why can't TensorFlow find a GPU on Cheaha?

I’m using TensorFlow on Cheaha, but my code isn’t using the GPU, I can’t locate the GPU, or I’m receiving an error about missing GPUs. Why is this happening, and what can I do about it?

When using TensorFlow on Cheaha, if you see an error like "Not creating XLA devices, tf_xla_enable_xla_devices not set" or are unable to find the GPU via the TensorFlow API, check the following.

  1. Ensure you are correctly requesting GPUs with the SLURM flag --gres=gpu:1. Change 1 to the appropriate number of GPUs needed.

  2. Ensure you are loading the appropriate CUDA toolkit, e.g. with module load cuda11.2/toolkit/11.2.2. You can check which modules are available by running module avail toolkit at the terminal. To check which toolkit version is required for your version of TensorFlow, see the toolkit requirements chart at https://www.tensorflow.org/install/source#gpu.
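
After working through both checks, a quick way to confirm the result from inside a job is a short Python script like the sketch below. It is only an illustration, not an official Cheaha tool: the function names slurm_gpu_allocation, tensorflow_gpus, and report are made up for this example, and it assumes Slurm exports CUDA_VISIBLE_DEVICES for jobs that request GPUs, which is typical but depends on cluster configuration. The TensorFlow call used, tf.config.list_physical_devices, is part of the TensorFlow 2 API.

```python
import os


def slurm_gpu_allocation():
    """Return the value of CUDA_VISIBLE_DEVICES, or None if it is unset.

    Assumption: Slurm sets this variable when the job requested GPUs
    with --gres=gpu:N. An unset value suggests no GPU was allocated.
    """
    return os.environ.get("CUDA_VISIBLE_DEVICES")


def tensorflow_gpus():
    """Return the list of GPUs TensorFlow can see.

    Returns None if TensorFlow is not installed in this environment.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return tf.config.list_physical_devices("GPU")


def report():
    """Print a short summary of what the job and TensorFlow can see."""
    allocation = slurm_gpu_allocation()
    gpus = tensorflow_gpus()

    if allocation is None:
        print("CUDA_VISIBLE_DEVICES is unset: check the --gres=gpu:N flag.")
    else:
        print(f"CUDA_VISIBLE_DEVICES = {allocation!r}")

    if gpus is None:
        print("TensorFlow is not installed in this environment.")
    elif not gpus:
        print("TensorFlow found no GPUs: check the loaded CUDA toolkit module.")
    else:
        print(f"TensorFlow found {len(gpus)} GPU(s): {gpus}")


if __name__ == "__main__":
    report()
```

If the environment variable is set but TensorFlow still reports no GPUs, a mismatch between the loaded CUDA toolkit module and the installed TensorFlow version is the more likely cause.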