How to set intra_* and inter_* parallelism_threads parameters in TensorFlow

By default, TensorFlow uses all CPU cores. To make sure the number of cores it uses does not exceed the number of cores requested by a job on a cluster, one needs to set two parameters:
inter_op_parallelism_threads and intra_op_parallelism_threads.

What is the best way to determine the values for these parameters?

I currently set them as follows:
inter_op_parallelism_threads = 1
intra_op_parallelism_threads = n_cores - 1
This works, but I wonder whether there are recommendations on how to choose these values so that my code runs as efficiently as possible.
Also, does the way I should set these parameters depend on the number of GPUs I use (none, one, more)?
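
A minimal sketch of the setting described above, assuming TensorFlow 2.x and its tf.config.threading setters (TensorFlow 1.x passes the same two values through tf.ConfigProto instead). The helper names allowed_cores, thread_settings and configure_tensorflow are hypothetical, and reading SLURM_CPUS_PER_TASK assumes a SLURM cluster; other schedulers export different variables.

```python
import os


def allowed_cores():
    """Return how many CPU cores this process may use."""
    # Assumption: a SLURM cluster that exports SLURM_CPUS_PER_TASK.
    slurm = os.environ.get("SLURM_CPUS_PER_TASK", "")
    if slurm.isdigit() and int(slurm) > 0:
        return int(slurm)
    # On Linux, the affinity mask reflects cores actually granted to the job.
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    # Fallback: all cores on the machine (may exceed what the job requested).
    return os.cpu_count() or 1


def thread_settings(n_cores):
    """Return (inter_op, intra_op) using the rule from the post: 1 and n_cores - 1."""
    return 1, max(1, n_cores - 1)


def configure_tensorflow(n_cores):
    """Apply the thread settings to TensorFlow 2.x and return them."""
    import tensorflow as tf  # imported lazily so the helpers above work without it

    inter_op, intra_op = thread_settings(n_cores)
    tf.config.threading.set_inter_op_parallelism_threads(inter_op)
    tf.config.threading.set_intra_op_parallelism_threads(intra_op)
    return inter_op, intra_op
```

configure_tensorflow(allowed_cores()) would need to be called at program start, before TensorFlow runs any op, since the thread pools are, as far as I know, fixed once the runtime is initialized.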

I want to add a related follow-up question. In my limited experience, I was not sure whether the cores (in CPU-only TensorFlow) were being used efficiently. How can we tell whether the values we chose result in optimal CPU utilization (i.e. keeping the cores busy so that the job completes in the shortest time)?
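
One rough way to check this, sketched here with the Python standard library only: compare the CPU time the process consumed (summed over all its threads) with the wall-clock time. The ratio approximates the average number of busy cores. The names measure_utilization and dummy_workload are hypothetical, and dummy_workload is just a stand-in for a real training or inference step.

```python
import time


def measure_utilization(workload, n_cores):
    """Run workload() once and estimate how many cores it kept busy."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # CPU time of the whole process, all threads
    workload()
    cpu_s = time.process_time() - cpu_start
    wall_s = time.perf_counter() - wall_start

    busy_cores = cpu_s / wall_s if wall_s > 0 else 0.0
    return {
        "wall_s": wall_s,
        "cpu_s": cpu_s,
        "busy_cores": busy_cores,
        # Fraction of the allotted cores that were busy on average.
        "utilization": busy_cores / n_cores,
    }


def dummy_workload():
    # Hypothetical placeholder; replace with a few real TensorFlow steps.
    sum(i * i for i in range(200_000))


if __name__ == "__main__":
    print(measure_utilization(dummy_workload, n_cores=4))
```

High utilization alone does not guarantee the shortest run, since threads can also burn CPU time while contending with each other. It is probably safest to repeat the measurement for a few thread settings and compare wall_s directly.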
