By default, TensorFlow uses all CPU cores. To make sure the number of cores it uses does not exceed the number of cores requested by a job on a cluster, one needs to set two parameters:
inter_op_parallelism_threads and intra_op_parallelism_threads.
What is the best way to determine the values for these parameters?
I set them as follows:
inter_op_parallelism_threads = 1
intra_op_parallelism_threads = n_cores - 1
and it works,
but I wonder if there are any recommendations on how to choose these values so that my code runs in the most efficient way.
Also, does the way I should set these parameters depend on the number of GPUs I use (none, one, more)?
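
For concreteness, here is a minimal sketch of how the values above could be derived and applied. It makes several assumptions that are not part of my actual setup: the scheduler is assumed to expose the requested core count in the SLURM_CPUS_PER_TASK environment variable (other schedulers use different variables), the helper names thread_settings, requested_cores and apply_to_tensorflow are made up for illustration, and the TensorFlow 2.x tf.config.threading functions are used instead of passing the two parameters through a session config. It also clamps intra_op_parallelism_threads to at least 1, which the plain n_cores - 1 rule does not do.

```python
import os


def thread_settings(n_cores):
    """Return (inter_op, intra_op) using the scheme from the question."""
    inter_op = 1
    # Guard added for this sketch: never go below one thread.
    intra_op = max(1, n_cores - 1)
    return inter_op, intra_op


def requested_cores(env=os.environ):
    """Best-effort guess of how many cores the job was given."""
    # Assumption: a SLURM-style variable; adjust for your scheduler.
    value = env.get("SLURM_CPUS_PER_TASK")
    if value is not None:
        return int(value)
    # Fall back to every core visible on the machine.
    return os.cpu_count() or 1


def apply_to_tensorflow(inter_op, intra_op):
    """Apply the thread limits; returns False if TensorFlow is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return False
    # These must be called before TensorFlow runs any operation.
    tf.config.threading.set_inter_op_parallelism_threads(inter_op)
    tf.config.threading.set_intra_op_parallelism_threads(intra_op)
    return True


if __name__ == "__main__":
    inter_op, intra_op = thread_settings(requested_cores())
    apply_to_tensorflow(inter_op, intra_op)
    print("inter_op:", inter_op, "intra_op:", intra_op)
```

Whether inter_op = 1 and intra_op = n_cores - 1 is actually the most efficient split is exactly what I am asking; the sketch only shows where the numbers would go.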