Tips for faster hyperparameter optimization?

Hi all,

I have a dataset of functional magnetic resonance imaging (fMRI) files on which I am trying to run multivariate classification, using MATLAB and LibSVM. In my experiment, participants attended to stimuli from two conditions, A and B. I would like to perform a searchlight analysis: I select a small sphere of adjacent voxels, imaged under conditions A and B, and submit them to a classifier that tries to determine which samples came from condition A and which from condition B. I then assign the resulting average accuracy to that sphere of voxels. This means that for each sphere of voxels in the brain (~70,000 spheres), I am training and testing a model, and I am doing this once per subject (9 brains).
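
For context, here is roughly what a single searchlight pass looks like on one brain. This is only a sketch: data, labels, and get_sphere_indices are stand-ins for my actual variables and helper function, and the split-half cross-validation is just to keep the example short. The svmtrain/svmpredict calls are LibSVM's MATLAB interface, which expects double matrices and a column vector of labels.

    % One searchlight pass over a single brain (sketch).
    % data   : nSamples x nVoxels double matrix
    % labels : nSamples x 1 column vector of +1/-1 (condition A / B)
    nVoxels  = size(data, 2);
    accuracy = zeros(nVoxels, 1);

    for v = 1:nVoxels
        idx = get_sphere_indices(v);       % hypothetical helper: voxels in this sphere
        X   = data(:, idx);

        % simple split-half cross-validation, just for illustration
        trainIdx = 1:2:size(X, 1);
        testIdx  = 2:2:size(X, 1);

        model = svmtrain(labels(trainIdx), X(trainIdx, :), '-t 0 -c 1 -q');
        [~, acc, ~] = svmpredict(labels(testIdx), X(testIdx, :), model, '-q');
        accuracy(v) = acc(1);              % percent correct assigned to this sphere
    end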

The problem comes in when optimizing my hyperparameters. I would like to keep the hyperparameters consistent across all 9 brains, so that I don't end up with a different model for each brain or for each sphere of voxels. The assumption I am making is that any sphere of voxels that consistently yields discriminable activity across subjects is a sphere that carries information.

My classifiers (Gaussian naive Bayes and a linear support vector machine) each have two hyperparameters. If I were to use a grid search examining 10 values of each parameter, I would have to run 100 models and evaluate each of them on every brain (N = 9). The code takes about 5 minutes per brain, so that works out to 100 models x 9 brains x 5 minutes = 4,500 minutes, or roughly 75 hours of computation per classifier.
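
In other words (the parameter ranges here are just placeholders to show the arithmetic):

    % Back-of-the-envelope cost of the full grid
    Cvals     = logspace(-3, 3, 10);   % 10 values of hyperparameter 1 (placeholder range)
    Gvals     = logspace(-3, 3, 10);   % 10 values of hyperparameter 2 (placeholder range)
    nBrains   = 9;
    minPerRun = 5;                     % ~5 minutes per brain per model

    nModels  = numel(Cvals) * numel(Gvals);           % 100
    totalHrs = nModels * nBrains * minPerRun / 60;    % 75 hours
    fprintf('%d models x %d brains -> about %.0f hours per classifier\n', ...
            nModels, nBrains, totalHrs);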

Has anyone done a similar computation, and if so, do you have any recommendations on how to do so more quickly?

Matthew

Hey Matthew,

Great question! Sorry for the slow turnaround. We’ve had some issues with COVID and been busy with the start of the semester.

So first, the actual computation needed to brute-force the parameter space and just do it isn't that bad. It should be fairly straightforward to implement it in a way that allows coarse-grained parallelization and just throw a bunch of compute nodes at it. You would run each of the 100 "models" for each brain as a single job and then batch them together with something like TACC launcher. Both Europa and Ganymede have TACC launcher set up.
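
As a rough sketch of what I mean, you could generate the launcher job file straight from MATLAB. Here run_one_model is a hypothetical wrapper that trains and tests a single model for one brain and one parameter pair, and "matlab -batch" assumes a reasonably recent MATLAB release:

    % Write one launcher task per (brain, parameter pair): 9 x 100 = 900 lines
    Cvals = logspace(-3, 3, 10);
    Gvals = logspace(-3, 3, 10);

    fid = fopen('jobfile.txt', 'w');
    for b = 1:9
        for c = Cvals
            for g = Gvals
                % each line of the job file is a standalone shell command
                fprintf(fid, 'matlab -batch "run_one_model(%d, %g, %g)"\n', b, c, g);
            end
        end
    end
    fclose(fid);

Each line of jobfile.txt is an independent shell command, which is what launcher expects; in your batch script you would point LAUNCHER_JOB_FILE at that file and call $LAUNCHER_DIR/paramrun (the launcher docs cover the exact SLURM boilerplate for our systems).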

If you do end up with a much larger parameter space in the future, you can look at Latin Hypercube Sampling and maybe drive it with Dakota. Dakota is a very powerful tool, but it does require some setup to get going. Basically, you "teach" Dakota what your input file looks like using a templating system, and then Dakota lets you try all sorts of neat sampling algorithms and generates the series of input files for you.
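
Just to illustrate the sampling idea itself (this isn't Dakota, only MATLAB's lhsdesign from the Statistics and Machine Learning Toolbox, with placeholder ranges):

    % Latin Hypercube Sampling of 2 hyperparameters, 25 points
    nSamples = 25;
    unitPts  = lhsdesign(nSamples, 2);           % nSamples x 2, values in [0, 1]
    Cvals    = 10 .^ (-3 + 6 * unitPts(:, 1));   % map onto [1e-3, 1e3]
    Gvals    = 10 .^ (-3 + 6 * unitPts(:, 2));
    params   = [Cvals, Gvals];                   % one row per model to evaluate

The appeal is that a modest number of well-spread samples can often cover a 2-D parameter space nearly as informatively as a full 10 x 10 grid, which is where the savings come from once the space gets bigger.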

Let us know if you need some help with launcher or if there’s anything else we can do to assist.

Best,
Chris