On our cluster, we have various ML programs installed for our users that require conda. When a user wants to use one of the programs, they must initialize conda, which puts that code in their .bashrc. However, if they want to run a different ML program that uses conda, they need to edit their .bashrc and remove the other conda initialization first. I’m wondering if there is a way around this?
Hi Keri,
I typically use multiple conda environments to manage different ML programs. Then all I need to do is put conda activate <environment> in my Slurm scripts to use a particular set of Python packages.
https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#
If you need to have multiple independent Anaconda installs, it looks like there are some workarounds available. One option that might work is something like the approach described in “Install packages in a conda environment on IU's high performance computers”.
You might also be able to specify the full path to the Anaconda Python version you want to use (e.g. ~/anaconda3/envs/<env>/bin/python) and then also manually specify a couple of environment variables. I don’t recall offhand which environment variables need to be set, though.
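The full-path idea can be sketched as a small helper. This is only an illustration: the function name env_python, the environment name, and the script name are hypothetical, and it assumes the usual <install>/envs/<env>/bin/python layout. It does not run any activation scripts or set the environment variables mentioned above, so packages that depend on those may still misbehave.

```shell
# env_python ROOT ENV: print the path to the Python interpreter of environment
# ENV inside the conda install at ROOT, or fail if it is not executable.
env_python() {
    local py="$1/envs/$2/bin/python"
    if [ ! -x "$py" ]; then
        echo "env_python: $py not found or not executable" >&2
        return 1
    fi
    printf '%s\n' "$py"
}

# Example (hypothetical environment and script names):
#   "$(env_python "$HOME/anaconda3" environA)" my_script.py
```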
When you install conda you can disable the auto activation of the base environment in your .bashrc if that helps. After install you can use:
conda config --set auto_activate_base false
Then run conda activate manually each time, as David says. There shouldn’t be a need to use .bashrc at all unless you want the same conda environment every time you log in.
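If the conda initialization block is kept out of .bashrc entirely, one way to choose an installation per session is to source that installation's shell hook by hand. A minimal sketch, assuming each install ships the usual etc/profile.d/conda.sh hook; the function name load_conda and the example path are hypothetical:

```shell
# load_conda ROOT: make the `conda` command from the install at ROOT
# available in the current shell, without touching .bashrc.
load_conda() {
    local root="$1"
    local hook="$root/etc/profile.d/conda.sh"
    if [ ! -f "$hook" ]; then
        echo "load_conda: no conda hook found at $hook" >&2
        return 1
    fi
    # Sourcing the hook defines the `conda` shell function for this session only.
    . "$hook"
}

# Example (hypothetical path):
#   load_conda "$HOME/anaconda3" && conda activate environA
```

Because nothing is written to .bashrc, a new login starts clean and a different install can be loaded the same way.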
Thanks,
Mark
@david.matthews.1’s solution of separate Conda environments should work: it has worked for me, as well.
$ conda activate environA
$ run_prog_A.py
$ conda deactivate
$ conda activate environB
$ run_prog_B.py
$ conda deactivate
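The activate/run/deactivate pattern above can be wrapped in a small shell function so each program always runs in its own environment. This is a sketch only: run_in_env is a hypothetical name, and it assumes the conda shell function is already available in the current shell.

```shell
# run_in_env ENV CMD [ARGS...]: activate ENV, run the command, then deactivate,
# returning the command's exit status.
# Assumes the `conda` shell function is already available in this shell.
run_in_env() {
    local env_name="$1"
    shift
    conda activate "$env_name" || return 1
    "$@"
    local rc=$?
    conda deactivate
    return "$rc"
}

# Example, using the names from the session above:
#   run_in_env environA run_prog_A.py
#   run_in_env environB run_prog_B.py
```

Recent conda versions also offer conda run -n <environment> <command>, which serves a similar purpose without a wrapper.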
If you don’t like Conda, you can use separate pipenvs. However, I find that pipenvs are better for dev than for end-user use. YMMV. See the pipenv documentation (“Pipenv: Python Dev Workflow for Humans”, pipenv 2021.5.29).
A more “forceful” method would be to build separate Singularity containers for each program. The advantage there is that you can also switch Linux distros if at all necessary.
As others have said, use of .bashrc to set up a particular environment by default is not only non-ideal from the point of view that you mentioned, but can slow down logins substantially for accounts that have large or complex environments.
Here is advice that we have given people who run into this problem. Thanks to Misha Ahmadian of our Texas Tech HPCC staff for developing this material. Others have pointed to the useful tool at https://github.com/amaji/conda-env-mod that they have used with good success for this purpose.
Hope this helps.
Alan
———————————
Dear (account holder),
A very large Anaconda base environment appears to be set up in your account, along with a .bashrc file that sets up this environment on every login. Ideally, you would set up a minimal base environment. Even better, do not set it up by default in your .bashrc file; instead, source the conda setup shell script only when you actually want to use conda. Also arrange your conda environments so that they are separated by function: activate only the environment you need for an operation or a set of closely related operations, and deactivate it when done before moving on to other steps in your workflow. Doing so will help to avoid the very slow setup that conda suffers from in large, complex environments.
If this is not possible for you, here is a suggestion from one of our staff members that can improve the login experience on HPCC login nodes for those who have the conda init script in their .bashrc file by default and frequently use conda environments for Python package management. To initialize conda upon login without automatically activating the “base” environment, follow the instructions below:
- After logging in to one of the HPCC login gateways, please deactivate the conda base env:
(base)$ conda deactivate
- Then disable automatic activation of the base environment (the “auto_activate_base” setting) in the conda configuration:
$ conda config --set auto_activate_base False
- You can also check the conda configurations to make sure the new setup is in place:
$ conda config --show | grep auto_activate_base
- Log out, and log back into your account. You’ll see that the “conda” command is still available in your session, but the (base) env is no longer activated automatically. You will need to activate “base” or any other conda environment yourself whenever you need it:
$ conda activate
For example:
$ conda activate base
(or better yet, a named environment suited for a particular set of commands, not a huge one with everything in it).
This approach is more efficient and allows you to activate conda environments on demand. Please note that with this setup in place, you’ll need to add the “conda activate” command to your job submission scripts in order to get your batch jobs to use the correct environment.
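As an illustration of that last point, here is a sketch of a Slurm job script that activates an environment itself. The job name, time limit, CONDA_ROOT location, environment name, and program name are all placeholders rather than site settings; the script also assumes the conda install provides the usual etc/profile.d/conda.sh hook.

```shell
# Write a minimal Slurm job script to job.sh (submit it with: sbatch job.sh).
cat > job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=prog_A
#SBATCH --ntasks=1
#SBATCH --time=01:00:00

# Batch shells may not read .bashrc, so load the conda shell hook explicitly.
# CONDA_ROOT is an assumed install location; adjust it for your site.
CONDA_ROOT="${CONDA_ROOT:-$HOME/anaconda3}"
. "$CONDA_ROOT/etc/profile.d/conda.sh"

# Activate only the environment this job needs, then clean up.
conda activate environA
run_prog_A.py
conda deactivate
EOF
```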