Greetings - I would like to create a Singularity container that includes Intel Parallel Studio, but I'm having problems. I can start from a Docker container that has the Intel compiler and Intel MPI inside. I also included a hello_world_mpi application, compiled inside the container. The Docker container works fine:
$ sudo docker run --rm -it jedi-intel19-impi-hello:latest
root@1dfdbccc1110:/# mpirun -np 4 hello_world_mpi
Hello from rank 1 of 4 running on 1dfdbccc1110.
Hello from rank 2 of 4 running on 1dfdbccc1110
Hello from rank 0 of 4 running on 1dfdbccc1110
Hello from rank 3 of 4 running on 1dfdbccc1110
Then I built a Singularity container from the Docker container:
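The build step was roughly the following; this is a sketch assuming Singularity 3.x and the local Docker image name shown above, and the output filename is just an example:

$ sudo singularity build jedi-intel19-impi-hello.sif docker-daemon://jedi-intel19-impi-hello:latest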
When you do anything with Singularity and MPI, you need to have the exact same version of MPI installed on the host as you do in the container, because the environment is seamless from host to container. This is different from Docker, which is completely isolated. See the Singularity documentation on MPI for more details.
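For cross-node runs the typical pattern is to launch from the host, roughly like this (the image name and binary path here are just placeholders):

$ mpirun -np 4 singularity exec hello.sif /path/to/hello_world_mpi

Since the host's mpirun is doing the launching in that case, the host and container MPI versions have to match.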
Thanks @vsoch for your response. I am aware of these strategies, but I believe they are primarily concerned with running across multiple nodes. That is indeed what I tried initially - see my related post on the Sylabs GitHub site. But then I realized that MPI inside the container did not even work, which strikes me as a bigger problem.
We regularly generate GNU/OpenMPI and Clang/MPICH Singularity containers that users can run on a single node and that are entirely independent of any MPI on the host. In fact, they are often run on laptops or virtual machines that do not have any MPI implementation on the host at all. These containers work fine if you run mpirun inside the container, as I demonstrated in my post. Furthermore, we have GNU/OpenMPI, Clang/MPICH, Intel 17, and Intel 19 Charliecloud containers for which this approach also works. The only place where that hello world program fails is in the Intel Singularity containers.
Understood, thank you for that clarification. So if you aren't concerned about scaling this, then what you should try first is to reproduce the docker run with Singularity. To do that, you need to completely isolate the container from the host. For example, in addition to -e you should use --containall to prevent binds from the host:
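Something along these lines should be a fair test of the isolated case (the image name is just an example):

$ singularity exec -e --containall jedi-intel19-impi-hello.sif mpirun -np 4 hello_world_mpi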
I’m not sure why you had the “mpirun” at the end of the shell command? I have the container recipe you provided on GitHub, but it’s not reproducible because I don’t have the files on my host. Are you able to provide these so that I can see if I can reproduce tomorrow? Could you also share the Docker container somewhere for me to pull as well, and (even better) link to the Dockerfile so I can confirm they are “the same”?
Thanks again @vsoch for the suggestions. That mpirun at the end of the shell command was a typo. I tried your suggestion of --containall and a few other things (see the GitHub issue for details), but I am still seeing the problem. I cannot share the files or the containers because they contain proprietary Intel software. However, I can try to do a multi-stage build that leaves out the proprietary components. I’ll try to do that today.
@vsoch - Well, it turns out that the effort to build the multi-stage container solved the problem! I did a multi-stage Docker build that only includes the Intel MPI runtime libraries in the second stage. Then I created a Singularity image from that. Here is the Dockerfile:
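This is a sketch of the structure rather than the exact file; the base image names, the source file name, and the /opt/intel paths are illustrative and depend on the Parallel Studio version and install prefix:

# Stage 1: development image with the full Intel Parallel Studio install
# (image name is illustrative)
FROM jedi-intel19-impi-dev:latest AS build

# Compile the test program with the Intel MPI compiler wrapper
COPY hello_world_mpi.c /root/
RUN mpiicc -o /root/hello_world_mpi /root/hello_world_mpi.c

# Stage 2: runtime-only image; copy in just the Intel MPI runtime and the binary
FROM ubuntu:18.04

# Source paths below are illustrative; they vary with the Parallel Studio version
COPY --from=build /opt/intel/compilers_and_libraries/linux/mpi /opt/intel/compilers_and_libraries/linux/mpi
COPY --from=build /root/hello_world_mpi /usr/local/bin/hello_world_mpi

# Make mpirun and the MPI shared libraries visible at runtime
ENV I_MPI_ROOT=/opt/intel/compilers_and_libraries/linux/mpi
ENV PATH=/opt/intel/compilers_and_libraries/linux/mpi/intel64/bin:$PATH
ENV LD_LIBRARY_PATH=/opt/intel/compilers_and_libraries/linux/mpi/intel64/lib:/opt/intel/compilers_and_libraries/linux/mpi/intel64/lib/release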
Woot! It’s so great when it works out like that! If you ever run into similar issues again and want another party to test with, ping me directly on GitHub (also @vsoch) and provide the Singularity recipe, and I can at least try to reproduce your error and play around. Happy MPI-ing!