Cloud native technologies and innovation

vsoch · January 18, 2022, 5:54pm

We’ve been thinking more about how to create hybrid environments and better bridge the gap between HPC, cloud, and local, and to support more modern workflows for researchers (e.g., think APIs, services, and automation).

I also know that the centers I’ve been at have embraced some of these container clusters (e.g., Kubernetes), but that is typically done alongside an existing HPC cluster, and it’s provided as a limited service (e.g., some subset of people can request a database but not much more than that). What are others' thoughts about development environments using these tools? Or, putting on our developer hats, what might the future look like? And which centers are trying to innovate, and how?

timothy.middelkoop · January 19, 2022, 3:47pm

To me, Kubernetes is the way to go. It is well supported by the open source community and has been adopted by the container and automation industry as a common integration platform, with most vendors offering an open-source product plus enterprise upgrades and support. It is also well supported across the major public cloud providers. There is even a national Kubernetes cluster for teaching and research (the Nautilus cluster on the Pacific Research Platform). The other part of my vision is a good container-building toolset, such as GitLab or GitHub, to build, host, and deploy containers (which can be deployed on a Kubernetes cluster), along with training and awareness around how to use it effectively. Once a container has been built and the Kubernetes deployment files written (similar to HPC batch files), it can be deployed fairly easily across many different Kubernetes systems, from a laptop to university and national (e.g., XSEDE) resources to the public commercial cloud. Other solutions are either proprietary (public commercial cloud) or not well adopted in the community, and all take a lot of time and effort to understand from a researcher/teaching perspective, a support/facilitation perspective, and a sysadmin management perspective.
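To make the comparison with HPC batch files concrete, here is a minimal sketch of what a "deployment file" for a one-off batch job could look like. It builds a Kubernetes `batch/v1` Job manifest as a Python dict and prints it as JSON. The job name, image path, command, and resource sizes are all hypothetical placeholders, and the helper `make_job_manifest` is my own illustration, not part of any Kubernetes library.

```python
import json


def make_job_manifest(name, image, command, cpus="1", memory="1Gi"):
    """Build a Kubernetes batch/v1 Job manifest as a plain dict.

    Roughly analogous to a batch script: which container to run,
    which command, and how much CPU and memory to ask for.
    """
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            # Do not retry the pod if it fails (similar to a single batch run).
            "backoffLimit": 0,
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "command": command,
                            "resources": {
                                "requests": {"cpu": cpus, "memory": memory},
                                "limits": {"cpu": cpus, "memory": memory},
                            },
                        }
                    ],
                }
            },
        },
    }


# Hypothetical values: replace with your own image and command.
manifest = make_job_manifest(
    name="example-analysis",
    image="registry.example.org/lab/analysis:latest",
    command=["python", "run_analysis.py"],
)
manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```

If you save the output to a file such as `job.json`, it should be possible to submit it with `kubectl apply -f job.json`, since `kubectl` accepts JSON as well as YAML. In practice most people write these files directly in YAML; I used Python here only so the structure is easy to generate and check.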

Although Kubernetes has a fairly steep learning curve (partially because it uses different terminology), the on-ramp is fairly approachable: start with Docker, Singularity/Apptainer, or another container system, and grow as your workflow requires (first just containerize your workflow, then expand from there).

I think the benefits of Kubernetes are 1) it is primarily open source, 2) it has a huge community, 3) it is by far the industry leader in container systems.

As for what I think it should look like? I think we need a lot of training and support; full-featured environments (not just a simple portal or locked-down systems) for researchers and educators to use, explore, and innovate with; and a catalog of research computing and data applications that most researchers can easily deploy and use (for example, workflow managers).

If you are interested in learning more, there are a lot of resources out there. Try one of the Kubernetes-on-a-laptop distributions such as minikube (https://minikube.sigs.k8s.io/), K3s (https://k3s.io/), MicroK8s (https://microk8s.io/), or kind (https://kind.sigs.k8s.io/). For even more fun, drop it on a Raspberry Pi if you have one.

To learn more about Kubernetes, check out the CNCF landscape (https://landscape.cncf.io/) and its guide, and join the CNCF Research End User Group.