In AWS, what are the classes of use cases where Batch is more efficient than a cluster build tool like CfnCluster?
In short:
AWS Batch is great for large numbers of jobs with a simple workflow pattern composed of autonomous jobs.
It is not so great for finer-grained control, such as administering user space, or for jobs that require more than one instance/node.
The longer take:
The primary advantage over something like CfnCluster, and over many container management tools like Mesosphere, is that you get scheduling (on Spot) headlessly, with no head node to run or manage. As is often the case, this is also the main disadvantage.
Batch is useful if you need to run a large number of (Docker-compatible) container jobs without interaction and want to queue them to Spot; Batch is exactly for that. This can cover many things, but two common uses are running large, one-time batch runs, and using another tool (often AWS Lambda) to start one or more jobs in response to a triggering event. (These are pretty much the example life science use cases at https://aws.amazon.com/batch/use-cases/ )
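For the large, one-time run case, a Batch array job is one way to fan a single submission out into many autonomous child jobs. The sketch below only builds the arguments you would pass to boto3's `batch.submit_job(**request)`; the queue name, job definition, and input list are hypothetical placeholders, and it assumes each child reads its position from the `AWS_BATCH_JOB_ARRAY_INDEX` environment variable that Batch sets for array children.

```python
import os


def array_job_request(job_name, queue, definition, n_tasks):
    """Build keyword arguments for an array job submission.

    The queue and definition names are supplied by the caller; none of
    them are real resources in this sketch.
    """
    # Batch array jobs accept a size between 2 and 10,000 children.
    if not 2 <= n_tasks <= 10000:
        raise ValueError("array size must be between 2 and 10000")
    return {
        "jobName": job_name,
        "jobQueue": queue,
        "jobDefinition": definition,
        "arrayProperties": {"size": n_tasks},
    }


def pick_input(inputs, environ=os.environ):
    """Inside the container: select this child's input by array index."""
    index = int(environ["AWS_BATCH_JOB_ARRAY_INDEX"])  # starts at 0
    return inputs[index]
```

Each child works on its own input with no communication between children, which is the kind of workload the answer describes as a good fit.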
You get queuing on AWS Spot for limited effort (a config, maybe a Lambda function, plus your Docker setup) without paying for a head node or unused uptime, and in some cases with less setup and scheduling effort.
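As a rough illustration of the event-triggered pattern, here is a minimal sketch of a Lambda handler that submits one Batch job per uploaded S3 object. It assumes an S3 "object created" notification as the trigger; `JOB_QUEUE`, `JOB_DEFINITION`, and the `INPUT_URI` environment variable are hypothetical names, not anything Batch defines. The optional `batch` argument exists only so the handler can be exercised without AWS credentials.

```python
import re
from urllib.parse import unquote_plus

JOB_QUEUE = "spot-queue"      # hypothetical queue backed by a Spot compute environment
JOB_DEFINITION = "aligner:1"  # hypothetical job definition (name:revision)


def build_request(event):
    """Translate an S3 notification into submit_job keyword arguments."""
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = unquote_plus(record["object"]["key"])  # keys arrive URL-encoded
    # Job names may only contain letters, numbers, hyphens, and underscores.
    safe = re.sub(r"[^A-Za-z0-9_-]", "-", key)[:120]
    return {
        "jobName": "job-" + safe,
        "jobQueue": JOB_QUEUE,
        "jobDefinition": JOB_DEFINITION,
        "containerOverrides": {
            "environment": [
                {"name": "INPUT_URI", "value": f"s3://{bucket}/{key}"}
            ]
        },
    }


def handler(event, context, batch=None):
    """Lambda entry point: queue the job and return its ID."""
    if batch is None:
        import boto3  # available in the Lambda Python runtime
        batch = boto3.client("batch")
    response = batch.submit_job(**build_request(event))
    return response["jobId"]
```

The container then reads `INPUT_URI` and does its work unattended; nothing stays running between events except the queue itself.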
The drawbacks that follow from this are all, in a way, a lack of control.
It isn’t very practical to separate admin and user space. AWS IAM policies are not a replacement for the groups, limits, permissions/ACLs, and other finer-grained controls that modern HPC schedulers provide.
Additionally, the autonomous nature of the supported tasks makes coordination between jobs a challenge, so the system may be ill-suited as a replacement for multi-node operations (such as message queues or MPI).