I think the question is very broad, perhaps too broad for a useful answer. Most of the ‘HPC’ clusters I know of are now doing far less traditional HPC and much more what would be considered ‘HTC’.
A scheduler/job manager that is good for tightly coupled MPI code that runs on thousands of nodes might not be such a great choice for a cluster that largely runs one-core jobs that last less than four hours.
So perhaps a better approach would be to describe what kinds of jobs your users are bringing to your systems and then ask what would be a good setup to meet their needs?
We use Slurm here, but it has its drawbacks. As Ben pointed out, it is MPI-focused, but many of its most notable features go largely unused and unneeded here. We’ve had some issues where, for example, its affinity plug-in may be causing more problems than it solves.
Most of our users do not want advanced features; they want simple operation. Those who want advanced features are more likely to try to get them from mpirun. If the configuration can be kept simple, Slurm can certainly start a lot of jobs fast.
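For what it’s worth, the typical job here looks something like the sketch below (a minimal example, not our actual configuration; the partition name, time limit, and program are placeholders):

    #!/bin/bash
    #SBATCH --job-name=single-core-task
    #SBATCH --nodes=1
    #SBATCH --ntasks=1
    #SBATCH --cpus-per-task=1
    #SBATCH --time=04:00:00        # most of these jobs finish within four hours
    #SBATCH --partition=general    # placeholder; whatever your site calls it
    #SBATCH --output=%x-%j.out

    # single-core payload; ./my_analysis stands in for the user's program
    ./my_analysis input.dat

Nothing about that needs Slurm’s MPI or affinity machinery; the scheduler just has to start a great many of them quickly.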
About half our jobs are one core on one node. More than 3/4 of jobs are single node, or should be. Only a few of our jobs require 64 cores or more. Many people are blindly asking for core counts that aren’t even multiples of a node’s core count (some want process counts that are powers of two, but most do not, I think).
We have people who use launcher from TACC as a kind of scheduler within a scheduler, though I think launcher may be a bit of an orphan now. There are two PRs open, one from a year ago and one from a half-year ago, that don’t seem to have been reviewed. It seems to Just Work [TM], and in many ways it is easier for people to get started using it than, say, GNU parallel.
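For those who haven’t seen it, the usual pattern is roughly this (a sketch, assuming launcher is installed and the site sets LAUNCHER_DIR; the command file and resource numbers are made up):

    #!/bin/bash
    #SBATCH --nodes=1
    #SBATCH --ntasks=32
    #SBATCH --time=02:00:00

    # launcher reads one command per line from the job file and keeps the
    # allocated cores busy until the list is exhausted
    module load launcher                    # or however your site provides it
    export LAUNCHER_WORKDIR=$PWD
    export LAUNCHER_JOB_FILE=commands.txt   # e.g. one "./my_analysis caseNNN" per line

    $LAUNCHER_DIR/paramrun

The GNU parallel equivalent would be something like parallel -j $SLURM_NTASKS < commands.txt, but the write-a-file-of-commands approach seems to be the gentler on-ramp for most users.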
Like Ben, I’ve worked with HTCondor, and I think that for high-throughput jobs, it might be a good choice. Many places are using HTCondor and giving it access to submit jobs to their Slurm clusters. My experience has been mostly with OSG, and many of our HPC users would be impatient with the lag between submitting a job to HTCondor and its start. That may be somewhat tunable in the configuration, but it’s really designed for cranking thousands of jobs through, not for providing immediate gratification from the first job to start.
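For comparison, a minimal HTCondor submit description looks roughly like this (names and counts are placeholders; the point is that you describe the job once and queue many instances of it):

    # sweep.sub -- hypothetical HTCondor submit description
    executable   = my_analysis
    arguments    = input_$(Process).dat
    output       = out.$(Cluster).$(Process)
    error        = err.$(Cluster).$(Process)
    log          = sweep.log
    request_cpus = 1
    queue 1000

You submit it with condor_submit sweep.sub, and HTCondor matches those thousand jobs to slots as they become available, which is where the start-up lag that annoys HPC users comes from, and also why it copes so well with the HTC workload.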
I don’t see HTCondor and Slurm as being mutually exclusive, either. They seem designed for different things, and it seems that some places are able to use both in different contexts (Wisconsin is a good example). There is additional overhead for maintaining and monitoring two systems, but it’s perhaps worth considering. Your engineers might prefer Slurm, whereas the sequencers might prefer HTCondor.
I would not want to be on a framing crew with a cooper’s hammer, and I wouldn’t want to be setting barrel staves with a framer’s hammer. Maybe evaluating the tool can only be done once it is clear to what purpose it will be put?