What is the best way to pull simulations/data off of a cluster?

I’ve been running lots of simulations recently with signac-flow, and I’ve now reached a point where I only care about looking at the ones that are doing really well (which I’m determining through some other code run on the cluster). What is the best way to grab the simulations that I want off of the cluster and put them on my local computer without bringing all of them? I’ve used rsync and Globus before, but both feel clunky and inefficient, as though I could be doing something better.

Hi Rachel,

In order to possibly jump-start replies to your question, could you provide a bit more information? When you grab simulations and put them on your local computer, are you taking the output from the simulation of interest run on the HPC (i.e. the data) and transferring it to your desktop, then using this data to perform some post-processing? Or what aspect of the simulations are you grabbing from the HPC?

Thanks!
Sincerely,
torey
(Ask.CI Monitor)

Globus is scriptable through the command line interface (CLI), so it should be easy for you to write a script with whatever logic you use to determine what is “interesting” and transfer only the files that meet your criteria. See for example the CLI Quickstart Guide at https://docs.globus.org/cli/quickstart/ or the full reference documentation at https://docs.globus.org/cli/reference/
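As a rough sketch of what such a script might look like, the Python below turns a list of selected job ids into lines for a Globus batch transfer file. Everything specific here is an assumption for illustration: the function name build_batch_lines, the example job ids, and both workspace paths are hypothetical, and the list of "interesting" ids is assumed to come from the selection code you already run on the cluster. It also assumes each job lives in its own directory named after its id, as in a signac workspace.

```python
import posixpath
import shlex


def build_batch_lines(job_ids, remote_workspace, local_workspace):
    """Return one Globus batch-file line per selected job directory.

    Each line has the form "--recursive <source dir> <destination dir>".
    """
    lines = []
    for job_id in job_ids:
        src = posixpath.join(remote_workspace, job_id)
        dst = posixpath.join(local_workspace, job_id)
        # Quote the paths so that spaces or special characters survive parsing.
        lines.append(f"--recursive {shlex.quote(src)} {shlex.quote(dst)}")
    return lines


if __name__ == "__main__":
    # Hypothetical job ids and paths; replace with your own selection output.
    interesting = ["0a1b2c", "3d4e5f"]
    for line in build_batch_lines(
        interesting,
        "/scratch/project/workspace",
        "/home/me/project/workspace",
    ):
        print(line)
```

Saving the printed lines to a file and passing that file to globus transfer with the --batch option, along with the source and destination endpoint ids, should then submit all of the selected directories as a single transfer. How --batch takes its input (a filename or standard input) has differed between CLI versions, so check the reference documentation for the version installed on your system.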

This of course raises the question of why your results are not all interesting, but I assume that has to do with your science workflow!


Just to add to what @alansill has explained above, I’m sharing the slides from a talk (not the exact ones, but from a similar talk) I attended where the speaker discussed autonomous data transfers via Globus. This feature still lacks a GUI (which is being developed), but you can talk to the speaker (Ryan Chard at Argonne National Lab) if you’re interested in using it!

Here are the slides: https://eresearchnz.figshare.com/articles/Globus_Automate_A_Distributed_Research_Automation_Platform/8066786/files/15108098.pdf