Rstudio vs Rscript
The easiest way to think of R/Rscript and Rstudio on Monsoon is to compare them to the Linux command-line interface (CLI) and OnDemand. Rscript runs R programs non-interactively from the command line, while Rstudio supplies a graphical interface to R that lets you develop programs interactively and then run them under R.
Rstudio
Rstudio is an integrated environment for preparing R scripts that analyze data. Its downside is that it is not well suited to large jobs that need more than 12 hours to run if you are working from off campus. It is best used for small jobs and for preparing scripts that will later be run as larger jobs.
To run Rstudio, first log in to ondemand.hpc.nau.edu and select Interactive Apps, then choose Rstudio.
Once you have selected Rstudio, you will be presented with a form to fill out where you may select the amount of memory, the number of cores, and other settings for your Rstudio run.
Clicking the Launch button queues a job for the session. Depending on the resources that you request, the job could take a while to be scheduled. Once it is scheduled, you may connect to the Rstudio job by pressing the connect button.
Once you connect, you will be presented with the Rstudio GUI.
The main thing to remember is that your Rstudio session will only last for the time that you requested, and that time starts counting when the job starts, whether or not you are connected. When the cluster is busy, you could wait quite a while for resources to become available, and if they become available in the middle of the night, you may not be able to take advantage of them.
What if I have long jobs to run?
As noted previously, Rstudio is best suited for testing and for running small analyses. Once you have tested a working model, create a job script and run your analysis with the Rscript program.
For example, if your analysis requires 2 GB of RAM and 4 cores for 36 hours, the job script could look as follows:
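#!/bin/bash
#SBATCH --job-name=analysis1
#SBATCH --chdir=/scratch/abc123/analysis1
# Wall-clock limit of 36 hours
#SBATCH --time=36:00:00
# 4 cores on one node for a single R process
#SBATCH --cpus-per-task=4
#SBATCH --mem=2GB
# Slurm writes the job's output to this file
#SBATCH --output=/scratch/abc123/analysis1/analysis1.txt

# Load R, then run the script non-interactively
module load R/4.1.0
Rscript analysis1.R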
If your script is named analysis1.sh then you would submit the job with:
sbatch analysis1.sh
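Once the job is submitted, you will usually want to know its job ID so that you can check on it. The following is a minimal sketch, not an official Monsoon recipe: it assumes the script is named analysis1.sh as above, uses sbatch's --parsable option (which prints only the job ID, optionally followed by a semicolon and a cluster name), and introduces a hypothetical helper function, job_id_from_parsable, purely for illustration.

```shell
#!/bin/bash
# Hypothetical helper: strip an optional ";cluster" suffix from the
# output of `sbatch --parsable`, leaving only the numeric job ID.
job_id_from_parsable() {
    printf '%s\n' "${1%%;*}"
}

# Only talk to the scheduler when Slurm is available on this machine.
if command -v sbatch >/dev/null 2>&1; then
    reply=$(sbatch --parsable analysis1.sh)   # submit the job
    job_id=$(job_id_from_parsable "$reply")
    echo "Submitted job $job_id"
    squeue -j "$job_id"                       # show the job's state in the queue
fi
```

If you need to stop the job, scancel followed by the job ID will cancel it.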
The power of the cluster lies in its ability to schedule jobs to run whenever resources become available. This also means that your analyses can run without you having to watch over them.
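One detail worth keeping in mind when requesting several cores: R does not use extra cores on its own, so the R script has to be written to take advantage of them. The sketch below shows one possible way to hand the core count to R. It assumes the job requests its cores with --cpus-per-task (Slurm only sets the SLURM_CPUS_PER_TASK variable in that case), and the helper name cores_for_r is hypothetical. How analysis1.R uses the value, for example by reading it with commandArgs(trailingOnly = TRUE), is up to your script.

```shell
#!/bin/bash
# Hypothetical helper: report the number of cores Slurm granted to this
# task, falling back to 1 when the variable is unset (for example, when
# testing outside of a job).
cores_for_r() {
    printf '%s\n' "${SLURM_CPUS_PER_TASK:-1}"
}

# Inside the job script: pass the core count to R as a command-line
# argument. The guard simply skips this step when R or the script is
# not present.
if command -v Rscript >/dev/null 2>&1 && [ -f analysis1.R ]; then
    Rscript analysis1.R "$(cores_for_r)"
fi
```

Passing the count as an argument keeps the R script free of scheduler-specific details, so the same script can still be run by hand with a smaller number.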