Parallel R: Introduction
Goal: Introduce some high-level aspects of using R in parallel on the cluster.
Just as this is not a course on learning the R language, this is not a section on developing parallelized code with any of the tens of parallel-related packages.
Instead, it details some aspects to consider when using our cluster.
Parallel Programming with R
There are tens of potential packages that could be used. As a starting point, we’d direct you to the CRAN Task View: High-Performance and Parallel Computing with R.
One thing to consider when choosing a package to explore is whether it provides multi-node functionality (such as Rmpi), only multicore functionality on a single compute node (such as parallel), and/or cluster features.
Remember: Just asking for multiple nodes (and GPUs) won’t make your code run faster unless the underlying package can actually utilize them.
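To illustrate the point about resources not being used automatically, here is a minimal sketch comparing a serial lapply() call with parallel::mclapply(). The function slow_square, the input vector, and the worker count of 2 are hypothetical and chosen only for illustration. It assumes a Linux node, since mclapply() relies on forking and does not run in parallel on Windows.

```r
library(parallel)

# Hypothetical workload: sleeps briefly to stand in for real computation.
slow_square <- function(x) {
  Sys.sleep(0.2)
  x^2
}

inputs <- 1:4
n_workers <- 2L  # assumption: the job was allocated at least 2 cores

# Serial: runs on one core no matter how many cores were requested.
serial_time <- system.time(
  serial <- lapply(inputs, slow_square)
)["elapsed"]

# Parallel: the work is spread over forked workers only because we ask for it.
forked_time <- system.time(
  forked <- mclapply(inputs, slow_square, mc.cores = n_workers)
)["elapsed"]

# Both approaches return the same results; only the elapsed time differs.
print(identical(serial, forked))
print(c(serial = unname(serial_time), forked = unname(forked_time)))
```

The results are the same either way. Only the call that explicitly requests workers can benefit from the extra cores in the allocation.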
R parallel
Package: Overview
Building Rmpi from Source
Multicore: Detecting Cores
Detect Cores Example
# Create an interactive session that uses 8 cores:
[]$ salloc -A arcc -t 10:00 -c 8
salloc: Granted job allocation 861904
salloc: Nodes mbcpu-001 are ready for job
[@mbcpu-001 ~]$ module load gcc/13.2.0 r/4.4.0
# Check the Slurm environment variable SLURM_JOB_CPUS_PER_NODE:
[@mbcpu-001 ~]$ echo $SLURM_JOB_CPUS_PER_NODE
8
# What does R detect?
[@mbcpu-001 ~]$ Rscript r_multicore.R $SLURM_JOB_CPUS_PER_NODE
[1] "Num of Cores: 8"
[1] "detectCores: 96"
[1] "mc.cores: 8"
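The contents of r_multicore.R are not shown above. The following is a hypothetical sketch of a script that would print output in the same shape, assuming the core count is passed as the first command-line argument. The helper parse_cores is an invented name, not part of the original script.

```r
# Hypothetical sketch of r_multicore.R; the original script is not shown.
library(parallel)

# Parse the requested core count from the first command-line argument,
# falling back to 1 when it is missing or not a positive integer.
parse_cores <- function(args) {
  if (length(args) < 1) return(1L)
  n <- suppressWarnings(as.integer(args[1]))
  if (is.na(n) || n < 1L) 1L else n
}

num_cores <- parse_cores(commandArgs(trailingOnly = TRUE))
print(paste0("Num of Cores: ", num_cores))

# detectCores() reports the cores on the whole node, not the Slurm allocation.
print(paste0("detectCores: ", detectCores()))

# Make the allocated core count the default for mclapply() and related calls.
options(mc.cores = num_cores)
print(paste0("mc.cores: ", getOption("mc.cores")))
```

In the session above, detectCores() reports 96 while the job was allocated only 8 cores. This is why the sketch takes the core count from the Slurm allocation rather than from detectCores(): using the latter would oversubscribe the 8 allocated cores.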