...
Overview
Module: Example
...
and is essentially the result of conflicts between compilers when using complex data types; the workaround is to disable the diagnostic error.
To resolve the issue, create and/or update the ~/.R/Makevars file by adding the following lines:
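The exact contents depend on which compiler diagnostic is being disabled; as a purely hypothetical sketch, a ~/.R/Makevars entry that appends a warning-suppression flag to the C++14 compiler flags might look like:
Code Block:
## Hypothetical flag only -- replace it with the diagnostic your compiler actually reports.
CXX14FLAGS += -Wno-ignored-attributes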
Teton: Using Multiple CPUs
...
Typically, using parallel::detectCores() to detect the number of available cores on a cluster node is a slight red herring: it returns the total number of cores on the node your job is allocated to, not the number of cores you actually requested. For example, if your sbatch script defines the following,
Code Block:
#SBATCH --nodes=1
#SBATCH --cpus-per-task=8
and you're allocated a standard Teton node, which has 32 cores, parallel::detectCores() will return 32 and not the 8 cores you requested! This will likely lead to unexpected results or failures when you run a function expecting 32 cores while only 8 are actually available.
To avoid this problem, use the value of the $SLURM_JOB_CPUS_PER_NODE Slurm environment variable and pass it into your R script as an argument.
Below is an example of how to do this:
Batch Script (fragments of what your script might look like):
Code Block:
#!/bin/bash
...
#SBATCH --nodes=1
#SBATCH --cpus-per-task=8
...
echo "SLURM_JOB_CPUS_PER_NODE:" $SLURM_JOB_CPUS_PER_NODE
...
module load swset/2018.05 gcc/7.3.0 r/3.6.1
...
# Pass the allocated CPU count into the R script as its first argument.
Rscript multiple_cpu_test.R $SLURM_JOB_CPUS_PER_NODE
...
R Script: multiple_cpu_test.R
Code Block:
# Read the CPU count passed in from the batch script.
args <- commandArgs(trailingOnly = TRUE)
if (!is.na(args[1])) {
  # Command-line arguments arrive as strings, so coerce to an integer.
  num_of_cores <- as.integer(args[1])
  print(paste0("Num of Cores: ", num_of_cores))
}
# detectCores() reports every core on the node, not just those allocated.
print(paste0("detectCores: ", parallel::detectCores()))
# Make the allocated core count the default for the parallel package.
options(mc.cores = num_of_cores)
print(paste0("mc.cores: ", getOption("mc.cores", 1L)))
Slurm Output:
Code Block:
SLURM_JOB_CPUS_PER_NODE: 8
...
[1] "Num of Cores: 8"
[1] "detectCores: 32"
[1] "mc.cores: 8" |