Overview
R is a programming language and free software environment for statistical computing and graphics, supported by the R Foundation for Statistical Computing. It is widely used among statisticians and data miners for developing statistical software and performing data analysis. It compiles and runs on a wide variety of UNIX platforms, Windows and macOS.
Using
Use the module name r to discover the available versions and to load the application.
Multicore
Typically, using parallel::detectCores() to detect the number of available cores on a cluster node is a slight red herring. It returns the total number of cores on the node your job is allocated, not the number of cores you actually requested. For example, if your sbatch script defines the following,
#SBATCH --nodes=1
#SBATCH --cpus-per-task=8
and you are allocated a standard Teton node that has 32 cores, parallel::detectCores() will return a value of 32 and not the 8 you requested!
This will probably lead to unexpected results or failures when you try to run a function that expects 32 cores when only 8 are actually available.
To avoid this problem, pass the value of the $SLURM_JOB_CPUS_PER_NODE Slurm environment variable into your R script and use it there.
Example
Batch Script: (fragments of what your script might look like):
#!/bin/bash
...
#SBATCH --nodes=1
#SBATCH --cpus-per-task=8
...
echo "SLURM_JOB_CPUS_PER_NODE:" $SLURM_JOB_CPUS_PER_NODE
...
module load swset/2018.05 gcc/7.3.0 r/3.6.1
...
Rscript multiple_cpu_test.R $SLURM_JOB_CPUS_PER_NODE
...
R Script: multiple_cpu_test.R
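args <- commandArgs(trailingOnly = TRUE)
# Default to a single core so the script still runs when no argument is supplied.
num_of_cores <- 1L
if (!is.na(args[1])) {
  # Command-line arguments arrive as character strings; mc.cores expects an integer.
  num_of_cores <- as.integer(args[1])
  print(paste0("Num of Cores: ", num_of_cores))
}
print(paste0("detectCores: ", parallel::detectCores()))
options(mc.cores = num_of_cores)
print(paste0("mc.cores: ", getOption("mc.cores", 1L)))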
Slurm Output:
SLURM_JOB_CPUS_PER_NODE: 8
...
[1] "Num of Cores: 8"
[1] "detectCores: 32"
[1] "mc.cores: 8"
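The batch script above passes $SLURM_JOB_CPUS_PER_NODE straight through to Rscript. For the single-node request shown this is a plain number, but Slurm documents the variable's general format as CPU_count[(xnumber_of_nodes)][,...], so on multi-node jobs it can look like 8(x2). The following is a minimal sketch, not part of the original example, of how a batch script might reduce the value to a plain integer first. The helper name cores_from_slurm and the fallback value of 1 are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not from the original page): reduce a
# SLURM_JOB_CPUS_PER_NODE-style value to a plain core count.
cores_from_slurm() {
    local raw="${1:-}"
    # Drop everything from the first non-digit character onwards,
    # e.g. "8(x2),4" becomes "8" and "8" stays "8".
    local count="${raw%%[!0-9]*}"
    # Fall back to 1 when the value is empty or does not start with a digit,
    # for example when the script is run outside of a Slurm job.
    echo "${count:-1}"
}

num_cores="$(cores_from_slurm "${SLURM_JOB_CPUS_PER_NODE:-}")"
echo "Cores to pass to Rscript: ${num_cores}"

# In the batch script the call would then become:
# Rscript multiple_cpu_test.R "${num_cores}"
```

The fallback means the same script can be tried on a login node without failing, at the cost of silently running on a single core there.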
R Packages
Below we will give some guidelines on how to install and use various R packages specifically on Teton.
Typically, packages will be installed in your home folder, within the R folder, under the platform folder x86_64-pc-linux-gnu-library, then under a major.minor version folder (without the patch number).
~/R/
  x86_64-pc-linux-gnu-library/
    3.5/
    3.6/
Packages installed/built with one major.minor version will typically not work under another.
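To check which library folder a given R version would use, the layout described above can be reproduced in the shell. This is a minimal sketch under the assumption that the folder follows the ~/R/x86_64-pc-linux-gnu-library/major.minor pattern shown; the function name r_user_lib is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper (an assumption for illustration): build the user library
# path for an R version, following the folder layout described above.
r_user_lib() {
    local version="$1"   # e.g. "3.6.1" or "3.6"
    local major minor _patch
    # Split on dots and keep only major and minor; the patch number is dropped.
    IFS=. read -r major minor _patch <<< "${version}"
    echo "${HOME}/R/x86_64-pc-linux-gnu-library/${major}.${minor}"
}

# 3.6.0 and 3.6.1 map to the same folder, while 3.5.x maps to a different one.
r_user_lib "3.6.1"
r_user_lib "3.5.0"

# To list what is actually installed for a version:
# ls "$(r_user_lib "3.6.1")"
```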
Installing Packages: Potential Problems
Trying to install the labdsv package with install.packages("labdsv") resulted in the following error:
/apps/u/gcc/4.8.5/intel/18.0.1-7cbw2rp/include/complex(77): error #308: member "std::complex<float>::_M_value" (declared at line 1187 of "/usr/include/c++/4.8.5/complex") is inaccessible
    _M_value = __z._M_value;
...
compilation aborted for sptree.cpp (code 2)
make: *** [sptree.o] Error 2
ERROR: compilation failed for package ‘Rtsne’
* removing ‘/pfs/tsfs1/home/salexan5/R/intel/3.6/Rtsne’
ERROR: dependency ‘Rtsne’ is not available for package ‘labdsv’
* removing ‘/pfs/tsfs1/home/salexan5/R/intel/3.6/labdsv’
This appears to be a reasonably common problem and is essentially a result of conflicts between compilers when using complex data types. The workaround is to disable the diagnostic error.
To resolve the issue, create and/or update the ~/.R/Makevars file by adding the following lines:
R and Intel/MKL
On Teton we have versions of R (3.6.1 and 4.0.2) built with the Intel compiler and the related MKL (Math Kernel Library). These follow a request relating to Improving R Performance by installing optimized BLAS/LAPACK libraries.
To use:
[]$ module load r/4.0.2-intel
[]$ R
R version 4.0.2 (2020-06-22) -- "Taking Off Again"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Storage
Matrix products: default
BLAS/LAPACK: /pfs/tsfs1/apps/el7-x86_64/u/intel/18.0.1/intel-mkl/2018.2.199-pti6y2y/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64_lin/libmkl_rt.so
The Intel-built version depends on the following core modules:
intel/18.0.1
intel-mkl/2018.2.199
The module load r/*.*.*-intel line will automatically load these modules for you.
Installing Packages to Use with Intel Version
The packages that you have installed for the standard versions of R will not work for the Intel version since they are built with different compilers. This means you will need to re-install the packages that you use.
If you want to use both versions, you will need to create a second folder to install the Intel builds into.
This has been tested with the Intel version of R 3.6.1; a similar approach should apply for 4.0.
On Teton, R packages are typically installed into:
~/R/
  x86_64-pc-linux-gnu-library/
    3.5/
    3.6/
One way to install the Intel packages is the following:
Create a folder ~/R/intel/3.6/
~/R/
  x86_64-pc-linux-gnu-library/
    3.5/
    3.6/
  intel/
    3.6/
Use module load r/3.6.1-intel to load the Intel version.
After starting R, use .libPaths(c("~/R/intel/3.6/")) to set your environment to use this folder.
If you run .libPaths() you should see something of the form:
> .libPaths()
[1] "/pfs/tsfs1/home/salexan5/R/intel/3.6"
[2] "/pfs/tsfs1/apps/el7-x86_64/u/opt/R/3.6.1/intel/R-3.6.1/library"
Install packages as normal, e.g. install.packages("<the package's name>")
When running your R scripts you need to set .libPaths(c("~/R/intel/3.6/")) before loading any libraries to inform R where the appropriate packages can be found.
Note: Currently the R package RStan cannot be installed using the Intel version.
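As an alternative to calling .libPaths() at the top of every script, R also reads the R_LIBS_USER environment variable at startup and places that folder ahead of the system library, provided the folder already exists. The following is a minimal sketch of how a batch script might select the folder from the module name. It is not something this page has tested on Teton, and the helper name r_lib_for_module is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper (an assumption for illustration): choose the package
# folder that matches a module name such as "r/3.6.1" or "r/3.6.1-intel".
r_lib_for_module() {
    local module_name="$1"
    local version="${module_name#r/}"   # "r/3.6.1-intel" becomes "3.6.1-intel"
    local major minor _rest
    # Keep major and minor only; the patch number and any suffix go to _rest.
    IFS=. read -r major minor _rest <<< "${version}"
    if [[ "${module_name}" == *-intel ]]; then
        echo "${HOME}/R/intel/${major}.${minor}"
    else
        echo "${HOME}/R/x86_64-pc-linux-gnu-library/${major}.${minor}"
    fi
}

lib_dir="$(r_lib_for_module "r/3.6.1-intel")"
# R ignores library folders that do not exist, so create it once beforehand:
# mkdir -p "${lib_dir}"
export R_LIBS_USER="${lib_dir}"
echo "R_LIBS_USER=${R_LIBS_USER}"

# Followed in the batch script by, for example:
# module load r/3.6.1-intel
# Rscript my_script.R
```

Setting the variable in the batch script keeps the R code identical for both builds; calling .libPaths() inside the script, as described above, remains the approach that has been tested here.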