R is a programming language and free software environment for statistical computing and graphics supported by the R Foundation for Statistical Computing. The R language is widely used among statisticians and data miners for developing statistical software and data analysis. It compiles and runs on a wide variety of UNIX platforms, Windows and macOS. Below are links to pages that are related to R.
Warning |
---|
Per the ARCC announcement sent on 23 May 2024, there are several announcements about the vulnerability, including CERT/CC Reports R Programming Language Vulnerability and Vulnerability Note VU#238194: “A vulnerability in the R language that allows for arbitrary code to be executed directly after the deserialization of untrusted data has been discovered. This vulnerability can be exploited through RDS (R Data Serialization) format files and .rdx files. An attacker can create malicious RDS or .rdx formatted files to execute arbitrary commands on the victim's target device.”
We encourage all of our R users to migrate to R version 4.4.0, and off prior versions of R (4.3.x or earlier), at your earliest convenience. To assist you with this migration we have installed modules for R version 4.4.0 on the Beartooth HPC Environment: r/4.4.0 and r-rmpi/0.7-1-ompi-r4.4
These modules are available via the “module …” commands as well as in OnDemand. The r/4.4.0 module is now the default R module on the Beartooth, Loren and Wildiris HPC clusters and will be the only R module available on MedicineBow. These modules include the R packages we typically included in our earlier R modules. If you have installed any libraries yourself, you will need to re-install them in R version 4.4.0, as those installations are version-specific. We intend to disable ARCC’s older R modules on Beartooth by Friday, June 28th, 2024. If you have installed your own copy of R, via conda or some other method, you are welcome to use ARCC’s R modules instead. Otherwise, we encourage you to upgrade your personally installed version of R to 4.4.0. |
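As a small illustration of the version advice above, the following R sketch checks whether the running interpreter is at least 4.4.0 before reading an RDS file. The helper names `r_is_patched` and `safe_read_rds` are hypothetical (they are not part of base R or of ARCC's modules), and the check only compares version numbers; it does not by itself make files from untrusted sources safe to load.

```r
# Minimum R version recommended in the announcement above.
min_version <- "4.4.0"

# Hypothetical helper: TRUE when `current` is at least `minimum`.
# Version objects compare component-wise, so 4.10.0 is newer than 4.4.0.
r_is_patched <- function(current = getRversion(), minimum = min_version) {
  current >= package_version(minimum)
}

# Hypothetical wrapper: stop early on older interpreters, otherwise read the file.
safe_read_rds <- function(path) {
  if (!r_is_patched()) {
    stop("R ", as.character(getRversion()), " is older than ", min_version,
         "; upgrade before reading ", path)
  }
  readRDS(path)
}

print(r_is_patched(package_version("4.3.2")))
print(r_is_patched(package_version("4.4.0")))
print(r_is_patched())  # the interpreter running this script
```

Comparing version objects rather than strings avoids the usual pitfall where "4.10.0" sorts before "4.4.0" as text.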
Note |
---|
Regarding the announcement "Executive Summary: Updating compiler on Medicine": when loading r/4.4.0 using the gcc/13.2.0 compiler you will see the following warning:
Code Block |
---|
[]$ module load gcc/13.2.0 r/4.4.0
The following dependent module(s) are not currently loaded: zlib-ng/2.1.4_zen4 (required by: gcc/13.2.0), zstd/1.5.5_zen4__programs_True (required by: gcc/13.2.0)
[]$ module spider r/3.5.1s
----------------------------
r: r/3.5.1s
----------------------------
You will need to load all module(s) on any one of the lines below before the "r/3.5.1s" module is available to load.
singularity/2.5.2
singularity/3.1.1
[]$ module spider r/3.6.1
----------------------------
r: r/3.6.1
----------------------------
You will need to load all module(s) on any one of the lines below before the "r/3.6.1" module is available to load.
swset/2018.05 gcc/7.3.0
[]$ module load gcc/7.3.0 r/3.6.1 |
Using
Once the modules have been loaded:
Code Block |
---|
[]$ R
R version 3.6.1 (2019-07-05) -- "Action of the Toes"
Copyright (C) 2019 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for online help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Storage
Matrix products: default
BLAS: /pfs/tsfs1/apps/el7-x86_64/u/gcc/7.3.0/r/3.6.1-3rtwrmw/rlib/R/lib/libRblas.so
LAPACK: /pfs/tsfs1/apps/el7-x86_64/u/gcc/7.3.0/r/3.6.1-3rtwrmw/rlib/R/lib/libRlapack.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_3.6.1
> quit()
Save workspace image? [y/n/c]: n
[@tlog2 ~]$ |
Note:
This software is dependent on the following modules:
Code Block |
---|
[]$ module load r/3.6.1
Lmod has detected the following error: These module(s) exist but cannot be loaded as requested: "r/3.6.1"
Try: "module spider r/3.6.1" to see how to load the module(s). |
R Packages
Below we will give some guidelines on how to install and use various R packages specifically on Teton.
Typically, packages will be installed in your home folder, within the R folder, under the platform version x86_64-pc-linux-gnu-library, then under a major.minor version (without the patch number) folder.
Code Block |
---|
~/R/
x86_64-pc-linux-gnu-library/
3.5/
3.6/ |
Packages installed/built with one major.minor version will typically not work under another.
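The folder layout described above can be derived from a version string. The sketch below is illustrative only: `user_lib_for` is a hypothetical helper, and it assumes the default Unix layout of ~/R/<platform>-library/<major.minor>. Your session's actual setting is whatever `.libPaths()` and the `R_LIBS_USER` environment variable report.

```r
# Hypothetical helper: build the per-version user library path for a given
# R version, dropping the patch number (3.6.1 becomes 3.6).
user_lib_for <- function(version, platform = R.version$platform) {
  parts <- unlist(package_version(version))          # c(major, minor, patch)
  major_minor <- paste(parts[1], parts[2], sep = ".")
  file.path("~", "R", paste0(platform, "-library"), major_minor)
}

# Path for R 3.6.1 on the platform used in the listings above.
print(user_lib_for("3.6.1", platform = "x86_64-pc-linux-gnu"))

# What the running session actually uses.
print(Sys.getenv("R_LIBS_USER"))
print(.libPaths())
```

Because the patch number is dropped, R 3.6.0 and 3.6.1 share one folder, while 3.5.x and 3.6.x do not.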
R Package: RStan
R and Intel/MKL
We provide versions of R (3.6.1 and 4.0.2) built with the Intel compiler and the related MKL (Math Kernel Library). These builds follow a request relating to "Improving R Performance by installing optimized BLAS/LAPACK libraries".
To use:
Code Block |
---|
[]$ module load r/4.0.2-intel
[]$ R
R version 4.0.2 (2020-06-22) -- "Taking Off Again"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Storage
Matrix products: default
BLAS/LAPACK: /pfs/tsfs1/apps/el7-x86_64/u/intel/18.0.1/intel-mkl/2018.2.199-pti6y2y/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64_lin/libmkl_rt.so |
The Intel built version is dependent on the following modules core:
Installing Packages to Use with Intel Version
Info |
---|
The packages that you have installed for the standard versions of R will not work with the Intel version since they are built with different compilers. This means you will need to re-install the packages that you use. If you want to use both versions, you will need to create a second folder to install the Intel versions into. This has been tested with the R 3.6.1 Intel version; a similar approach should apply for 4.0.
|
On Teton, R packages are typically installed into:
Code Block |
---|
~/R/
x86_64-pc-linux-gnu-library/
3.5/
3.6/ |
One way to install the Intel packages is the following:
Create a folder ~/R/intel/3.6/
Code Block |
---|
~/R/
x86_64-pc-linux-gnu-library/
3.5/
3.6/
intel/
3.6/ |
Use module load r/3.6.1-intel to load the Intel version.
After starting R, use .libPaths(c("~/R/intel/3.6/")) to set your environment to use this folder.
Code Block |
---|
> .libPaths()
[1] "/pfs/tsfs1/home/salexan5/R/intel/3.6"
[2] "/pfs/tsfs1/apps/el7-x86_64/u/opt/R/3.6.1/intel/R-3.6.1/library" |
Install packages as normal, e.g. install.packages("<the package's name>").
When running your R scripts you need to set .libPaths(c("~/R/intel/3.6/")) before loading any libraries to inform R where the appropriate packages can be found.
Note: Currently R Package: RStan cannot be installed using the Intel version.
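One way to apply the .libPaths() step consistently is a short header at the top of each script. This is a minimal sketch rather than a prescribed ARCC method: `use_library` is a hypothetical helper, and the folder name is the ~/R/intel/3.6 example used above.

```r
# Hypothetical helper: put `lib_dir` at the front of the library search path,
# failing loudly if the folder is missing instead of silently falling back.
use_library <- function(lib_dir) {
  lib_dir <- path.expand(lib_dir)
  if (!dir.exists(lib_dir)) {
    stop("Library folder not found: ", lib_dir)
  }
  # .libPaths() keeps R's own system library after the folder given here.
  .libPaths(lib_dir)
  invisible(.libPaths())
}

intel_lib <- "~/R/intel/3.6"
if (dir.exists(path.expand(intel_lib))) {
  use_library(intel_lib)
}
print(.libPaths())

# library(...) calls belong below this point, after the path is set.
```

Checking that the folder exists matters because .libPaths() silently ignores folders that are missing, which would leave the script using the standard library without any warning.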
Installing Packages: Potential Problems
Running install.packages("labdsv") resulted in the following error:
Code Block |
---|
/apps/u/gcc/4.8.5/intel/18.0.1-7cbw2rp/include/complex(77): error #308: member "std::complex<float>::_M_value" (declared at line 1187 of "/usr/include/c++/4.8.5/complex") is inaccessible
_M_value = __z._M_value;
...
compilation aborted for sptree.cpp (code 2)
make: *** [sptree.o] Error 2
ERROR: compilation failed for package ‘Rtsne’
* removing ‘/pfs/tsfs1/home/salexan5/R/intel/3.6/Rtsne’
ERROR: dependency ‘Rtsne’ is not available for package ‘labdsv’
* removing ‘/pfs/tsfs1/home/salexan5/R/intel/3.6/labdsv’ |
This appears to be a reasonably common problem and is essentially the result of conflicts between compilers when using complex data types. The workaround is to disable the diagnostic error.
To resolve the issue, create and/or update the ~/.R/Makevars file by adding the following lines:
Teton: Using Multiple CPUs
The warning shown when loading r/4.4.0 with the gcc/13.2.0 compiler can be ignored. However, we recommend that you use the gcc/14.2.0 version to remove this warning. |
Overview
R is a programming language and free software environment for statistical computing and graphics supported by the R Foundation for Statistical Computing. The R language is widely used among statisticians and data miners for developing statistical software and data analysis. It compiles and runs on a wide variety of UNIX platforms, Windows and macOS. Below are links to pages that are related to R.
Using
Use the module name r to discover versions available and to load the application.
Pre-Installed Libraries:
Some versions of r have had common libraries pre-installed. To check, you can either try loading the library, or you can list all the installed libraries using:
Code Block |
---|
> packinfo <- installed.packages(fields = c("Package", "Version"))
> packinfo[, "Version", drop = FALSE] |
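To check for specific packages rather than scanning the full listing, something like the following sketch may help. The `wanted` vector is purely illustrative (its last entry is a deliberately non-existent name); replace it with the packages your script needs.

```r
# Illustrative list of packages to look for.
wanted <- c("stats", "parallel", "aPackageThatDoesNotExist")

# Package names are the row names of the installed.packages() matrix.
installed <- rownames(installed.packages())
status <- wanted %in% installed
names(status) <- wanted
print(status)

# requireNamespace() tests a single package without attaching it.
print(requireNamespace("parallel", quietly = TRUE))
```

Both approaches only look in the folders currently on `.libPaths()`, so set the library path first if you use a non-default folder.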
Multicore
Typically, using parallel::detectCores() to detect the number of available cores on a cluster node is a slight red herring. It returns the total number of cores on the node your job is allocated to, not the number of cores you requested. For example, if your sbatch script defines the following,
...
and you are allocated a standard Teton node that has 32 cores, parallel::detectCores() will return 32 and not the 8 you requested.
This will probably lead to unexpected results or failures when a function expects 32 cores but only 8 are actually available.
To avoid this problem, pass the value of the $SLURM_JOB_CPUS_PER_NODE Slurm environment variable into your R script. Below is an example of how to do this:
Example
Batch Script: (fragments of what your script might look like):
Code Block |
---|
#!/bin/bash
...
#SBATCH --nodes=1
#SBATCH --cpus-per-task=8
...
echo "SLURM_JOB_CPUS_PER_NODE:" $SLURM_JOB_CPUS_PER_NODE
...
module load swset/2018.05 gcc/7.3.0 r/3.6.1
...
Rscript multiple_cpu_test.R $SLURM_JOB_CPUS_PER_NODE
... |
R Script: multiple_cpu_test.R
Code Block |
---|
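args <- commandArgs(trailingOnly = TRUE)
# Default to a single core when no argument is supplied.
num_of_cores <- 1L
if (!is.na(args[1])) {
  # Command-line arguments arrive as strings; convert before use.
  num_of_cores <- as.integer(args[1])
  print(paste0("Num of Cores: ", num_of_cores))
}
print(paste0("detectCores: ", parallel::detectCores()))
options(mc.cores = num_of_cores)
print(paste0("mc.cores: ", getOption("mc.cores", 1L))) |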
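The same idea can be extended so that the script also works when no argument is passed. In the sketch below, `resolve_cores` is a hypothetical helper that tries the first command-line argument, then the SLURM_JOB_CPUS_PER_NODE environment variable, and finally falls back to a single core. It assumes a single-node job, where the variable holds a plain integer, and a Linux or macOS system, since parallel::mclapply relies on forking.

```r
# Hypothetical helper: pick the first usable positive integer from the
# command-line argument or the Slurm environment variable, else 1.
resolve_cores <- function(args = commandArgs(trailingOnly = TRUE)) {
  candidates <- c(args[1], Sys.getenv("SLURM_JOB_CPUS_PER_NODE", unset = NA))
  for (value in candidates) {
    n <- suppressWarnings(as.integer(value))  # non-numeric text becomes NA
    if (!is.na(n) && n >= 1L) {
      return(n)
    }
  }
  1L
}

num_of_cores <- resolve_cores()
options(mc.cores = num_of_cores)
print(paste0("Using cores: ", num_of_cores))

# mclapply() picks up its worker count from getOption("mc.cores").
squares <- parallel::mclapply(1:8, function(i) i^2)
print(unlist(squares))
```

Falling back to one core keeps the script runnable outside a Slurm job, for example during interactive testing on a login node.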
...