Warning |
---|
The following is from the ARCC announcement sent on 23 May 2024. Several announcements have been published about the vulnerability:
We encourage all of our R users to migrate to R version 4.4.0, and off of prior versions of R (4.3.x or earlier) at your earliest convenience. To assist you with this migration we have installed modules for R version 4.4.0 on the Beartooth HPC Environment:
These modules are available via the “module …” commands as well as in OnDemand. The R/4.4.0 module is now the default R module on Beartooth, Loren and Wildiris HPC Clusters and will be the only R module available on MedicineBow. These modules include the R packages we typically included in our earlier R modules. If you have installed any libraries yourself you will need to re-install those libraries in R version 4.4.0, as those installations are version-specific. We intend to disable ARCC’s older R modules on Beartooth by Friday June 28th, 2024. If you have installed your own copy of R, via conda or some other method, you are welcome to use ARCC’s R modules. We encourage you to upgrade your personally installed version of R to 4.4.0. |
Note | ||
---|---|---|
Regards the announcement: Executive Summary: Updating compiler on Medicine. loading
This warning can be ignored. But, we recommend that you use the |
Overview
R is a programming language and free software environment for statistical computing and graphics, supported by the R Foundation for Statistical Computing. It is widely used among statisticians and data miners for developing statistical software and for data analysis, and it compiles and runs on a wide variety of UNIX platforms, Windows and macOS. Below are links to pages that are related to R.
Using
Use the module name r to discover the versions available and to load the application.
Pre-Installed Libraries:
Some versions of r have had common libraries pre-installed. To check, you can either try loading the library, or list all the installed libraries using:
Code Block |
---|
> packinfo <- installed.packages(fields = c("Package", "Version"))
> packinfo[, "Version", drop = FALSE] |
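To check for specific packages rather than scanning the full listing, the same installed.packages() call can be wrapped in a small helper. This is a minimal sketch: the helper name is_installed and the second package name are hypothetical and only illustrate the idea.

```r
# Hypothetical helper: report which of the requested packages are installed.
is_installed <- function(pkgs) {
  # Row names of installed.packages() are the installed package names.
  available <- rownames(installed.packages())
  setNames(pkgs %in% available, pkgs)
}

# "parallel" ships with R; the second name is a made-up placeholder.
status <- is_installed(c("parallel", "notARealPackage123"))
print(status)
```

Any package reported as FALSE would need to be installed into your own library before it can be loaded.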
Multicore
Using parallel::detectCores() to detect the number of available cores on a cluster node is typically misleading. It returns the total number of cores on the node your job is running on, not the number of cores you requested and were allocated. For example, if your sbatch script defines the following,
...
and you are allocated a standard Teton node, which has 32 cores, parallel::detectCores() will return 32 and not the 8 that you requested.
This will probably lead to unexpected results or failures when you run a function that expects 32 cores while only 8 are actually available.
To avoid this problem, pass the value of the $SLURM_JOB_CPUS_PER_NODE Slurm environment variable into your R script and use it in place of detectCores().
Example
Batch Script: (fragments of what your script might look like):
...
Code Block |
---|
SLURM_JOB_CPUS_PER_NODE: 8
...
[1] "Num of Cores: 8"
[1] "detectCores: 32"
[1] "mc.cores: 8" |
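The R side of this could look roughly like the following. It is a minimal sketch that assumes the script reads the variable directly with Sys.getenv() instead of receiving it as a command-line argument. The helper name slurm_cores and the fallback of one core are hypothetical choices, not part of ARCC's example.

```r
library(parallel)

# Hypothetical helper: parse the core count from SLURM_JOB_CPUS_PER_NODE,
# falling back to a default when the variable is unset (outside a Slurm job).
slurm_cores <- function(default = 1L) {
  raw <- Sys.getenv("SLURM_JOB_CPUS_PER_NODE", unset = "")
  # Multi-node jobs can report values such as "8(x2)"; keep the leading digits.
  n <- suppressWarnings(as.integer(sub("^([0-9]+).*$", "\\1", raw)))
  if (is.na(n) || n < 1L) default else n
}

num_cores <- slurm_cores()
options(mc.cores = num_cores)

print(paste("Num of Cores:", num_cores))
print(paste("detectCores:", detectCores()))
print(paste("mc.cores:", getOption("mc.cores")))

# Size the parallel work by the allocation, not by detectCores().
squares <- mclapply(1:4, function(i) i^2, mc.cores = num_cores)
```

Setting options(mc.cores = ...) means functions such as mclapply() that default to that option also respect the allocation.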
R Packages
Below we will give some guidelines on how to install and use various R packages specifically on Teton.
Typically, packages will be installed in your home folder, within the R folder, under the platform folder x86_64-pc-linux-gnu-library, then under a major.minor version folder (without the patch number).
Code Block |
---|
~/R/
x86_64-pc-linux-gnu-library/
3.5/
3.6/ |
Packages installed or built with one major.minor version will typically not work under another.
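To see which major.minor folder the running R session maps to, the version string can be assembled from R.version. This is a small sketch for inspection only; the variable name major_minor is just illustrative.

```r
# R.version$minor holds "minor.patch" (for example "6.1"), so drop the patch part.
major_minor <- paste(R.version$major,
                     sub("\\..*$", "", R.version$minor),
                     sep = ".")
print(major_minor)

# The user library R is configured to use, and the full search path.
print(Sys.getenv("R_LIBS_USER"))
print(.libPaths())
```

If the folder for this major.minor version does not yet contain a package you need, it has to be installed again under that version.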
Installing Packages: Potential Problems
Running install.packages("labdsv") resulted in the following error:
Code Block |
---|
/apps/u/gcc/4.8.5/intel/18.0.1-7cbw2rp/include/complex(77): error #308: member "std::complex<float>::_M_value" (declared at line 1187 of "/usr/include/c++/4.8.5/complex") is inaccessible
_M_value = __z._M_value;
...
compilation aborted for sptree.cpp (code 2)
make: *** [sptree.o] Error 2
ERROR: compilation failed for package ‘Rtsne’
* removing ‘/pfs/tsfs1/home/salexan5/R/intel/3.6/Rtsne’
ERROR: dependency ‘Rtsne’ is not available for package ‘labdsv’
* removing ‘/pfs/tsfs1/home/salexan5/R/intel/3.6/labdsv’ |
This appears to be a reasonably common problem. It is essentially the result of a conflict between compilers when using complex data types, and the workaround is to disable the diagnostic error.
To resolve the issue, create and/or update the ~/.R/Makevars file by adding the following lines:
R and Intel/MKL
On Teton we have versions of r (3.6.1 and 4.0.2) built with the Intel compiler and the related MKL (Math Kernel Library). These builds follow a request relating to Improving R Performance by installing optimized BLAS/LAPACK libraries.
To use:
Code Block |
---|
[]$ module load r/4.0.2-intel
[]$ R
R version 4.0.2 (2020-06-22) -- "Taking Off Again"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Storage
Matrix products: default
BLAS/LAPACK: /pfs/tsfs1/apps/el7-x86_64/u/intel/18.0.1/intel-mkl/2018.2.199-pti6y2y/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64_lin/libmkl_rt.so |
The Intel-built version depends on the following core modules:
intel/18.0.1
intel-mkl/2018.2.199
The module load r/*.*.*-intel line will automatically load these modules for you.
Installing Packages to Use with Intel Version
The packages that you have installed for the standard versions of R will not work for the Intel version since they are built with different compilers. This means you will need to re-install the packages that you use.
If you want to use both versions, you will need to create a second folder to install the Intel-built packages into.
This has been tested with the Intel version of R 3.6.1; a similar approach should apply for 4.0.
On Teton, R packages are typically installed into:
Code Block |
---|
~/R/
x86_64-pc-linux-gnu-library/
3.5/
3.6/ |
One way to install the Intel packages is the following:
Create a folder ~/R/intel/3.6/
Code Block |
---|
~/R/
x86_64-pc-linux-gnu-library/
3.5/
3.6/
intel/
3.6/ |
Use module load r/3.6.1-intel to load the Intel version.
After starting R, use .libPaths(c("~/R/intel/3.6/")) to set your environment to use this folder.
If you run .libPaths() you should see something of the form:
Code Block |
---|
> .libPaths()
[1] "/pfs/tsfs1/home/salexan5/R/intel/3.6"
[2] "/pfs/tsfs1/apps/el7-x86_64/u/opt/R/3.6.1/intel/R-3.6.1/library" |
...
Install packages as normal, e.g. install.packages("<the package's name>")
...
When running your R scripts, you need to call .libPaths(c("~/R/intel/3.6/")) before loading any libraries, so that R knows where the appropriate packages can be found.
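As a rough sketch, the top of such a script might look like the following. The helper name use_library is hypothetical, and the folder is the ~/R/intel/3.6 example from the steps above. The helper creates the folder if it is missing, since .libPaths() silently drops paths that do not exist.

```r
# Hypothetical helper: make `path` the user library for this session.
use_library <- function(path) {
  path <- path.expand(path)
  # .libPaths() ignores folders that do not exist, so create it first.
  dir.create(path, recursive = TRUE, showWarnings = FALSE)
  # Replaces the user library; R keeps its own site and system libraries.
  .libPaths(path)
  invisible(.libPaths())
}

# Folder name taken from the steps above; adjust it to your R version.
use_library("~/R/intel/3.6")
print(.libPaths())

# Load packages only after the library path has been set, for example:
# library(labdsv)
```

Passing only the Intel folder, rather than prepending it to the existing paths, keeps packages built for the standard R modules off the search path.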
...