R

From the ARCC announcement sent on the 23rd of May 2024:

Here are several announcements about the vulnerability: 

  • CERT/CC Reports R Programming Language Vulnerability

  • Vulnerability Note VU#238194 : “A vulnerability in the R language that allows for arbitrary code to be executed directly after the deserialization of untrusted data has been discovered. This vulnerability can be exploited through RDS (R Data Serialization) format files and .rdx files. An attacker can create malicious RDS or .rdx formatted files to execute arbitrary commands on the victim's target device.”

We encourage all of our R users to migrate to R version 4.4.0, and off of prior versions of R (4.3.x or earlier) at your earliest convenience. To assist you with this migration we have installed modules for R version 4.4.0 on the Beartooth HPC Environment:

  • r/4.4.0

  • r-rmpi/0.7-1-ompi-r4.4

These modules are available via the “module …” commands as well as in OnDemand. The r/4.4.0 module is now the default R module on the Beartooth, Loren and Wildiris HPC clusters and will be the only R module available on MedicineBow. These modules include the R packages we typically included in our earlier R modules. If you have installed any libraries yourself, you will need to re-install them in R version 4.4.0, as those installations are version-specific.

We intend to disable ARCC’s older R modules on Beartooth by Friday June 28th, 2024.

If you have installed your own copy of R, via conda or some other method, you are welcome to use ARCC’s R modules.  We encourage you to upgrade your personally installed version of R to 4.4.0.

Regarding the announcement "Executive Summary: Updating compiler on Medicine": when loading r/4.4.0 with the gcc/13.2.0 compiler, you will see the following warning:

[]$ module load gcc/13.2.0 r/4.4.0
-------------------------------------------------------------------------------
The following dependent module(s) are not currently loaded: zlib-ng/2.1.4_zen4 (required by: gcc/13.2.0), zstd/1.5.5_zen4__programs_True (required by: gcc/13.2.0)
-------------------------------------------------------------------------------

This warning can be ignored.

However, we recommend using the gcc/14.2.0 version to remove this warning.

Overview

  • R: a programming language and free software environment for statistical computing and graphics, supported by the R Foundation for Statistical Computing. The R language is widely used among statisticians and data miners for developing statistical software and data analysis. It compiles and runs on a wide variety of UNIX platforms, Windows and macOS.

Using

Use the module name r to discover versions available and to load the application.

Pre-Installed Libraries:

Some versions of the r module have common libraries pre-installed. To check, you can either try loading the library, or list all installed libraries using:

> packinfo <- installed.packages(fields = c("Package", "Version"))
> packinfo[, "Version", drop = FALSE]
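The "try loading the library" check can also be scripted. The following is a minimal sketch, not an ARCC-provided tool: the package names in `wanted` are placeholders chosen for illustration (the last one is deliberately non-existent), and `requireNamespace()` is used so that a missing package yields FALSE rather than an error.

```r
# Hypothetical list of packages to check; replace with the ones your workflow needs.
wanted <- c("parallel", "stats", "notARealPackage123")

# requireNamespace() returns TRUE/FALSE instead of raising an error,
# and loads the namespace without attaching the package.
available <- vapply(wanted, requireNamespace, logical(1), quietly = TRUE)

print(available)

# Report anything that would need to be installed for this R version.
missing_pkgs <- names(available)[!available]
if (length(missing_pkgs) > 0) {
  message("Not installed for this R version: ",
          paste(missing_pkgs, collapse = ", "))
}
```

Anything reported as missing would need to be installed into your own library for the R version you have loaded.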

Multicore

Using parallel::detectCores() to detect the number of available cores on a cluster node is misleading. It returns the total number of cores on the node your job has been allocated, not the number of cores you actually requested. For example, if your sbatch script defines the following,

#SBATCH --nodes=1
#SBATCH --cpus-per-task=8

and you're allocated a standard Teton node that has 32 cores, parallel::detectCores() will return 32 and not the 8 you requested.
This will probably lead to unexpected results or failures when a function tries to use 32 cores while only 8 are actually available.
To avoid this problem, pass the value of the $SLURM_JOB_CPUS_PER_NODE Slurm environment variable into your R script and use it in place of parallel::detectCores().
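One way to pick up that value inside R is to read the environment variable directly. This is a minimal sketch under stated assumptions, not the content of multiple_cpu_test.R: the helper name `slurm_cpus` and the fallback of 1 core outside a Slurm job are choices made for illustration, and the toy workload simply squares a few integers.

```r
# Hypothetical helper: read the CPU count Slurm allocated on this node.
# Falls back to `default` when the variable is unset (e.g. outside a Slurm job).
slurm_cpus <- function(default = 1L) {
  raw <- Sys.getenv("SLURM_JOB_CPUS_PER_NODE", unset = "")
  # Multi-node jobs may report values such as "8(x2)" or "8,4";
  # this sketch keeps only the leading integer.
  n <- suppressWarnings(as.integer(sub("^([0-9]+).*$", "\\1", raw)))
  if (is.na(n) || n < 1L) default else n
}

n_cores <- slurm_cpus()
message("Using ", n_cores, " core(s)")

# Toy workload: use the allocated core count rather than parallel::detectCores().
squares <- parallel::mclapply(1:16, function(i) i^2, mc.cores = n_cores)
```

Alternatively, the batch script can pass $SLURM_JOB_CPUS_PER_NODE as a command-line argument to Rscript, and the R script can read it with commandArgs(trailingOnly = TRUE).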

Example

Batch Script: (fragments of what your script might look like):

R Script: multiple_cpu_test.R

Slurm Output: