Overview

  • R is a programming language and free software environment for statistical computing and graphics, supported by the R Foundation for Statistical Computing. It is widely used among statisticians and data miners for developing statistical software and for data analysis. It compiles and runs on a wide variety of UNIX platforms, Windows and macOS. Below are links to pages that are related to R.

Using

Use the module name r to discover versions available and to load the application.

Multicore

Using parallel::detectCores() to detect the number of available cores on a cluster node is misleading. It returns the total number of cores on the node your job is allocated to, not the number of cores you requested. For example, if your sbatch script defines the following:

#SBATCH --nodes=1
#SBATCH --cpus-per-task=8

and you are allocated a standard Teton node, which has 32 cores, parallel::detectCores() will return 32 and not the 8 you requested.
This will probably lead to unexpected results or failures when a function tries to use 32 cores while only 8 are actually available.
To avoid this problem, pass the value of the $SLURM_JOB_CPUS_PER_NODE Slurm environment variable into your R script and use that value instead.

Example

Batch Script (fragments of what your script might look like):

#!/bin/bash
...
#SBATCH --nodes=1
#SBATCH --cpus-per-task=8
...
echo "SLURM_JOB_CPUS_PER_NODE:" $SLURM_JOB_CPUS_PER_NODE
...
module load swset/2018.05 gcc/7.3.0 r/3.6.1
...
Rscript multiple_cpu_test.R $SLURM_JOB_CPUS_PER_NODE
...

R Script: multiple_cpu_test.R

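# Read the core count passed in from the batch script (first argument).
args <- commandArgs(trailingOnly = TRUE)
num_of_cores <- 1L  # fall back to a single core if no argument is given
if (!is.na(args[1])) {
  num_of_cores <- as.integer(args[1])  # arguments arrive as character strings
  print(paste0("Num of Cores: ", num_of_cores))
}

# detectCores() reports every core on the node, not just your allocation.
print(paste0("detectCores: ", parallel::detectCores()))

# Tell functions that honour mc.cores how many cores they may use.
options(mc.cores = num_of_cores)
print(paste0("mc.cores: ", getOption("mc.cores", 1L)))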

Slurm Output:

SLURM_JOB_CPUS_PER_NODE: 8
...
[1] "Num of Cores: 8"
[1] "detectCores: 32"
[1] "mc.cores: 8"
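
The value can also be tidied up in the batch script before it reaches R. The following is a minimal sketch and not part of the tested example above: requested_cores is a hypothetical helper name, and it assumes that for multi-node jobs Slurm may report the variable in a form such as 8(x2), so only the leading digits are kept, with a fallback of 1 when the variable is unset.

```shell
# Hypothetical helper (not from the original page): print the per-node core
# count requested from Slurm, suitable for passing to Rscript.
requested_cores() {
    # Take the raw value, defaulting to 1 outside of a Slurm job.
    raw="${SLURM_JOB_CPUS_PER_NODE:-1}"
    # Keep only the leading digits, so "8(x2)" becomes "8".
    n="${raw%%[!0-9]*}"
    echo "${n:-1}"
}

echo "Cores to pass to R: $(requested_cores)"
# In a batch script this could then be used as:
#   Rscript multiple_cpu_test.R "$(requested_cores)"
```

With a single-node job as in the example, the helper simply returns the same value that Slurm reports.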

R Packages

Below we will give some guidelines on how to install and use various R packages specifically on Teton.

  • Typically, packages will be installed in your home folder, within the R folder, under the platform version x86_64-pc-linux-gnu-library, then under a major.minor version (without the patch number) folder.

~/R/
  x86_64-pc-linux-gnu-library/
    3.5/
    3.6/
  • Packages installed/built with one major.minor version will typically not work under another.
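
To illustrate the folder naming described above, the sketch below derives the major.minor folder from a full R version string. The function names r_minor_version and r_user_lib are hypothetical and used only for illustration, and the sketch assumes a full major.minor.patch version is passed in.

```shell
# Hypothetical helpers (illustration only): work out where packages for a
# given R version would typically be installed.
r_minor_version() {
    # Drop the final ".patch" field: "3.6.1" -> "3.6".
    echo "${1%.*}"
}

r_user_lib() {
    echo "$HOME/R/x86_64-pc-linux-gnu-library/$(r_minor_version "$1")"
}

# Print the expected library folder for two versions.
r_user_lib 3.6.1
r_user_lib 4.0.2
```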

R Package: RStan

Installing Packages: Potential Problems

Running install.packages("labdsv") resulted in the following error:

/apps/u/gcc/4.8.5/intel/18.0.1-7cbw2rp/include/complex(77): error #308: member "std::complex<float>::_M_value" (declared at line 1187 of "/usr/include/c++/4.8.5/complex") is inaccessible
          _M_value = __z._M_value;
...
compilation aborted for sptree.cpp (code 2)
make: *** [sptree.o] Error 2
ERROR: compilation failed for package ‘Rtsne’
* removing ‘/pfs/tsfs1/home/salexan5/R/intel/3.6/Rtsne’
ERROR: dependency ‘Rtsne’ is not available for package ‘labdsv’
* removing ‘/pfs/tsfs1/home/salexan5/R/intel/3.6/labdsv’

This appears to be a reasonably common problem. It is essentially the result of a conflict between compilers when using complex data types, and the workaround is to disable the diagnostic error.
To resolve the issue, create and/or update the ~/.R/Makevars file by adding the following lines:


R and Intel/MKL

On Teton, versions of R (3.6.1 and 4.0.2) are available that are built with the Intel compiler and the related MKL (Math Kernel Library). These were provided following a request relating to improving R performance by installing optimized BLAS/LAPACK libraries.
To use:

[]$ module load r/4.0.2-intel

[]$ R
R version 4.0.2 (2020-06-22) -- "Taking Off Again"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Storage

Matrix products: default
BLAS/LAPACK: /pfs/tsfs1/apps/el7-x86_64/u/intel/18.0.1/intel-mkl/2018.2.199-pti6y2y/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64_lin/libmkl_rt.so
  • The Intel-built version depends on the following core modules:

    • intel/18.0.1

    • intel-mkl/2018.2.199

    • The module load r/*.*.*-intel line will automatically load these modules for you.
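
One way to confirm that the R you have loaded is the MKL-linked build is to look for libmkl in the BLAS/LAPACK line reported by sessionInfo(). The sketch below assumes that line resembles the one in the output above; uses_mkl is a hypothetical helper name and the path in the sample line is a placeholder.

```shell
# Hypothetical helper: succeed if the text on standard input mentions an
# MKL shared library (e.g. libmkl_rt.so in the BLAS/LAPACK line).
uses_mkl() {
    grep -q 'libmkl'
}

# Example with a sample line (placeholder path); on the cluster you could
# instead pipe in the output of:
#   Rscript -e 'sessionInfo()'
if echo "BLAS/LAPACK: /path/to/mkl/lib/intel64_lin/libmkl_rt.so" | uses_mkl; then
    echo "MKL build detected"
else
    echo "MKL not detected"
fi
```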

Installing Packages to Use with Intel Version

  • The packages that you have installed for the standard versions of R will not work for the Intel version since they are built with different compilers. This means you will need to re-install the packages that you use.

  • If you potentially want to use both versions then you will need to create a second folder to install the Intel versions into.

  • This has been tested with the R 3.6.1 Intel version. A similar approach should apply for 4.0.

On Teton, R packages are typically installed into:

~/R/
  x86_64-pc-linux-gnu-library/
    3.5/
    3.6/

One way to install the Intel packages is the following:

  • Create a folder ~/R/intel/3.6/

~/R/
  x86_64-pc-linux-gnu-library/
    3.5/
    3.6/
  intel/
    3.6/
  • Use module load r/3.6.1-intel to load the Intel version.

  • After starting R, use .libPaths(c("~/R/intel/3.6/")) to set your environment to use this folder.

    • If you run .libPaths() you should see something of the form:

> .libPaths()
[1] "/pfs/tsfs1/home/salexan5/R/intel/3.6"                          
[2] "/pfs/tsfs1/apps/el7-x86_64/u/opt/R/3.6.1/intel/R-3.6.1/library"
  • Install packages as normal e.g. install.packages("<the package's name>")

  • When running your R scripts you need to set .libPaths(c("~/R/intel/3.6/")) before loading any libraries to inform R where the appropriate packages can be found.

  • Note: Currently R Package: RStan cannot be installed using the Intel version.
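
As an alternative to calling .libPaths() at the top of every script, R also reads the R_LIBS_USER environment variable at startup, so the Intel library folder can be selected from the batch script instead. This is a sketch of that alternative and not the approach tested on this page; use_intel_lib is a hypothetical helper name, and it assumes the ~/R/intel/<major.minor> layout shown above.

```shell
# Hypothetical helper: create the Intel package folder for a given R version
# (if missing) and point R at it through R_LIBS_USER.
use_intel_lib() {
    lib="$HOME/R/intel/${1%.*}"   # "3.6.1" -> ~/R/intel/3.6
    mkdir -p "$lib"               # R ignores library folders that do not exist
    export R_LIBS_USER="$lib"
}

# In a batch script, after "module load r/3.6.1-intel", one might call:
#   use_intel_lib 3.6.1
#   Rscript my_script.R    # my_script.R is a placeholder name
```

If you try this, it is worth running .libPaths() once inside R to confirm that the Intel folder is listed first.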
