RStan

These instructions were written for the older Teton cluster. In principle, the process should also work on Beartooth and other clusters, but with updated module loads.

Overview

RStan: The R interface to Stan. It is distributed on CRAN as the RStan package and its source code is hosted on GitHub.

  • CRAN: RStan: User-facing R functions are provided to parse, compile, test, estimate, and analyze Stan models by accessing the header-only Stan library provided by the 'StanHeaders' package. The Stan project develops a probabilistic programming language that implements full Bayesian statistical inference via Markov Chain Monte Carlo, rough Bayesian inference via 'variational' approximation, and (optionally penalized) maximum likelihood estimation via optimization. In all three cases, automatic differentiation is used to quickly and accurately evaluate gradients without burdening the user with the need to derive the partial derivatives.

  • This has been built using r/4.0.2 and r/3.6.1 (both of which were built with gcc/7.3.0). Before trying to install rstan, please check that you have gcc/7.3.0 loaded.

R with RStan Pre-Installed

ARCC is moving towards providing versions of R with a number of packages pre-installed. Version 4.0.5 is currently built with rstan 2.21.2 already installed for you.

[]$ module load gcc/7.3.0 r/4.0.5-py27
[]$ R
R version 4.0.5 (2021-03-31) -- "Shake and Throw"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
...
> library(rstan)
Loading required package: StanHeaders
Loading required package: ggplot2
rstan (Version 2.21.2, GitRev: 2e1f913d3ca3)
For execution on a local, multicore CPU with excess RAM we recommend calling
options(mc.cores = parallel::detectCores()).
To avoid recompilation of unchanged Stan programs, we recommend calling
rstan_options(auto_write = TRUE)

Installing RStan Locally

Installing rstan version with V8 dependency

The following details how rstan can be set up on Teton with versions that have V8 as a dependency.

Note: This example used rstan Version: 2.21.2, and was built using R 4.0.2 (built using the gcc compiler).

[]$ module load gcc/7.3.0 r/4.0.2-py27
[]$ R
R version 4.0.2 (2020-06-22) -- "Taking Off Again"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
...
> Sys.setenv(DOWNLOAD_STATIC_LIBV8=1)
> install.packages(c("V8", "rstan"), dependencies=TRUE)
****************************************************************
NOTE: rstan has LOTS of dependencies and depending what you already do/don't have installed this could take an hour.
‘colorspace’, ‘utf8’, ‘ps’, ‘nlme’, ‘farver’, ‘labeling’, ‘lifecycle’, ‘munsell’, ‘RColorBrewer’, ‘viridisLite’,
‘ellipsis’, ‘fansi’, ‘magrittr’, ‘pillar’, ‘pkgconfig’, ‘vctrs’, ‘backports’, ‘processx’, ‘assertthat’, ‘lattice’,
‘digest’, ‘glue’, ‘gtable’, ‘isoband’, ‘MASS’, ‘mgcv’, ‘rlang’, ‘scales’, ‘tibble’, ‘checkmate’,
‘matrixStats’, ‘callr’, ‘cli’, ‘crayon’, ‘desc’, ‘prettyunits’, ‘R6’, ‘rprojroot’, ‘Matrix’, ‘StanHeaders’,
‘ggplot2’, ‘inline’, ‘gridExtra’, ‘RcppParallel’, ‘loo’, ‘pkgbuild’, ‘withr’, ‘RcppEigen’, ‘BH’
****************************************************************
> library(rstan)
Loading required package: StanHeaders
Loading required package: ggplot2
code for methods in class “Rcpp_model_base” was not checked for suspicious field assignments (recommended package ‘codetools’ not available?)
code for methods in class “Rcpp_model_base” was not checked for suspicious field assignments (recommended package ‘codetools’ not available?)
code for methods in class “Rcpp_stan_fit” was not checked for suspicious field assignments (recommended package ‘codetools’ not available?)
code for methods in class “Rcpp_stan_fit” was not checked for suspicious field assignments (recommended package ‘codetools’ not available?)
rstan (Version 2.21.2, GitRev: 2e1f913d3ca3)
For execution on a local, multicore CPU with excess RAM we recommend calling
options(mc.cores = parallel::detectCores()).
To avoid recompilation of unchanged Stan programs, we recommend calling
rstan_options(auto_write = TRUE)

Installing rstan version before V8 dependency

The following details how rstan could be set up on Teton prior to having V8 as a dependency.

Note: This example used rstan Version: 2.19.3, and was built using R 3.6.1.

Because installing RStan on Linux requires a C++ toolchain, you will need to configure your home environment before trying to install it.
In your home folder create the .R folder and then navigate into it:

[]$ cd ~
[]$ mkdir .R
[]$ cd .R

Create the Makevars file. You can use whichever editor you are comfortable with; below we're using vim:

[]$ vim Makevars

Type the required C++ toolchain configuration into the file, and then save it.
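The exact configuration used on Teton is not reproduced on this page. As a rough sketch, RStan's general Linux guidance for the 2.19 series suggested C++14 settings along the lines of those below; treat the specific flags as an assumption rather than the values tested on Teton. The snippet appends them to ~/.R/Makevars from the shell, as an alternative to typing them into an editor.

```shell
# Sketch only: these flags follow RStan's general Linux guidance and are an
# assumption here, not the configuration verified on Teton.
makevars="${HOME}/.R/Makevars"
mkdir -p "$(dirname "$makevars")"

# Append the C++14 settings only if no CXX14 entry is present yet.
if ! grep -qs '^CXX14=' "$makevars"; then
  cat >> "$makevars" <<'EOF'
CXX14FLAGS=-O3 -march=native -mtune=native -fPIC
CXX14=g++
EOF
fi

# Show the resulting file.
cat "$makevars"
```

Note that -march=native tunes the build for the CPU of the node doing the compiling. If the node you install on differs from the nodes your jobs run on, it may be safer to drop that flag.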

Load the r/3.6.1 module, start R, and install RStan (for example, with install.packages).

Pre-Built Modules on Teton

We currently have two (older) versions of rstan pre-built as modules:

[]$ module spider rstan

r-rstan/2.17.2-py27

r-rstan/2.18.2-py27

Since these are modules, you are forced to use the version of R that they were built with. Both were built for R/3.5.2 and thus will not work with 3.6.1 or later versions of R.

Installing RStan with the Intel Compiler

Although we currently have a version of R/3.6.1 built using the Intel 18.0.1 compiler, we are unable to build a version of rstan for this variant.

Teton: Using Multiple CPUs

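When loading the rstan package, you'll see:

For execution on a local, multicore CPU with excess RAM we recommend calling
options(mc.cores = parallel::detectCores()).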

Following on from this, rstan appears to use multiple cores out of the box when you look at the stan function and its cores argument (see Fit a model with Stan).
However, the use of parallel::detectCores() is a slight red herring: it returns the total number of cores on the node your job is running on, not the number of cores you actually requested and were allocated.

Please read the section Using Multiple CPUs on the R on Teton page on how best to request multiple cores in a job and how to pass this number into your R script.

Then pass that number to stan as an option (for example, through its cores argument or options(mc.cores = ...)) instead of using parallel::detectCores().
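A minimal sketch of the idea, assuming a Slurm batch job submitted with --cpus-per-task: Slurm sets SLURM_CPUS_PER_TASK in that case, and the job script can hand that value to R rather than relying on parallel::detectCores(). The variable name NCORES and the script name fit_model.R are hypothetical, and the request of 4 CPUs is only an example.

```shell
#!/bin/bash
#SBATCH --cpus-per-task=4   # example request; adjust to the number of chains

# Slurm sets SLURM_CPUS_PER_TASK when --cpus-per-task is requested.
# Fall back to 1 so the script also works outside a job.
NCORES="${SLURM_CPUS_PER_TASK:-1}"
export NCORES
echo "Cores available to Stan: ${NCORES}"

# Inside the (hypothetical) R script, read the value and give it to rstan:
#   n_cores <- as.integer(Sys.getenv("NCORES", unset = "1"))
#   options(mc.cores = n_cores)      # or: stan(..., cores = n_cores)
# Rscript fit_model.R
```

Reading the count from the job environment keeps the number of parallel chains in line with what the scheduler actually allocated, even on nodes with many more physical cores.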