R Environments and Reproducibility

Goal: Introduce ideas and practices to assist in managing the reproducibility of R environments.


What Packages Do I Have Installed?

The first step is knowing what your environment is using and where these packages are installed:

Remember to use .libPaths()

[]$ module load gcc/14.2.0 r/4.4.0
[]$ R
R version 4.4.0 (2024-04-24) -- "Puppy Cup"
...
> .libPaths()
[1] "/cluster/medbow/home/<username>/R/x86_64-pc-linux-gnu-library/4.4"
[2] "/apps/u/spack/gcc/13.2.0/r/4.4.0-pvzi4gp/rlib/R/library"
> quit()
[]$ ls /apps/u/spack/gcc/13.2.0/r/4.4.0-pvzi4gp/rlib/R/library
base  compiler  datasets  graphics  grDevices  grid  methods  parallel  splines  stats  stats4  tcltk  tools  translations  utils
[]$ ls /cluster/medbow/home/<username>/R/x86_64-pc-linux-gnu-library/4.4
class     cli    DBI    e1071  generics  KernSmooth  magrittr  pillar     proxy  R6    rlang  sf       stringr  tidyr       units  vctrs  wk
classInt  cpp11  dplyr  fansi  glue      lifecycle   MASS      pkgconfig  purrr  Rcpp  s2     stringi  tibble   tidyselect  utf8   withr  XML

Anything ARCC has installed will not be updated in place. Instead, we will create a new version of the base R installation.
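The same check can be made without leaving R. The following is a minimal sketch using only base R; the helper name packages_by_library is made up for illustration and is not part of R or of anything ARCC provides.

```r
# Group installed package names by the library directory they live in.
# packages_by_library is a hypothetical helper name, not part of R.
packages_by_library <- function(paths = .libPaths()) {
  # installed.packages() returns a character matrix, one row per package.
  pkgs <- installed.packages(lib.loc = paths)
  split(unname(pkgs[, "Package"]), pkgs[, "LibPath"])
}

by_lib <- packages_by_library()

# Number of packages found in each library (e.g. home library vs. system library).
print(lengths(by_lib))
```

The names of by_lib are the library paths, so by_lib[[1]] gives the packages found in the first library.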


Track the R Packages and Versions you have Installed

How can I track the versions of R packages installed? Using plain R:

[salexan5@mblog1 ~]$ R
R version 4.4.0 (2024-04-24) -- "Puppy Cup"
...
> write.table(installed.packages()[,c(1,2,3:4)])
"Package" "LibPath" "Version" "Priority"
"class" "class" "/cluster/medbow/home/<username>/R/x86_64-pc-linux-gnu-library/4.4" "7.3-22" "recommended"
...
"sf" "sf" "/cluster/medbow/home/<username>/R/x86_64-pc-linux-gnu-library/4.4" "1.0-16" NA
"stringi" "stringi" "/cluster/medbow/home/<username>/R/x86_64-pc-linux-gnu-library/4.4" "1.8.4" NA
"stringr" "stringr" "/cluster/medbow/home/<username>/R/x86_64-pc-linux-gnu-library/4.4" "1.5.1" NA
"tibble" "tibble" "/cluster/medbow/home/<username>/R/x86_64-pc-linux-gnu-library/4.4" "3.2.1" NA
...
"tools" "tools" "/apps/u/spack/gcc/14.2.0/r/4.4.0-w7xoohc/rlib/R/library" "4.4.0" "base"
"utils" "utils" "/apps/u/spack/gcc/14.2.0/r/4.4.0-w7xoohc/rlib/R/library" "4.4.0" "base"
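Printing to the screen is easy to lose, so it may help to write the same table to a dated file. This is a sketch only: snapshot_packages and the file name are illustrative assumptions, and tempdir() stands in for a folder inside your own project.

```r
# Save the package table to a CSV file so it can be kept alongside a project.
# snapshot_packages is a hypothetical helper name.
snapshot_packages <- function(file) {
  fields <- c("Package", "LibPath", "Version", "Priority")
  pkgs <- installed.packages()[, fields]
  # row.names = FALSE avoids repeating the package name as a row label.
  write.csv(pkgs, file = file, row.names = FALSE)
  invisible(file)
}

# Example location only; replace tempdir() with a directory you keep.
snapshot_file <- file.path(tempdir(), paste0("r-packages-", Sys.Date(), ".csv"))
snapshot_packages(snapshot_file)
```

The file can be read back later with read.csv(snapshot_file) and compared against a newer snapshot.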

Conda Export and R Packages

The conda list command (within an activated Conda environment) will only list the packages you’ve installed using conda install.

It does not track/list anything you’ve installed from within R using install.packages().

As a result, using conda env export/conda env create on its own recreates an incomplete environment.

[]$ module purge
[]$ module load miniconda3/24.3.0
[]$ conda activate /cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env
(/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env) [salexan5@mblog2 ~]$ conda list
# packages in environment at /cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env:
#
# Name                    Version    Build            Channel
...
r-base                    4.3.3      he2d9a6e_3       conda-forge
r-stringi                 1.8.4      r43hbd1cc82_0    conda-forge
...
(/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env) []$ conda list
# packages in environment at /cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env:
#
# Name  Version  Build  Channel
_libgcc_mutex  0.1  conda_forge  conda-forge
_openmp_mutex  4.5  2_gnu  conda-forge
_r-mutex  1.0.1  anacondar_1  conda-forge
binutils_impl_linux-64  2.40  ha1999f0_7  conda-forge
bwidget  1.9.14  ha770c72_1  conda-forge
bzip2  1.0.8  hd590300_5  conda-forge
c-ares  1.28.1  hd590300_0  conda-forge
ca-certificates  2024.6.2  hbcca054_0  conda-forge
cairo  1.18.0  hbb29018_2  conda-forge
curl  8.8.0  he654da7_0  conda-forge
expat  2.6.2  h59595ed_0  conda-forge
font-ttf-dejavu-sans-mono  2.37  hab24e00_0  conda-forge
font-ttf-inconsolata  3.000  h77eed37_0  conda-forge
font-ttf-source-code-pro  2.038  h77eed37_0  conda-forge
font-ttf-ubuntu  0.83  h77eed37_2  conda-forge
fontconfig  2.14.2  h14ed4e7_0  conda-forge
fonts-conda-ecosystem  1  0  conda-forge
fonts-conda-forge  1  0  conda-forge
freetype  2.12.1  h267a509_2  conda-forge
fribidi  1.0.10  h36c2ea0_0  conda-forge
gcc_impl_linux-64  13.2.0  h9eb54c0_13  conda-forge
gfortran_impl_linux-64  13.2.0  h9efe08d_13  conda-forge
graphite2  1.3.13  h59595ed_1003  conda-forge
gsl  2.7  he838d99_0  conda-forge
gxx_impl_linux-64  13.2.0  h2a599c4_13  conda-forge
harfbuzz  8.5.0  hfac3d4d_0  conda-forge
icu  73.2  h59595ed_0  conda-forge
kernel-headers_linux-64  2.6.32  he073ed8_17  conda-forge
keyutils  1.6.1  h166bdaf_0  conda-forge
krb5  1.21.3  h659f571_0  conda-forge
ld_impl_linux-64  2.40  hf3520f5_7  conda-forge
lerc  4.0.0  h27087fc_0  conda-forge
libblas  3.9.0  22_linux64_openblas  conda-forge
libcblas  3.9.0  22_linux64_openblas  conda-forge
libcurl  8.8.0  hca28451_0  conda-forge
libdeflate  1.20  hd590300_0  conda-forge
libedit  3.1.20191231  he28a2e2_2  conda-forge
libev  4.33  hd590300_2  conda-forge
libexpat  2.6.2  h59595ed_0  conda-forge
libffi  3.4.2  h7f98852_5  conda-forge
libgcc-devel_linux-64  13.2.0  hdb50d1a_113  conda-forge
libgcc-ng  13.2.0  h77fa898_13  conda-forge
libgfortran-ng  13.2.0  h69a702a_13  conda-forge
libgfortran5  13.2.0  h3d2ce59_13  conda-forge
libglib  2.80.2  h8a4344b_1  conda-forge
libgomp  13.2.0  h77fa898_13  conda-forge
libiconv  1.17  hd590300_2  conda-forge
libjpeg-turbo  3.0.0  hd590300_1  conda-forge
liblapack  3.9.0  22_linux64_openblas  conda-forge
libnghttp2  1.58.0  h47da74e_1  conda-forge
libopenblas  0.3.27  pthreads_h413a1c8_0  conda-forge
libpng  1.6.43  h2797004_0  conda-forge
libsanitizer  13.2.0  h6ddb7a1_13  conda-forge
libssh2  1.11.0  h0841786_0  conda-forge
libstdcxx-devel_linux-64  13.2.0  hdb50d1a_113  conda-forge
libstdcxx-ng  13.2.0  hc0a3c3a_13  conda-forge
libtiff  4.6.0  h1dd3fc0_3  conda-forge
libuuid  2.38.1  h0b41bf4_0  conda-forge
libwebp-base  1.4.0  hd590300_0  conda-forge
libxcb  1.16  hd590300_0  conda-forge
libzlib  1.3.1  h4ab18f5_1  conda-forge
make  4.3  hd18ef5c_1  conda-forge
ncurses  6.5  h59595ed_0  conda-forge
openssl  3.3.1  h4ab18f5_1  conda-forge
pango  1.54.0  h84a9a3c_0  conda-forge
pcre2  10.44  h0f59acf_0  conda-forge
pixman  0.43.2  h59595ed_0  conda-forge
pthread-stubs  0.4  h36c2ea0_1001  conda-forge
r-base  4.3.3  he2d9a6e_3  conda-forge
r-stringi  1.8.4  r43hbd1cc82_0  conda-forge
readline  8.2  h8228510_1  conda-forge
sed  4.8  he412f7d_0  conda-forge
sysroot_linux-64  2.12  he073ed8_17  conda-forge
tk  8.6.13  noxft_h4845f30_101  conda-forge
tktable  2.10  h8bc8fbc_6  conda-forge
xorg-kbproto  1.0.7  h7f98852_1002  conda-forge
xorg-libice  1.1.1  hd590300_0  conda-forge
xorg-libsm  1.2.4  h7391055_0  conda-forge
xorg-libx11  1.8.9  hb711507_1  conda-forge
xorg-libxau  1.0.11  hd590300_0  conda-forge
xorg-libxdmcp  1.1.3  h7f98852_0  conda-forge
xorg-libxext  1.3.4  h0b41bf4_2  conda-forge
xorg-libxrender  0.9.11  hd590300_0  conda-forge
xorg-libxt  1.3.0  hd590300_1  conda-forge
xorg-renderproto  0.11.1  h7f98852_1002  conda-forge
xorg-xextproto  7.3.0  h0b41bf4_1003  conda-forge
xorg-xproto  7.0.31  h7f98852_1007  conda-forge
xz  5.2.6  h166bdaf_0  conda-forge
zlib  1.3.1  h4ab18f5_1  conda-forge
zstd  1.5.6  ha6fb4c9_0  conda-forge
(/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env) []$ R
R version 4.3.3 (2024-02-29) -- "Angel Food Cake"
...
> .libPaths()
[1] "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library"
> write.table(installed.packages()[,c(1,2,3:4)])
"Package" "LibPath" "Version" "Priority"
"base" "base" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "4.3.3" "base"
"base64enc" "base64enc" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "0.1-3" NA
"cli" "cli" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "3.6.3" NA
"compiler" "compiler" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "4.3.3" "base"
"crayon" "crayon" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "1.5.3" NA
"datasets" "datasets" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "4.3.3" "base"
"digest" "digest" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "0.6.36" NA
"evaluate" "evaluate" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "0.24.0" NA
"fansi" "fansi" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "1.0.6" NA
"fastmap" "fastmap" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "1.2.0" NA
"glue" "glue" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "1.7.0" NA
"graphics" "graphics" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "4.3.3" "base"
"grDevices" "grDevices" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "4.3.3" "base"
"grid" "grid" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "4.3.3" "base"
"htmltools" "htmltools" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "0.5.8.1" NA
"IRdisplay" "IRdisplay" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "1.1" NA
"IRkernel" "IRkernel" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "1.3.2" NA
"jsonlite" "jsonlite" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "1.8.8" NA
"lifecycle" "lifecycle" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "1.0.4" NA
"methods" "methods" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "4.3.3" "base"
"parallel" "parallel" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "4.3.3" "base"
"pbdZMQ" "pbdZMQ" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "0.3-11" NA
"pillar" "pillar" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "1.9.0" NA
"repr" "repr" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "1.1.7" NA
"rlang" "rlang" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "1.1.4" NA
"splines" "splines" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "4.3.3" "base"
"stats" "stats" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "4.3.3" "base"
"stats4" "stats4" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "4.3.3" "base"
"stringi" "stringi" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "1.8.4" NA
"tcltk" "tcltk" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "4.3.3" "base"
"tools" "tools" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "4.3.3" "base"
"utf8" "utf8" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "1.2.4" NA
"utils" "utils" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "4.3.3" "base"
"uuid" "uuid" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "1.2-0" NA
"vctrs" "vctrs" "/cluster/medbow/project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library" "0.6.5" NA
[]$ ls /project/<project-name>/software/conda-envs/r_4.3.3_env/lib/R/library/
base       compiler  digest    fastmap   grDevices  IRdisplay  lifecycle  pbdZMQ  rlang    stats4   tools         utils
base64enc  crayon    evaluate  glue      grid       IRkernel   methods    pillar  splines  stringi  translations  uuid
cli        datasets  fansi     graphics  htmltools  jsonlite   parallel   repr    stats    tcltk    utf8          vctrs
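The gap between the two listings can be worked out mechanically. The sketch below is illustrative only: untracked_by_conda is a made-up helper, the inputs are abbreviated copies of the listings above, and it assumes conda's R packages follow the lower-case r-<package-name> naming used elsewhere on this page.

```r
# Report R packages in the library that `conda list` does not account for.
# untracked_by_conda is a hypothetical helper; it assumes conda names R
# packages as "r-<package-name>" in lower case.
untracked_by_conda <- function(conda_names, r_packages, base_packages) {
  # Keep only conda entries that look like R packages, then drop the prefix.
  conda_r <- sub("^r-", "", grep("^r-", conda_names, value = TRUE))
  # Base packages ship with r-base, so they are not expected as separate entries.
  candidates <- setdiff(r_packages, base_packages)
  candidates[!tolower(candidates) %in% conda_r]
}

# Abbreviated values copied from the listings above.
conda_names <- c("r-base", "r-stringi", "readline", "zlib")
r_packages  <- c("base", "cli", "IRkernel", "stats", "stringi", "vctrs")
base_pkgs   <- c("base", "stats")

untracked <- untracked_by_conda(conda_names, r_packages, base_pkgs)
print(untracked)
# [1] "cli"      "IRkernel" "vctrs"
```

In a live session the base package names could be taken from rownames(installed.packages(priority = "base")) instead of being typed in by hand.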

Track the Building of Your Environments

To record and track how your environment is made up, you will need to use a combination of:

  • System: module load r/<version>

  • R: .libPaths()

  • R: install.packages()

  • Conda: conda list/conda env export/conda env create

Be aware that updating a package might update all its dependencies.

The order you install packages might also make a difference.
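One way to see what an install or update actually changed is to compare the package table before and after. This is a rough sketch with made-up helper names (take_snapshot, diff_snapshots); if the same package sits in more than one library it will appear more than once in the result.

```r
# Record package names and versions as a plain data frame.
take_snapshot <- function() {
  m <- installed.packages()
  data.frame(Package = unname(m[, "Package"]),
             Version = unname(m[, "Version"]),
             stringsAsFactors = FALSE)
}

# Return packages that were added, removed, or changed version.
diff_snapshots <- function(before, after) {
  merged <- merge(before[, c("Package", "Version")],
                  after[, c("Package", "Version")],
                  by = "Package", all = TRUE,
                  suffixes = c(".before", ".after"))
  changed <- is.na(merged$Version.before) | is.na(merged$Version.after) |
    merged$Version.before != merged$Version.after
  merged[changed, , drop = FALSE]
}

before <- take_snapshot()
# ... run install.packages() or update.packages() here ...
after <- take_snapshot()
changes <- diff_snapshots(before, after)
print(changes)
```

A row with NA in Version.before is a newly installed package, which is how extra dependencies pulled in by an install would show up.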


Install R Packages with a Specific Version

R’s base install.packages() installs whatever version the repository currently offers; it only lets you install a specific version of a package when you’ve downloaded the source for that version.

In contrast, conda install does allow you to define a specific version, for example conda install r-stringi=1.8.4.

There are a number of R packages to assist you:

  • remotes: R Package Installation from Remote Repositories, Including 'GitHub'

  • versions: Query and Install Specific Versions of Packages on CRAN

  • devtools: Tools to Make Developing R Packages Easier
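As an illustration of the first option, the sketch below wraps remotes::install_version() so that a pinned version is only installed when it is not already present. The wrapper name ensure_version is an assumption made for this example, and the final call is left commented out because it needs the remotes package and network access.

```r
# Install a pinned version only when the installed one differs.
# ensure_version is a hypothetical helper; remotes::install_version() is the
# remotes function that installs a specific package version from CRAN.
ensure_version <- function(pkg, version, installer = remotes::install_version) {
  # packageVersion() raises an error when the package is not installed.
  current <- tryCatch(utils::packageVersion(pkg), error = function(e) NULL)
  if (!is.null(current) && current == package_version(version)) {
    return(invisible(FALSE))  # already at the requested version
  }
  # upgrade = "never" asks remotes not to upgrade dependencies already installed.
  installer(pkg, version = version, upgrade = "never")
  invisible(TRUE)
}

# Example call, not run here:
# ensure_version("stringi", "1.8.4")
```

The function returns TRUE when it had to install something and FALSE otherwise, which makes it easy to log what a setup script changed.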


Suggested Best Practices

  • For specific projects/research focuses, create specific libraries and/or conda environments (with everything installed within that conda environment) to localize used packages/versions.

  • Regularly track/update what packages you’re using (install.packages / conda install r-<package-name>) and their versions.

  • Be mindful of the dependencies that a package additionally installs.

  • Be mindful when prompted whether you want to update dependencies or not.

  • Avoid trying to have a behemoth of a single environment - consider having a number of independent environments/libraries that can be more easily managed and shared along a workflow/pipeline.
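For the project-specific library suggestion, a minimal base R sketch might look like the following. The directory name is a placeholder (tempdir() is used here only so the example leaves nothing behind); in practice you would point it at a folder in your project space.

```r
# Create a project-specific library and put it first on the search path.
# The directory is a placeholder; use a folder inside your own project.
project_lib <- file.path(tempdir(), "my-project-rlib")
dir.create(project_lib, recursive = TRUE, showWarnings = FALSE)

# Prepend the project library; install.packages() installs into the first
# entry of .libPaths() by default.
.libPaths(c(project_lib, .libPaths()))
print(.libPaths()[1])

# Packages can also be installed there explicitly (not run here):
# install.packages("stringr", lib = project_lib)
```

The .libPaths() change only lasts for the current R session, so it needs to be repeated at the start of each script or session that should use the project library.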