Using R and RStudio on the Cluster

Introduction: This workshop discusses how the various R-related tools and RStudio work together on the cluster, and introduces a series of best practices for managing these environments.

Course Goals:

  • Using R from the command line.

  • Refresher on creating R Conda environments.

  • Where are libraries installed?

  • Using RStudio via OnDemand.

    • Using a Conda environment within RStudio.

  • Introducing aspects of using R in parallel on the cluster.


This is not a workshop on learning the R language, but on how to use R on the cluster.

Notes:

  • The workshop modules work best when followed in sequence, since together they tell a story that introduces concepts and provides examples, but individual sections can be used separately to focus on a particular concept.

  • We have tried to make the examples as generic as possible. You will need to replace <project-name> and <username> with appropriate values that apply to you.

  • This tutorial is available for download as a PDF here.


  1. Where are R Packages Installed on the Cluster? Understand where R installs packages and where libraries are located, as well as how to inspect the general R system configuration.

  2. R Conda Environments and Installed Packages: Understand R environments built with Conda.

  3. R Packages and System Modules: Installing some R packages requires understanding what libraries are available on the system.

  4. Creating a Shared Library of R Packages: Demonstrate how to use an R library to create a shared set of R packages.

  5. Using R and RStudio within OnDemand: Detail the process of using R and RStudio via the OnDemand service.

  6. Using an R Conda Environment with RStudio: Detail how to use an R Conda Environment within RStudio.

  7. Create an R Kernel for a Jupyter Notebook: Detail how to update an R Conda environment so it can be used as a kernel within ARCC’s Jupyter service.

  8. R Environments and Reproducibility: Introduce ideas and practices to assist in managing the reproducibility of R environments.

  9. Parallel R: Introduction: Introduce some high-level aspects of using R in parallel on the cluster.

  10. Using R/RStudio on the Cluster: Summary: Summarize the concepts covered across the workshop.