Goals:

  • Explain the reasoning for using Conda environments rather than other available options

  • Walk through the process for creating personalized kernels from a Conda environment

This training is available for download as a PDF here.



Why do I need to use Conda?

By default, pip installs everything to one central, consolidated location. Changes made for one set of packages or software can unintentionally affect others.


Why make your own kernels? Why have multiple kernels?

Individual researchers on ARCC systems often need access to software and modules specific to their field of study or research.

While ARCC provides general “base” kernels, these typically do not include every software package a given researcher needs. To work around this, researchers may perform installs within the Jupyter notebook (with !pip for Python or install.packages() for R).

ARCC recommends against this practice because, by default, it installs the packages within the user’s $HOME, which creates a new set of issues later.


Issue 1: Storage Quotas

  1. $HOME directories on HPC are usually relatively small compared to storage in /project or /gscratch. ARCC sets a default quota on each user’s $HOME on HPC, and frequent Python pip or R install.packages() installations can fill it quickly, causing you to exceed the storage quota in your $HOME directory.

    1. While installation locations can be redirected, several other configuration changes are then needed for the packages to be found and to work properly.

    2. It’s far more straightforward to create Conda environments and redirect Conda installations with the -p flag (to specify the installation path).
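As a minimal sketch of the two points above, the following shell helpers first report how much space per-user installs occupy, then show what a Conda install redirected with -p could look like. The function names report_usage and create_env_at, the environment name myenv, and the DRY_RUN switch are hypothetical and used only for illustration. The ~/.local and ~/R locations are the defaults named in this training.

```shell
# Report how much space per-user package installs occupy.
# Any directory can be passed in; missing ones are reported, not treated as errors.
report_usage() {
  for dir in "$@"; do
    if [ -d "$dir" ]; then
      du -sh "$dir" 2>/dev/null || echo "could not read: $dir"
    else
      echo "not present: $dir"
    fi
  done
}

# Hypothetical helper: create a Conda environment at an explicit path with -p.
# It only prints the command unless DRY_RUN=0 is set, so it is safe to try.
create_env_at() {
  prefix="$1"
  shift
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "conda create -y -p $prefix $*"
  else
    conda create -y -p "$prefix" "$@"
  fi
}

# Example usage: check the default pip and R locations in $HOME,
# then preview an environment placed under a project directory.
report_usage "$HOME/.local" "$HOME/R"
create_env_at "/project/<project_name>/software/conda/myenv" python
```

Replace <project_name> with your own project before running with DRY_RUN=0. The path layout shown simply mirrors the example path used later in this training.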


Issue 2: Software Conflicts

  1. By default, local installs go to one central location within your home directory (pip under ~/.local, install.packages() under ~/R).

    1. Package installations are separated by kernel software version, but if conflicts exist within the overall install location, packages are overwritten to make the software and dependencies “fit” your most recently requested installation. This can change the installed and available packages for your user profile, breaking older installations and software that you still want to use.

    2. Installing in your $HOME directory makes those packages and software versions available to you at all times, whether or not you want them at that moment.

    3. This causes software conflicts between versions native to the HPC system, those you may want to load and unload with modules, and those you’ve installed in $HOME. Loaded modules or native HPC software may not work properly, or may crash, because their underlying dependencies can be superseded by packages in $HOME.
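To see this effect for Python, the sketch below prints the per-user install directory of the active interpreter and shows that setting the standard PYTHONNOUSERSITE environment variable makes Python ignore it. This is offered only as a diagnostic aid, not as an ARCC-recommended fix, and it assumes a python3 executable is on your PATH.

```shell
# Inspect how per-user (pip --user) packages interact with the active Python.
# Assumes python3 is on PATH; the whole check is skipped otherwise.
if command -v python3 >/dev/null 2>&1; then
  # Directory that per-user pip installs use for this interpreter.
  # (The command exits non-zero when the user site is disabled, hence "|| true".)
  python3 -m site --user-site || true

  # With PYTHONNOUSERSITE set, Python leaves that directory off sys.path.
  no_user_site=$(PYTHONNOUSERSITE=1 python3 -c 'import sys; print(sys.flags.no_user_site)')
  echo "no_user_site flag with PYTHONNOUSERSITE=1: $no_user_site"
fi
```

If the first command prints a path under ~/.local, any packages installed there are visible to every Python of that version you run, including ones provided by modules.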


Kernels from Conda Environments

In the Conda module, we learned that conda allows software environments to be contained, meaning they do not conflict with one another, and can be loaded and unloaded so that they are exposed to you only when you want them to be.

From OnDemand, you can launch a Jupyter session that uses the software packages from a Conda environment you’ve set up yourself.

  • This becomes very useful if you are using a package that requires extensions be installed in the environment that is launching the Jupyter session.

  • To configure your environment so that it launches as you want, you should ensure that the appropriate packages are installed within the Conda environment.

  • In this module, we will go through steps needed to correctly create your environment and configure it as a kernel available to you within your Jupyter sessions.


Exporting your Conda Environment to a Kernel (Python)

In these steps, we assume you’ve already created your desired Conda environment. To learn how to create a Conda environment, please see our Conda module.

With our Conda environment already installed and configured, we can now set it up to be used as a Jupyter kernel.

  1. Open a Command Line Terminal on the HPC resource.

  2. Load Miniconda

  3. Activate your Conda environment

  4. Install the kernel package (ipykernel for Python kernels):
    conda install ipykernel

  5. Register your environment so it is recognized as an available kernel:
    python -m ipykernel install --user --name=<kernelname>

For Python:

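#Open command line interface
#Load Miniconda:
module load miniconda3
#Activate your Conda environment:
conda activate <insert_environment_name_or_path_here>
#Install the kernel package into the environment:
conda install ipykernel
#Register the environment as a kernel for your user:
python -m ipykernel install --user --name=mypythonkernel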
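To confirm that the kernel was registered, you can look for its kernel.json file in your per-user kernels directory. The sketch below assumes the --user install location is ~/.local/share/jupyter/kernels (the same directory used in the R steps of this training). The function name list_user_kernels is hypothetical. Where Jupyter is available, `jupyter kernelspec list` reports similar information.

```shell
# List the names of per-user Jupyter kernels by looking for kernel.json files.
# The default directory is an assumption based on the --user install location.
list_user_kernels() {
  kernels_dir="${1:-$HOME/.local/share/jupyter/kernels}"
  for spec in "$kernels_dir"/*/kernel.json; do
    # Skip the unexpanded pattern when no kernels are installed.
    [ -f "$spec" ] || continue
    basename "$(dirname "$spec")"
  done
}

# Example usage: print every kernel registered for the current user.
list_user_kernels
```

After the steps above, mypythonkernel should appear in the output if the install succeeded.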


Exporting your Conda Environment to a Kernel (R)

In these steps, we assume you’ve already created your desired Conda environment. To learn how to create a Conda environment, please see our Conda module.

With our Conda environment already installed and configured, we can now set it up to be used as a Jupyter kernel.

  1. Open a Command Line Terminal on the HPC resource.

  2. Load Miniconda

  3. Activate your Conda environment

  4. Install the kernel package (r-irkernel for R kernels):
    conda install r-irkernel

  5. Register your environment so it is recognized as an available kernel:

    1. Note the location of R in your Conda environment.

    2. Launch R: R

    3. Install the kernel from the R prompt:
      > IRkernel::installspec(name,displayname)

    4. Create a folder for your specific R version kernel in ~/.local/share/jupyter/kernels/

    5. Copy the kernelspec files from <conda-env>/lib/R/library/IRkernel/kernelspec/* (the path to your Conda environment followed by /lib/R/library/IRkernel/kernelspec/*) into that new folder under ~/.local/share/jupyter/kernels/.

For R:

#Step (1): Open command line interface
#Step (2): Load miniconda:
module load miniconda3
#Step (3): Activate your Conda environment:
conda activate <insert_environment_name_or_path_here>
#Step (4): Install the R kernel package:
conda install r-irkernel
#Step (5a): Note the R location (example output shown as a comment):
which R
# /project/<project_name>/software/conda/r/rtest/bin/R
#Step (5b): Launch R:
R
#Step (5c): Install kernel in R prompt:
> install.packages('IRkernel') 
> IRkernel::installspec(name='r4.4.1', displayname='R 4.4.1 Kernel')
> q()
#Step (5d): Change directory to jupyter kernel directory:
cd ~/.local/share/jupyter/kernels/
#Make a directory for your kernel (-p avoids an error if it already exists):
mkdir -p r4.4.1
#Change directory to new directory:
cd r4.4.1
#Step (5e): Copy kernelspec from conda to .local:
cp /project/<project_name>/software/conda/r/rtest/lib/R/library/IRkernel/kernelspec/* ~/.local/share/jupyter/kernels/r4.4.1/.
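After the copy in step 5e, it can help to confirm that the kernelspec actually landed in the kernel folder. The sketch below checks for a non-empty kernel.json and prints it so you can review the launch command it contains. The function name check_kernelspec is hypothetical, and the r4.4.1 folder name simply follows the example above.

```shell
# Check that a kernel folder contains a non-empty kernel.json and show it.
# Returns non-zero when the file is missing so the check can be scripted.
check_kernelspec() {
  spec_dir="$1"
  if [ -s "$spec_dir/kernel.json" ]; then
    echo "kernel.json found in $spec_dir"
    cat "$spec_dir/kernel.json"
  else
    echo "kernel.json missing in $spec_dir" >&2
    return 1
  fi
}

# Example usage for the kernel folder created in step 5d.
check_kernelspec "$HOME/.local/share/jupyter/kernels/r4.4.1" || true
```

If the file is reported missing, re-check the source path used in the cp command against the output of `which R` from step 5a.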

Running a Console Kernel

Open a new Jupyter Session

Select your new kernel from the dropdown list


Next Steps

Use the following link to provide feedback on this training: https://forms.gle/qBBwXpKeTNqSR5516 or use the QR code below.
