Goals:
Explain the reasoning for using Conda environments rather than other available options
Walk through the process for creating personalized kernels from a Conda environment
This training is available for download as a PDF here.
Why use Conda?
Also: pip installs everything to one central, consolidated location. Changes for one set of packages or software can unintentionally affect others.
Why make your own kernels? Why have multiple kernels?
Individual researchers on ARCC systems often need access to software and modules specific to their focus of study or research.
While ARCC provides general “base” kernels, these typically do not include every software package a given researcher needs. To work around this, researchers may install packages from within the Jupyter notebook (with !pip for Python or install.packages() for R).
ARCC recommends against this practice because, by default, it installs the packages within the user’s $HOME, which creates a new set of issues later.
Issue 1: Storage Quotas
$HOME directories on HPC are usually relatively small compared to storage in /project or /gscratch. ARCC provides a default quota on HPC for each user’s $HOME, but frequent Python pip or R install.packages() installations can fill it quickly and push you over the storage quota in your $HOME directory.
While installation locations can be redirected, several other configuration changes are then needed to make the packages available and working properly.
It’s far more straightforward to create Conda environments and redirect Conda installations with the -p flag (to specify the installation path).
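As a minimal sketch of the -p approach, the snippet below assembles a conda create command that places the environment under a project directory. The project path and the package list are hypothetical placeholders, not ARCC defaults; substitute your own.

```python
import shlex


def conda_create_command(prefix, packages):
    """Return the argument list for `conda create -p <prefix> <packages...>`."""
    # -p (long form: --prefix) sets the directory the environment is created in,
    # so it can live in project storage rather than under $HOME.
    return ["conda", "create", "-p", str(prefix), *packages]


# Hypothetical project path and packages, for illustration only.
cmd = conda_create_command(
    "/project/myproject/conda-envs/analysis",
    ["python=3.11", "ipykernel"],
)
print(shlex.join(cmd))
```

The printed line can be pasted into a terminal where conda is available, or the list can be passed to subprocess.run(cmd, check=True).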
Issue 2: Software Conflicts
By default, local installs go to one central location within your home directory (pip under ~/.local, install.packages() under ~/R). Package installations are separated by kernel software version, but if conflicts exist within the overall install location, packages are overwritten to make the software and dependencies “fit” your most recently requested installation. This can change the installed and available packages for your user profile, breaking older installations and software that you still want to use.
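To see how much these per-user install locations currently hold, a short script along the following lines can help. It is only a sketch: the pip location is taken from Python's own site module, while the R location is assumed to be ~/R as described above and may differ on your account.

```python
import os
import site
from pathlib import Path


def dir_size_bytes(path):
    """Total size of regular files under path (0 if path does not exist)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            try:
                if not os.path.islink(file_path):
                    total += os.path.getsize(file_path)
            except OSError:
                # Skip files that vanish or cannot be inspected.
                continue
    return total


locations = {
    "pip user base": Path(site.getuserbase()),
    "R user library (assumed ~/R)": Path.home() / "R",
}
for label, location in locations.items():
    size_mib = dir_size_bytes(location) / 2**20
    print(f"{label}: {location} ({size_mib:.1f} MiB)")
```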
Installing in your $HOME directory makes packages and versions of software in your /home available to you regardless of whether you want them at the time. This causes software conflicts between versions native to the HPC system, those you may want to load on and off with modules, and those you’ve installed in $HOME. As a result, loaded modules or native HPC software may not work properly, or may crash, because their underlying dependencies can be superseded by packages in $HOME.
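One way to check whether a package is being picked up from $HOME rather than from a module or environment is to look at where Python actually imported it from. The helper below is an illustrative sketch (the function name is made up for this example); it compares a module's file location with the per-user site-packages directory.

```python
import importlib
import site
from pathlib import Path


def comes_from_user_site(module_name):
    """Return True if the module is imported from the per-user site-packages."""
    module = importlib.import_module(module_name)
    module_file = getattr(module, "__file__", None)
    if module_file is None:
        # Built-in modules have no file on disk.
        return False
    user_site = Path(site.getusersitepackages()).resolve()
    return user_site in Path(module_file).resolve().parents


print("user site enabled:", site.ENABLE_USER_SITE)
print("json from user site:", comes_from_user_site("json"))
```

Replace "json" with the package you are troubleshooting. Setting the PYTHONNOUSERSITE environment variable before starting Python keeps the per-user site-packages directory off the import path.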
Dependency Hell is not Limited to Python: Examples in R
Kernels from Conda Environments
In OnDemand Jupyter sessions, you can launch a Jupyter session that uses the software packages from a Conda environment you’ve set up yourself.
This becomes very useful if you are using a package that requires extensions be installed in the environment that is launching the Jupyter session.
To configure your environment so that it launches as you want, you should ensure that the appropriate packages are installed within the Conda environment.
In this module, we will go through the steps needed to correctly create your environment and configure it as a kernel available to you within your Jupyter sessions.
Exporting your Conda Environment to a Kernel (Python)
In these steps, we assume you’ve already created your desired Conda environment. To learn how to create a Conda environment, please see our Conda module.
You should also return to the documentation on exporting conda environments to a python kernel if you have not completed this training.
With our Conda environment already installed and configured, we can now set it up to be used as a Jupyter kernel.
For Python:
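A common way to register a Conda environment as a Python kernel is to install ipykernel into the environment, activate it, and run: python -m ipykernel install --user --name my-conda-env --display-name "Python (my-conda-env)". The result is a small kernel.json file in a per-user kernels directory (on Linux, ~/.local/share/jupyter/kernels/<name>/). The sketch below writes an equivalent file by hand to show what the kernel definition contains. The environment name, display name, and interpreter path are hypothetical, and the demo writes to a temporary directory rather than your real kernels directory.

```python
import json
import tempfile
from pathlib import Path


def write_python_kernelspec(kernels_dir, name, display_name, env_python):
    """Create <kernels_dir>/<name>/kernel.json that launches env_python."""
    spec_dir = Path(kernels_dir) / name
    spec_dir.mkdir(parents=True, exist_ok=True)
    spec = {
        # Jupyter fills in {connection_file} when it starts the kernel.
        "argv": [str(env_python), "-m", "ipykernel_launcher",
                 "-f", "{connection_file}"],
        "display_name": display_name,
        "language": "python",
    }
    spec_path = spec_dir / "kernel.json"
    spec_path.write_text(json.dumps(spec, indent=2))
    return spec_path


# Hypothetical names and paths; written to a throwaway directory for the demo.
demo_dir = tempfile.mkdtemp()
path = write_python_kernelspec(
    demo_dir,
    "my-conda-env",
    "Python (my-conda-env)",
    "/project/myproject/conda-envs/analysis/bin/python",
)
print(path.read_text())
```

The first entry of argv is what ties the kernel to the environment: it must be the Python interpreter inside the Conda environment, and ipykernel must be installed there.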
Exporting your Conda Environment to a Kernel (R)
In these steps, we assume you’ve already created your desired Conda environment. To learn how to create a Conda environment, please see our Conda module.
Note: A more in-depth description, with background on creating an R kernel, is available in our R and RStudio Training Materials, in the section Create an R Kernel for Jupyter.
With our Conda environment already installed and configured, we can now set it up to be used as a Jupyter kernel.
For R:
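For R, the kernel is usually registered from within R itself, with the IRkernel package installed in the environment and a call such as IRkernel::installspec(name = "r-analysis", displayname = "R (r-analysis)"). As a rough sketch of what that registration contains, the snippet below builds the kernel definition as a Python dictionary. The environment path and names are hypothetical, and the exact file IRkernel writes may differ in detail.

```python
import json


def r_kernelspec(env_r_binary, display_name):
    """Return a kernel definition that starts IRkernel with the given R binary."""
    return {
        # Jupyter fills in {connection_file} when it starts the kernel.
        "argv": [str(env_r_binary), "--slave", "-e", "IRkernel::main()",
                 "--args", "{connection_file}"],
        "display_name": display_name,
        "language": "R",
    }


# Hypothetical Conda environment path and label, for illustration only.
spec = r_kernelspec(
    "/project/myproject/conda-envs/r-analysis/bin/R",
    "R (r-analysis)",
)
print(json.dumps(spec, indent=2))
```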
Running a Console Kernel
Open a new Jupyter Session
Select your new kernel from the dropdown list
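Once the new kernel is selected, a quick check in the first notebook cell can confirm that the session is running from your Conda environment and not from a base kernel. This is a small sketch for a Python kernel; the helper name is made up for this example.

```python
import sys


def kernel_environment():
    """Collect the interpreter details of the running kernel."""
    return {
        "executable": sys.executable,   # path to the Python binary in use
        "prefix": sys.prefix,           # root directory of the environment
        "version": sys.version.split()[0],
    }


info = kernel_environment()
for key, value in info.items():
    print(f"{key}: {value}")
```

If the kernel was set up correctly, the prefix should be the path of your Conda environment.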
Next Steps
Use the following link to provide feedback on this training: https://forms.gle/48Cy6mmY1gniN9Jy5 or use the QR code below.