What is Conda? Using Miniconda3 on the Cluster

Goal: Introduce conda, define some terminology, show how to use it on the cluster, and explain where to find help.


What is Conda?

  • Getting Started with Conda: A powerful command line tool for package and environment management that runs on Windows, macOS, and Linux.

  • Conda Documentation: Provides package, dependency, and environment management for any language.


Getting Conda: Miniconda vs Anaconda vs Miniforge

Miniconda: (<0.5G) A free minimal installer for conda. It is a small bootstrap version of Anaconda that includes only conda, Python, the packages (<70) they both depend on, and a small number of other useful packages (like pip, zlib, and a few others).

  • This is what ARCC provides.

Anaconda: (4.4G) Anaconda® Distribution (from Anaconda, the company) is a free Python/R data science distribution that contains:

  • Conda and Anaconda Navigator (a desktop GUI application built on conda, with options to launch other development applications from your managed environments).

  • 250 automatically-installed packages and access to the Anaconda Public Repository (>8K open-source data science and ML packages).

Miniforge: This community-driven repository holds the minimal installers for Conda and Mamba (a reimplementation of the conda package manager in C++) specific to conda-forge (community-led recipes, infrastructure, and distributions for conda), which is the default and only channel.

Should I use Anaconda Distribution or Miniconda?


Terminology

Glossary:

package manager: A collection of software tools that automates the process of installing, updating, configuring, and removing computer programs for a computer's operating system.

conda: Conda is a package manager. The package and environment manager program … that installs and updates conda packages and their dependencies.

conda package: A compressed file that contains everything that a software program needs in order to be installed and run, so that you do not have to manually find and install each dependency separately.

conda environment: A folder or directory that contains a specific collection of conda packages and their dependencies, so they can be maintained and run separately without interference from each other.

conda repository: A cloud-based repository that contains packages that are easily installed.

channels: The locations of the repositories where conda looks for packages.
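To make the idea of channels more concrete, here is a minimal Python sketch of looking a package up across an ordered list of channels. It is a toy model only: the channel names (`channel_a`, `channel_b`), the package names, and the `find_channel` helper are all invented for illustration, and conda's real behavior depends on its configuration and solver.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical channels, listed from highest to lowest priority.
# Channel names and package names are invented for this sketch.
CHANNELS: List[Tuple[str, Set[str]]] = [
    ("channel_a", {"tool_x", "lib_y"}),
    ("channel_b", {"lib_y", "tool_z"}),
]


def find_channel(package: str,
                 channels: List[Tuple[str, Set[str]]] = CHANNELS) -> Optional[str]:
    """Return the first channel (in priority order) that offers `package`."""
    for name, packages in channels:
        if package in packages:
            return name
    return None  # no configured channel offers the package


for pkg in ("lib_y", "tool_z", "missing_pkg"):
    print(pkg, "->", find_channel(pkg))
```

In this sketch `lib_y` is offered by both channels, so the one listed first is chosen. That loosely mirrors why the order in which channels are listed can matter.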


Dependency Hell

The concept of dependency hell was introduced in the Module System workshop and rears its ugly head when trying to set up Python and/or R environments with a lot of packages.
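As a rough illustration of how such a conflict arises, the Python sketch below checks which versions of one shared library satisfy several packages at once. Everything in it is hypothetical (the package names, the version numbers, and the `compatible_versions` helper), and it is far simpler than the solver conda actually uses.

```python
from typing import Dict, Iterable, Set

# Versions of a shared library available in a hypothetical repository.
AVAILABLE: Dict[str, Set[str]] = {"libshared": {"1.0", "1.5", "2.0"}}

# Each hypothetical package lists the libshared versions it can work with.
REQUIREMENTS: Dict[str, Dict[str, Set[str]]] = {
    "pkg_a": {"libshared": {"1.0", "1.5"}},
    "pkg_b": {"libshared": {"2.0"}},
    "pkg_c": {"libshared": {"1.5", "2.0"}},
}


def compatible_versions(packages: Iterable[str], dependency: str) -> Set[str]:
    """Return the versions of `dependency` accepted by every listed package."""
    allowed = set(AVAILABLE[dependency])
    for name in packages:
        # A package that does not mention the dependency accepts any version.
        allowed &= REQUIREMENTS[name].get(dependency, AVAILABLE[dependency])
    return allowed


# pkg_a and pkg_c overlap on one version, so they can share an environment.
print(compatible_versions(["pkg_a", "pkg_c"], "libshared"))
# pkg_a and pkg_b have no version in common: the result is an empty set.
print(compatible_versions(["pkg_a", "pkg_b"], "libshared"))
```

With only three packages the conflict is easy to spot by eye. With dozens of packages, each carrying its own dependencies, the same intersection problem becomes hard to resolve by hand, which is one reason to keep separate environments.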


Using Miniconda3/Conda on the Cluster

[]$ module spider miniconda3
----------------------------------------------------------------------------
  miniconda3: miniconda3/24.3.0
----------------------------------------------------------------------------
    You will need to load all module(s) on any one of the lines below before the "miniconda3/24.3.0" module is available to load.

      arcc/1.0

    Help:
      The minimalist bootstrap toolset for conda and Python3.

We update miniconda3 on a semi-frequent basis.


Conda Version and Help

[]$ module purge
[]$ module load miniconda3/24.3.0
[]$ conda --version
conda 24.3.0
[]$ conda --help
usage: conda [-h] [-v] [--no-plugins] [-V] COMMAND ...

conda is a tool for managing and deploying applications, environments and packages.
...
[]$ conda install --help
usage: conda install [-h] [--revision REVISION] [-n ENVIRONMENT | -p PATH]
                     [-c CHANNEL] [--use-local] [--override-channels]
                     [--repodata-fn REPODATA_FNS] [--experimental {jlap,lock}]
                     [--no-lock] [--repodata-use-zst | --no-repodata-use-zst]
                     [--strict-channel-priority] [--no-channel-priority]
                     [--no-deps | --only-deps] [--no-pin] [--copy]
                     [--no-shortcuts] [--shortcuts-only SHORTCUTS_ONLY] [-C]
                     [-k] [--offline] [--json] [-v] [-q] [-d] [-y]
                     [--download-only] [--show-channel-urls] [--file FILE]
                     [--solver {classic,libmamba}] [--force-reinstall]
                     [--freeze-installed | --update-deps | -S | --update-all | --update-specs]
                     [-m] [--clobber] [--dev]
                     [package_spec ...]
...

[]$ conda --help
usage: conda [-h] [-v] [--no-plugins] [-V] COMMAND ...

conda is a tool for managing and deploying applications, environments and packages.

options:
  -h, --help          Show this help message and exit.
  -v, --verbose       Can be used multiple times. Once for detailed output, twice for INFO logging, thrice for DEBUG logging, four times for TRACE logging.
  --no-plugins        Disable all plugins that are not built into conda.
  -V, --version       Show the conda version number and exit.

commands:
  The following built-in and plugins subcommands are available.

  COMMAND
    activate          Activate a conda environment.
    clean             Remove unused packages and caches.
    compare           Compare packages between conda environments.
    config            Modify configuration values in .condarc.
    content-trust     Signing and verification tools for Conda
    create            Create a new conda environment from a list of specified packages.
    deactivate        Deactivate the current active conda environment.
    doctor            Display a health report for your environment.
    export            Export a given environment
    info              Display information about current conda install.
    init              Initialize conda for shell interaction.
    install           Install a list of packages into a specified conda environment.
    list              List installed packages in a conda environment.
    notices           Retrieve latest channel notifications.
    package           Create low-level conda packages. (EXPERIMENTAL)
    remove (uninstall)
                      Remove a list of packages from a specified conda environment.
    rename            Rename an existing environment.
    repoquery         Advanced search for repodata.
    run               Run an executable in a conda environment.
    search            Search for packages and display associated information using the MatchSpec format.
    update (upgrade)  Update conda packages to the latest compatible version.