Example 01: Python, Numpy and Pandas Environment
Goal: Demonstrate a basic conda environment creation workflow by creating a Python environment that contains the numpy and pandas packages.
General Process
The commands below demonstrate the general process:
Set up your session to use Conda via a module load.
Create a conda environment.
Activate your environment and install packages.
Use what you’ve installed.
Deactivate your environment when you have finished using it.
[]$ module purge
[]$ module load miniconda3/24.3.0
[]$ conda search python
[]$ conda create -n py_env
[]$ conda activate py_env
(py_env) []$ conda install python=3.12.4
(py_env) []$ python --version
Python 3.12.4
(py_env) []$ conda search numpy
(py_env) []$ conda install numpy
(py_env) []$ python -c "import numpy; print(numpy.__version__)"
1.26.4
(py_env) []$ conda deactivate
[]$
Search for Packages
Conda enables you to search for packages and see which versions are available.
[]$ conda search python
Loading channels: done
# Name Version Build Channel
python 2.7.13 hac47a24_15 pkgs/main
...
python 3.12.3 h996f2a0_1 pkgs/main
python 3.12.4 h5148396_1 pkgs/main
Notice: Although Python version 2 has been deprecated, it is still used for old packages/modules/scripts. You can create a conda environment that provides this old, no longer supported version.
This is how ARCC provides this version on the cluster.
Create an Environment
Use the create subcommand to create your environment.
The -n option names your environment and creates it (it is just a folder) in the default configured location.
Out of the box, this will be under: /home/<username>/.conda/envs/
[]$ conda create -n py_env
Channels:
- defaults
Platform: linux-64
Collecting package metadata (repodata.json): done
Solving environment: done
## Package Plan ##
environment location: /home/salexan5/.conda/envs/py_env
Proceed ([y]/n)? y
Note the location that the environment will be saved to: environment location: /home/salexan5/.conda/envs/py_env
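To make the default location concrete, here is a small Python sketch that builds the path where an environment created with -n would be expected to live. It assumes the out-of-the-box location described above (.conda/envs under your home directory). The helper name default_env_path is hypothetical, and a site or user configuration may place environments elsewhere.

```python
from pathlib import Path


def default_env_path(env_name, home=None):
    """Return the expected folder for a named conda environment.

    Assumes the default location described above:
    <home>/.conda/envs/<env_name>. This is an illustrative helper,
    not part of conda itself.
    """
    home = Path.home() if home is None else Path(home)
    return home / ".conda" / "envs" / env_name


if __name__ == "__main__":
    path = default_env_path("py_env")
    print(path)
    # The folder only exists once 'conda create -n py_env' has completed.
    print("exists:", path.is_dir())
```

If the printed path does not match the environment location reported by conda create, trust the conda output.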
Create an Environment: Proceed
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
# $ conda activate py_env
#
# To deactivate an active environment, use
#
# $ conda deactivate
Note the command required to activate your environment: conda activate py_env
This will be required every time you wish to use it.
Activate an Environment
Before you can use a conda environment it must be activated.
Every time you want to use it, you must remember to activate it.
[]$ conda activate py_env
(py_env) []$
Note: See how the command line prompt has changed once the environment has been activated: (py_env) [...]
This indicates the name of the conda environment that is currently active.
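The prompt prefix is a visual cue for you. A script can make a similar check by reading the CONDA_DEFAULT_ENV environment variable, which conda activate sets to the environment's name. The sketch below is a minimal illustration; the helper name active_conda_env is hypothetical.

```python
import os


def active_conda_env(environ=None):
    """Return the name of the active conda environment, or None if there is none.

    'environ' defaults to the real process environment; passing a dict
    makes the function easy to try out without activating anything.
    """
    environ = os.environ if environ is None else environ
    # 'conda activate py_env' sets CONDA_DEFAULT_ENV to "py_env".
    return environ.get("CONDA_DEFAULT_ENV") or None


if __name__ == "__main__":
    name = active_conda_env()
    if name is None:
        print("No conda environment is active.")
    else:
        print(f"Active conda environment: {name}")
```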
Conda Install a Version of Python
Once a conda environment is active, you can start to use it, for example by installing conda packages into it.
(py_env) []$ conda install python=3.12.4
Note all the additional dependencies/libraries that are being installed into the environment.
Let’s check the version installed within our active environment.
(py_env) []$ python --version
Python 3.12.4
Conda Install the numpy Package
Let’s first check if numpy is available as a package, and if it is, which versions are available.
(py_env) []$ conda search numpy
Loading channels: done
# Name Version Build Channel
numpy 1.9.3 py27_nomklhbee5d10_3 pkgs/main
...
numpy 1.26.4 py39heeff2f4_0 pkgs/main
(py_env) []$ conda install numpy
(Again) Note: See all the additional dependencies/libraries that are being installed into the environment.
If no package version is specified, conda will typically install the latest version.
Let’s check the version installed within our active environment.
(py_env) []$ python -c "import numpy; print(numpy.__version__)"
1.26.4
Packages are updated on a frequent basis.
When this workshop was created, numpy/1.26.4 was the latest version. As of 2024-08-15, version 2.0.1 is available, and in a few weeks a newer version will likely be available. If you do not define a version, then the latest will be installed.
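Because the version you get depends on when you run the install, it can help to compare version numbers in a script. The sketch below uses only the standard library. The helper name version_tuple is hypothetical, and it assumes plain dotted numeric versions such as 1.26.4 (no release-candidate or other suffixes).

```python
def version_tuple(text):
    """Convert a dotted version string such as '1.26.4' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


if __name__ == "__main__":
    # Comparing tuples orders versions numerically ...
    print(version_tuple("1.9.3") < version_tuple("1.26.4"))  # True
    # ... whereas comparing the raw strings does not ("9" sorts after "2").
    print("1.9.3" < "1.26.4")  # False
    print(version_tuple("1.26.4") < version_tuple("2.0.1"))  # True
```

To get a specific version rather than the latest, name it at install time in the same way as was done for Python, for example: conda install numpy=1.26.4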
Conda Deactivate your Environment
(py_env) []$ conda deactivate
[]$
Note how the command line prompt has changed, reverting to what it was before activation: [...]
The conda environment is no longer active and cannot be used.
[]$ python --version
Python 3.12.2
[]$ python -c "import numpy; print(numpy.__version__)"
Traceback (most recent call last):
File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'numpy'
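A script can guard against being run outside its environment by checking that its packages are importable before using them. This is a minimal sketch built on the standard library's importlib.util.find_spec. The helper name missing_packages and the hint text are illustrative and not part of the workshop material.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(["numpy"])
    if missing:
        print("Missing:", ", ".join(missing))
        print("Hint: is the conda environment active? Try: conda activate py_env")
    else:
        print("All required packages are importable.")
```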
(Re-)Using the Environment
To re-use a conda environment, you must activate it again.
But, you do not have to re-create it. Creation only happens once.
[]$ module purge
[]$ module load miniconda3/24.3.0
[]$ conda activate py_env
(py_env) []$ python --version
Python 3.12.4
(py_env) []$ python -c "import numpy; print(numpy.__version__)"
1.26.4
(py_env) []$ conda deactivate
[]$
Note: The conda environment must be activated to use it.
Adding to an Existing Environment
Although we only need to create a conda environment once, we can add to it or update it every time we use it.
Let’s go back into our existing conda environment and add the pandas package.
[]$ module purge
[]$ module load miniconda3/24.3.0
[]$ conda activate py_env
(py_env) []$ conda install pandas
(py_env) []$ python py_test.py
Python: 3.12.4 | packaged by Anaconda, Inc. | (main, Jun 18 2024, 15:12:24) [GCC 11.2.0]
Numpy: 1.26.4
Pandas: 2.2.2
(py_env) []$ conda deactivate
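The contents of py_test.py are not shown in this material. A minimal sketch that would print output in the same shape as above might look like the following. It is an assumption, not the workshop's actual script. The try/except is added so that the sketch also runs where numpy or pandas is not installed.

```python
import importlib
import sys


def package_version(name):
    """Return the package's __version__ string, or 'not installed'."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return "not installed"
    return getattr(module, "__version__", "unknown")


def report():
    """Build the three report lines: Python, Numpy, Pandas."""
    return [
        f"Python: {sys.version}",
        f"Numpy: {package_version('numpy')}",
        f"Pandas: {package_version('pandas')}",
    ]


if __name__ == "__main__":
    print("\n".join(report()))
```

Run with the environment active, this reports the versions installed in py_env. Run without it, the Numpy and Pandas lines would read "not installed" instead of raising ModuleNotFoundError.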