Example 01: Python, Numpy and Pandas Environment

Goal: Demonstrate a basic conda environment creation workflow by creating a Python environment that contains the numpy and pandas packages.


General Process

The session below demonstrates the general process:

  • Set up your session to use Conda via a module load.

  • Create a conda environment.

  • Activate your environment and install packages.

  • Use what you’ve installed.

  • Deactivate your environment when you have finished using it.

[]$ module purge
[]$ module load miniconda3/24.3.0
[]$ conda search python
[]$ conda create -n py_env
[]$ conda activate py_env
(py_env) []$ conda install python=3.12.4
(py_env) []$ python --version
Python 3.12.4
(py_env) []$ conda search numpy
(py_env) []$ conda install numpy
(py_env) []$ python -c "import numpy; print(numpy.__version__)"
1.26.4
(py_env) []$ conda deactivate
[]$
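The same sequence can be written down for a different environment name or package list. The sketch below only builds the command strings shown in the session above; it does not run them. The function name `conda_workflow` and its parameters are hypothetical, chosen for illustration.

```python
def conda_workflow(env_name, python_version, packages, module="miniconda3/24.3.0"):
    """Return, in order, the shell commands used in the session above.

    Hypothetical helper: it only formats strings and runs nothing.
    """
    commands = [
        "module purge",
        f"module load {module}",
        f"conda create -n {env_name}",
        f"conda activate {env_name}",
        f"conda install python={python_version}",
    ]
    # One install command per requested package, as in the example session.
    commands += [f"conda install {pkg}" for pkg in packages]
    commands.append("conda deactivate")
    return commands


for command in conda_workflow("py_env", "3.12.4", ["numpy"]):
    print(command)
```

Printing the list gives a checklist you can follow by hand at the prompt.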

Search for Packages

Conda enables you to search for packages and see which versions are available.

[]$ conda search python
Loading channels: done
# Name                       Version           Build  Channel
python                        2.7.13     hac47a24_15  pkgs/main
...
python                        3.12.3      h996f2a0_1  pkgs/main
python                        3.12.4      h5148396_1  pkgs/main

Notice: Although Python version 2 has been deprecated, it is still used for old packages/modules/scripts. You can create a conda environment that provides this old, no longer supported version.

This is how ARCC provides this version on the cluster.
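If you capture the search output in a file or a string, the version column is easy to pull out. The following is a minimal sketch that assumes the four-column layout shown above (name, version, build, channel); `parse_conda_search` is a hypothetical helper, not part of conda.

```python
def parse_conda_search(output):
    """Return (name, version, build, channel) tuples from `conda search` text.

    Assumes the four-column layout shown above; other lines are ignored.
    """
    rows = []
    for line in output.splitlines():
        fields = line.split()
        # Skip the status line, the '# Name ...' header and '...' elisions.
        if len(fields) != 4 or fields[0].startswith("#"):
            continue
        rows.append(tuple(fields))
    return rows


sample = """Loading channels: done
# Name                       Version           Build  Channel
python                        2.7.13     hac47a24_15  pkgs/main
python                        3.12.3      h996f2a0_1  pkgs/main
python                        3.12.4      h5148396_1  pkgs/main
"""

versions = [version for _, version, _, _ in parse_conda_search(sample)]
print(versions)  # ['2.7.13', '3.12.3', '3.12.4']
```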


Create an Environment

Use the create subcommand to create your environment.

The -n option names your environment and creates it (an environment is just a folder) in the default configured location.

Out of the box, this will be under: /home/<username>/.conda/envs/

[]$ conda create -n py_env
Channels:
 - defaults
Platform: linux-64
Collecting package metadata (repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/salexan5/.conda/envs/py_env

Proceed ([y]/n)? y

Note the location that the environment will be saved to: environment location: /home/salexan5/.conda/envs/py_env
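As a small illustration of how that location is put together, the sketch below builds the default path from a username and an environment name. It assumes the out-of-the-box layout described above (/home/<username>/.conda/envs/); a site or user configuration may place environments elsewhere, and `default_env_path` is a hypothetical helper.

```python
from pathlib import PurePosixPath


def default_env_path(username, env_name):
    """Build the default environment folder described above.

    Assumes the out-of-the-box location /home/<username>/.conda/envs/.
    """
    return PurePosixPath("/home") / username / ".conda" / "envs" / env_name


print(default_env_path("salexan5", "py_env"))
# /home/salexan5/.conda/envs/py_env
```

To see where your environments actually live, `conda env list` prints each environment name with its folder.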


Create an Environment: Proceed

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate py_env
#
# To deactivate an active environment, use
#
#     $ conda deactivate

Note the command required to activate your environment: conda activate py_env

This will be required every time you wish to use it.


Activate an Environment

Before you can use a conda environment it must be activated.

Every time you want to use it, you must remember to activate it.

[]$ conda activate py_env
(py_env) []$

Note: See how the command line prompt has changed once the environment has been activated: (py_env) [...]

This indicates the name of the conda environment that is currently active.
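A script can make the same check that the prompt makes visually. The sketch below reads the `CONDA_DEFAULT_ENV` environment variable, which conda sets when an environment is activated; the helper name `active_conda_env` is hypothetical.

```python
import os


def active_conda_env(environ=None):
    """Return the name of the active conda environment, or None.

    Reads CONDA_DEFAULT_ENV, which `conda activate` sets in the shell.
    """
    environ = os.environ if environ is None else environ
    return environ.get("CONDA_DEFAULT_ENV") or None


name = active_conda_env()
if name is None:
    print("No conda environment is active.")
else:
    print(f"Active conda environment: {name}")
```

Passing a dictionary instead of the real environment makes the helper easy to try out, for example `active_conda_env({"CONDA_DEFAULT_ENV": "py_env"})`.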


Conda Install a Version of Python

Once a conda environment is active, you can start using it. One aspect of this is installing conda packages into it.

(py_env) []$ conda install python=3.12.4
Channels:
 - defaults
Platform: linux-64
Collecting package metadata (repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/salexan5/.conda/envs/py_env

  added / updated specs:
    - python=3.12.4

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    pip-24.0                   |  py312h06a4308_0         3.3 MB
    setuptools-69.5.1          |  py312h06a4308_0         1.3 MB
    wheel-0.43.0               |  py312h06a4308_0         142 KB
    ------------------------------------------------------------
                                           Total:         4.7 MB

The following NEW packages will be INSTALLED:

  _libgcc_mutex      pkgs/main/linux-64::_libgcc_mutex-0.1-main
  _openmp_mutex      pkgs/main/linux-64::_openmp_mutex-5.1-1_gnu
  bzip2              pkgs/main/linux-64::bzip2-1.0.8-h5eee18b_6
  ca-certificates    pkgs/main/linux-64::ca-certificates-2024.3.11-h06a4308_0
  expat              pkgs/main/linux-64::expat-2.6.2-h6a678d5_0
  ld_impl_linux-64   pkgs/main/linux-64::ld_impl_linux-64-2.38-h1181459_1
  libffi             pkgs/main/linux-64::libffi-3.4.4-h6a678d5_1
  libgcc-ng          pkgs/main/linux-64::libgcc-ng-11.2.0-h1234567_1
  libgomp            pkgs/main/linux-64::libgomp-11.2.0-h1234567_1
  libstdcxx-ng       pkgs/main/linux-64::libstdcxx-ng-11.2.0-h1234567_1
  libuuid            pkgs/main/linux-64::libuuid-1.41.5-h5eee18b_0
  ncurses            pkgs/main/linux-64::ncurses-6.4-h6a678d5_0
  openssl            pkgs/main/linux-64::openssl-3.0.14-h5eee18b_0
  pip                pkgs/main/linux-64::pip-24.0-py312h06a4308_0
  python             pkgs/main/linux-64::python-3.12.4-h5148396_1
  readline           pkgs/main/linux-64::readline-8.2-h5eee18b_0
  setuptools         pkgs/main/linux-64::setuptools-69.5.1-py312h06a4308_0
  sqlite             pkgs/main/linux-64::sqlite-3.45.3-h5eee18b_0
  tk                 pkgs/main/linux-64::tk-8.6.14-h39e8969_0
  tzdata             pkgs/main/noarch::tzdata-2024a-h04d1e81_0
  wheel              pkgs/main/linux-64::wheel-0.43.0-py312h06a4308_0
  xz                 pkgs/main/linux-64::xz-5.4.6-h5eee18b_1
  zlib               pkgs/main/linux-64::zlib-1.2.13-h5eee18b_1

Proceed ([y]/n)? y

Downloading and Extracting Packages:

Preparing transaction: done
Verifying transaction: done
Executing transaction: done

Note all the additional dependencies/libraries that are being installed into the environment.

Let’s check the version installed within our active environment.

(py_env) []$ python --version
Python 3.12.4
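The same check can be made from inside Python, which is handy at the top of a script that expects a particular interpreter. This is a minimal sketch using the standard `sys.version_info`; the helper names `python_version_string` and `meets_minimum` are hypothetical.

```python
import sys


def python_version_string(info=None):
    """Format a version tuple as 'major.minor.micro'."""
    info = sys.version_info if info is None else info
    return ".".join(str(part) for part in info[:3])


def meets_minimum(minimum, info=None):
    """Return True if the interpreter version is at least `minimum`."""
    info = sys.version_info if info is None else info
    return tuple(info[:3]) >= tuple(minimum)


print("Running Python", python_version_string())
if not meets_minimum((3, 12)):
    print("Warning: expected Python 3.12 or newer; is py_env active?")
```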

Conda Install the numpy Package

Let’s first check if numpy is available as a package, and if it is, which versions are available.

(py_env) []$ conda search numpy
Loading channels: done
# Name                       Version           Build  Channel
numpy                          1.9.3  py27_nomklhbee5d10_3  pkgs/main
...
numpy                         1.26.4  py39heeff2f4_0  pkgs/main

 

(py_env) []$ conda install numpy
Channels:
 - defaults
Platform: linux-64
Collecting package metadata (repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/salexan5/.conda/envs/py_env

  added / updated specs:
    - numpy

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    mkl-service-2.4.0          |  py312h5eee18b_1          66 KB
    mkl_fft-1.3.8              |  py312h5eee18b_0         204 KB
    mkl_random-1.2.4           |  py312hdb19cb5_0         284 KB
    numpy-1.26.4               |  py312hc5e2394_0          11 KB
    numpy-base-1.26.4          |  py312h0da6c21_0         7.7 MB
    ------------------------------------------------------------
                                           Total:         8.2 MB

The following NEW packages will be INSTALLED:

  blas               pkgs/main/linux-64::blas-1.0-mkl
  intel-openmp       pkgs/main/linux-64::intel-openmp-2023.1.0-hdb19cb5_46306
  mkl                pkgs/main/linux-64::mkl-2023.1.0-h213fc3f_46344
  mkl-service        pkgs/main/linux-64::mkl-service-2.4.0-py312h5eee18b_1
  mkl_fft            pkgs/main/linux-64::mkl_fft-1.3.8-py312h5eee18b_0
  mkl_random         pkgs/main/linux-64::mkl_random-1.2.4-py312hdb19cb5_0
  numpy              pkgs/main/linux-64::numpy-1.26.4-py312hc5e2394_0
  numpy-base         pkgs/main/linux-64::numpy-base-1.26.4-py312h0da6c21_0
  tbb                pkgs/main/linux-64::tbb-2021.8.0-hdb19cb5_0

Proceed ([y]/n)? y

Downloading and Extracting Packages:

Preparing transaction: done
Verifying transaction: done
Executing transaction: done

Note: Again, see all the additional dependencies/libraries that are being installed into the environment.

If no package version is specified, conda will typically install the latest version that is compatible with the environment.

Let’s check the version installed within our active environment.

(py_env) []$ python -c "import numpy; print(numpy.__version__)"
1.26.4

  • Packages are being updated on a frequent basis.

  • When this workshop was created numpy/1.26.4 was the latest version.

  • As of 20240815, version 2.0.1 is available, and in a few weeks a newer version will likely be available.

  • If you do not define a version, then the latest will be installed.
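Since an unpinned install picks up whatever is newest on the day you run it, stating versions explicitly is one way to keep an environment reproducible. The sketch below only formats specs in the name=version form used earlier (python=3.12.4); `conda_spec` and `install_command` are hypothetical helpers and nothing is executed.

```python
def conda_spec(name, version=None):
    """Return 'name=version' when a version is given, else just 'name'."""
    return f"{name}={version}" if version else name


def install_command(specs):
    """Join package specs into a single `conda install` command line."""
    return "conda install " + " ".join(specs)


pinned = [conda_spec("python", "3.12.4"), conda_spec("numpy", "1.26.4")]
print(install_command(pinned))
# conda install python=3.12.4 numpy=1.26.4

print(install_command([conda_spec("numpy")]))
# conda install numpy
```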


Conda Deactivate your Environment

(py_env) []$ conda deactivate
[]$

Note how the command line prompt has changed, reverting to how it looked before the environment was activated: [...]

The conda environment is no longer active and cannot be used.

[]$ python --version
Python 3.12.2
[]$ python -c "import numpy; print(numpy.__version__)"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'numpy'
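A script can test for a package before importing it and print a clearer hint than the traceback above. This is a minimal sketch using the standard library's `importlib.util.find_spec`; the helper name `module_available` is hypothetical.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module can be found by this interpreter."""
    return importlib.util.find_spec(name) is not None


for name in ("numpy", "pandas"):
    if module_available(name):
        print(f"{name}: available")
    else:
        print(f"{name}: missing (is the conda environment active?)")
```

Run outside the environment, this reports the packages as missing rather than failing with a ModuleNotFoundError.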

(Re-)Using the Environment

To re-use a conda environment, you must activate it again.

But, you do not have to re-create it. Creation only happens once.

[]$ module purge
[]$ module load miniconda3/24.3.0
[]$ conda activate py_env
(py_env) []$ python --version
Python 3.12.4
(py_env) []$ python -c "import numpy; print(numpy.__version__)"
1.26.4
(py_env) []$ conda deactivate
[]$

Note: The conda environment must be activated to use it.


Adding to an Existing Environment

Although we only need to create a conda environment once, we can add to it or update it every time we use it.

Let’s go back into our existing conda environment and add the pandas package.

[]$ module purge
[]$ module load miniconda3/24.3.0
[]$ conda activate py_env
(py_env) []$ conda install pandas
(py_env) []$ python py_test.py
Python: 3.12.4 | packaged by Anaconda, Inc. | (main, Jun 18 2024, 15:12:24) [GCC 11.2.0]
Numpy: 1.26.4
Pandas: 2.2.2
(py_env) []$ conda deactivate

Contents of py_test.py:

import sys
import numpy
import pandas

print("Python: " + str(sys.version))
print("Numpy: " + str(numpy.__version__))
print("Pandas: " + str(pandas.__version__))

Full output of the pandas install:

(py_env) []$ conda install pandas
Channels:
 - defaults
Platform: linux-64
Collecting package metadata (repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/salexan5/.conda/envs/py_env

  added / updated specs:
    - pandas

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    bottleneck-1.3.7           |  py312ha883a20_0         140 KB
    numexpr-2.8.7              |  py312hf827012_0         149 KB
    pandas-2.2.2               |  py312h526ad5a_0        15.4 MB
    python-dateutil-2.9.0post0 |  py312h06a4308_2         318 KB
    pytz-2024.1                |  py312h06a4308_0         220 KB
    ------------------------------------------------------------
                                           Total:        16.2 MB

The following NEW packages will be INSTALLED:

  bottleneck         pkgs/main/linux-64::bottleneck-1.3.7-py312ha883a20_0
  numexpr            pkgs/main/linux-64::numexpr-2.8.7-py312hf827012_0
  pandas             pkgs/main/linux-64::pandas-2.2.2-py312h526ad5a_0
  python-dateutil    pkgs/main/linux-64::python-dateutil-2.9.0post0-py312h06a4308_2
  python-tzdata      pkgs/main/noarch::python-tzdata-2023.3-pyhd3eb1b0_0
  pytz               pkgs/main/linux-64::pytz-2024.1-py312h06a4308_0
  six                pkgs/main/noarch::six-1.16.0-pyhd3eb1b0_1

Proceed ([y]/n)? y

Downloading and Extracting Packages:

Preparing transaction: done
Verifying transaction: done
Executing transaction: done

 
