Goal: Demonstrate creating a module file to load a Conda environment.
Remember what a module is
We use the Lmod module system to set up our environment by loading modules for compilers, languages, libraries and applications…
Although modules are typically set up by ARCC, you can create your own module files that expose and load software and environments you have installed yourself (within your home or project directories).
Module File Basic Template
An Introduction to Writing Modulefiles: Module files are written in Lua and define the core environment elements (such as entries on PATH) required to set up and use a piece of software.
Example:
[salexan5@mblog2 ~]$ cd /project/arcc/software
[salexan5@mblog2 software]$ mkdir -p modules/tensorflow
[salexan5@mblog2 software]$ cd modules/tensorflow
[salexan5@mblog2 tensorflow]$ vim 2.16.lua
-- 2.16.lua
whatis("Name: TensorFlow")
whatis("Version: 2.16")
whatis("Short Description: An end-to-end platform for machine learning.")
-- Put the Conda environment's bin directory at the front of PATH.
prepend_path("PATH", "/project/arcc/software/tensorflow/2.16/bin/")
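The same steps can be scripted rather than typed into an editor. The following is a minimal sketch, not ARCC's own tooling: it writes an equivalent module file with a heredoc into a scratch directory created by mktemp, so it can be tried without touching a real project. The variable names MODROOT, APP, VERSION, PREFIX and MODFILE are illustrative assumptions; on the cluster MODROOT would point at a real location such as /project/arcc/software/modules.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical names for illustration; adjust them for your own project.
MODROOT="$(mktemp -d)/modules"                      # stand-in for /project/arcc/software/modules
APP="tensorflow"
VERSION="2.16"
PREFIX="/project/arcc/software/${APP}/${VERSION}"   # where the Conda environment was installed

# Layout used in the example above: <modules root>/<name>/<version>.lua
mkdir -p "${MODROOT}/${APP}"
MODFILE="${MODROOT}/${APP}/${VERSION}.lua"

# Unquoted EOF so that ${VERSION} and ${PREFIX} are expanded by the shell.
cat > "${MODFILE}" <<EOF
-- ${VERSION}.lua
whatis("Name: TensorFlow")
whatis("Version: ${VERSION}")
whatis("Short Description: An end-to-end platform for machine learning.")
prepend_path("PATH", "${PREFIX}/bin/")
EOF

echo "Wrote ${MODFILE}"
cat "${MODFILE}"
```

Keeping the version in the file name means a later release only needs a second file (for example a hypothetical 2.17.lua) next to the first.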
Use Your Module Files
To expose your module files, tell the module system where to look using the module use <path> command.
[salexan5@mblog1 ~]$ module avail
-------------- /apps/s/lmod/mf/opt/linux-rhel9-x86_64/containers ---------------
stress-ng/0.17.08
...

# Expose your module files.
[salexan5@mblog1 ~]$ module use /project/arcc/software/modules/

[salexan5@mblog1 ~]$ module avail
------------------------ /project/arcc/software/modules ------------------------
tensorflow/2.16
-------------- /apps/s/lmod/mf/opt/linux-rhel9-x86_64/containers ---------------
stress-ng/0.17.08
...

[salexan5@mblog2 ~]$ module spider tensorflow
----------------------------------------------------------------------------
  tensorflow: tensorflow/2.16
----------------------------------------------------------------------------
    This module can be loaded directly: module load tensorflow/2.16
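From the shell's point of view, module use <path> chiefly prepends the directory to the MODULEPATH environment variable, which lists the directories the module system searches. The sketch below imitates that with plain shell so the effect is visible without Lmod installed. The helper name prepend_modulepath is hypothetical, skipping duplicates is a choice made here for the example, and the real command may do more than this.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: prepend a directory to MODULEPATH unless it is already listed.
prepend_modulepath() {
    local dir="${1%/}"   # drop one trailing slash, if present
    case ":${MODULEPATH:-}:" in
        *":${dir}:"*) ;;                                        # already present, nothing to do
        *) MODULEPATH="${dir}${MODULEPATH:+:${MODULEPATH}}" ;;  # put the new directory first
    esac
    export MODULEPATH
}

# Start from a system-only search path (taken from the module avail output above).
MODULEPATH="/apps/s/lmod/mf/opt/linux-rhel9-x86_64/containers"

prepend_modulepath /project/arcc/software/modules/
prepend_modulepath /project/arcc/software/modules/   # second call changes nothing

echo "${MODULEPATH}"
```

The project directory ends up first in the search path, which matches the order in which the two sections appear in the module avail listing.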
Using Your Module
Loading a local module has the same effect as loading a system module: your environment is updated by setting the appropriate environment variables.
[salexan5@mblog1 ~]$ salloc -A arcc -t 10:00
salloc: Granted job allocation 845851
salloc: Nodes mbcpu-001 are ready for job
[salexan5@mbcpu-001 ~]$ module use /project/arcc/software/modules/
[salexan5@mbcpu-001 ~]$ module load tensorflow/2.16
[salexan5@mbcpu-001 ~]$ python --version
Python 3.11.9
[salexan5@mblog2 tensorflow]$ which python
/project/arcc/software/tensorflow/2.16/bin/python
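The reason which python reports the interpreter inside the Conda environment is that the module file's prepend_path put that bin directory at the front of PATH, and the shell uses the first match it finds. A small sketch of that lookup rule follows. It uses a scratch directory and a stand-in python file (both hypothetical), so nothing on the cluster is touched.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Scratch stand-in for /project/arcc/software/tensorflow/2.16/bin
ENV_BIN="$(mktemp -d)/bin"
mkdir -p "${ENV_BIN}"

# Placeholder executable named python; it only needs to exist and be executable.
cat > "${ENV_BIN}/python" <<'EOF'
#!/bin/sh
exit 0
EOF
chmod +x "${ENV_BIN}/python"

# What prepend_path("PATH", ...) amounts to: the new directory goes first.
export PATH="${ENV_BIN}:${PATH}"

hash -r                               # forget any cached command locations
RESOLVED="$(command -v python)"       # first python found on PATH
echo "${RESOLVED}"
```

Because the lookup stops at the first match, any other python later on PATH is shadowed for as long as the module is loaded.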
Notice the sysconfig
After loading our module and exposing the Conda environment, notice how the sysconfig is based around this environment.
[salexan5@mblog2 tensorflow]$ python -m sysconfig
Platform: "linux-x86_64"
Python version: "3.11"
Current installation scheme: "posix_prefix"

Paths:
        data = "/project/arcc/software/tensorflow/2.16"
        ...
        stdlib = "/project/arcc/software/tensorflow/2.16/lib/python3.11"

Variables:
        ...
        userbase = "/cluster/medbow/project/arcc/software/tensorflow/2.16"
Notice that userbase points to where our Conda environment was installed.
Running our TF Code
All packages installed with conda and pip within this Conda environment are available.
[salexan5@mbcpu-001 ~]$ python -c "import tensorflow as tf; print(\"TensorFlow Version: \" + str(tf.__version__))"
...
TensorFlow Version: 2.16.1
[salexan5@mbcpu-001 ~]$ exit
exit
salloc: Relinquishing job allocation 845851