Goal: Demonstrate creating a module file to load a Conda environment.


Remember what a module is

We use the Lmod module system to set up our environment by loading modules for compilers, languages, libraries and applications…

Although modules are typically set up by ARCC, you can create your own module files that expose and load software and environments you have installed yourself (within your home or project directory).


Module File Basic Template

Reference: "An Introduction to Writing Modulefiles". Module files are written in Lua and define the core environment elements required to set up and use a piece of software.

Example:

[salexan5@mblog2 ~]$ cd /project/arcc/software
[salexan5@mblog2 software]$ mkdir -p modules/tensorflow
[salexan5@mblog2 software]$ cd modules/tensorflow
[salexan5@mblog2 tensorflow]$ vim 2.16.lua
-- 2.16.lua
whatis("Name: TensorFlow")
whatis("Version: 2.16")
whatis(" Short Description: An end-to-end platform for machine learning.")
prepend_path("PATH","/project/arcc/software/tensorflow/2.16/bin/")
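
The manual steps above can also be scripted. The following is a minimal sketch, not ARCC's own tooling: it writes the same module file from a shell heredoc. The variables MODULE_ROOT and ENV_PREFIX are hypothetical names introduced for illustration, and MODULE_ROOT defaults to a temporary directory so the sketch can be tried without touching a project space.

```shell
#!/bin/sh
# Sketch: generate an Lmod module file for a Conda environment.
# MODULE_ROOT and ENV_PREFIX are hypothetical settings; override them to
# match where your module files and Conda environment actually live.
MODULE_ROOT="${MODULE_ROOT:-$(mktemp -d)}"   # e.g. /project/arcc/software/modules
ENV_PREFIX="${ENV_PREFIX:-/project/arcc/software/tensorflow/2.16}"
NAME="tensorflow"
VERSION="2.16"

mkdir -p "${MODULE_ROOT}/${NAME}"
MODULE_FILE="${MODULE_ROOT}/${NAME}/${VERSION}.lua"

# The heredoc body is Lua; the shell expands ${...} before writing the file.
cat > "${MODULE_FILE}" <<EOF
-- ${VERSION}.lua
whatis("Name: TensorFlow")
whatis("Version: ${VERSION}")
whatis("Short Description: An end-to-end platform for machine learning.")
prepend_path("PATH", "${ENV_PREFIX}/bin")
EOF

echo "Wrote ${MODULE_FILE}"
```

Setting MODULE_ROOT=/project/arcc/software/modules before running it would place the file where the example above created it by hand.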

Use Your Module Files

To expose your module files, tell the module system where to look with the module use <path> command.

[salexan5@mblog1 ~]$ module avail
-------------- /apps/s/lmod/mf/opt/linux-rhel9-x86_64/containers ---------------
   stress-ng/0.17.08
...
# Expose your module files.
[salexan5@mblog1 ~]$ module use /project/arcc/software/modules/
[salexan5@mblog1 ~]$ module avail
------------------------ /project/arcc/software/modules ------------------------
   tensorflow/2.16
-------------- /apps/s/lmod/mf/opt/linux-rhel9-x86_64/containers ---------------
   stress-ng/0.17.08
...
[salexan5@mblog2 ~]$ module spider tensorflow
----------------------------------------------------------------------------
  tensorflow: tensorflow/2.16
----------------------------------------------------------------------------
    This module can be loaded directly: module load tensorflow/2.16
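
Lmod searches the directories listed in the MODULEPATH environment variable, and module use adds a directory to that list. As a small sketch, the helper below checks whether a directory is already listed; modulepath_has is a hypothetical name introduced here, not an Lmod command.

```shell
# Return 0 if directory $1 is listed in MODULEPATH, 1 otherwise.
# modulepath_has is a hypothetical helper, not part of Lmod.
modulepath_has() {
    dir="${1%/}"                     # ignore a single trailing slash
    case ":${MODULEPATH:-}:" in
        *":${dir}:"*) return 0 ;;
        *)            return 1 ;;
    esac
}

if modulepath_has /project/arcc/software/modules/; then
    echo "module directory is already on MODULEPATH"
else
    echo "not on MODULEPATH yet; run: module use /project/arcc/software/modules/"
fi
```

A check like this can be handy at the top of a job script, since module use has to be repeated in each new shell session unless it is added to a startup file.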

Using Your Module

Loading a local module has the same effect as loading a system module.

Your environment will be updated by setting appropriate environment variables.

[salexan5@mblog1 ~]$ salloc -A arcc -t 10:00
salloc: Granted job allocation 845851
salloc: Nodes mbcpu-001 are ready for job
[salexan5@mbcpu-001 ~]$ module use /project/arcc/software/modules/
[salexan5@mbcpu-001 ~]$ module load tensorflow/2.16
[salexan5@mbcpu-001 ~]$ python --version
Python 3.11.9

[salexan5@mblog2 tensorflow]$ which python
/project/arcc/software/tensorflow/2.16/bin/python
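
The which python check can be turned into a guard for scripts. This is a sketch only, and python_under_prefix is a hypothetical helper name; it succeeds when the first python on PATH lives under the given environment prefix, which is what the module's prepend_path line is meant to achieve.

```shell
# Succeed if the first "python" on PATH lives under the prefix given in $1.
# python_under_prefix is a hypothetical helper introduced for illustration.
python_under_prefix() {
    prefix="${1%/}"
    py="$(command -v python 2>/dev/null)" || return 1   # no python on PATH
    case "${py}" in
        "${prefix}"/*) return 0 ;;
        *)             return 1 ;;
    esac
}

if python_under_prefix /project/arcc/software/tensorflow/2.16; then
    echo "python comes from the module's environment"
else
    echo "python is not from the expected environment; is the module loaded?"
fi
```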

Notice the sysconfig

After loading our module, which exposes the Conda environment, notice how the sysconfig output is based on this environment.

[salexan5@mblog2 tensorflow]$ python -m sysconfig
Platform: "linux-x86_64"
Python version: "3.11"
Current installation scheme: "posix_prefix"
Paths:
        data = "/project/arcc/software/tensorflow/2.16"
        ...
        stdlib = "/project/arcc/software/tensorflow/2.16/lib/python3.11"
Variables:
        ...
        userbase = "/cluster/medbow/project/arcc/software/tensorflow/2.16"

Notice that userbase points to where our Conda environment was installed.
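
To look at only these two values rather than the full listing, a short wrapper around the standard sysconfig and site modules can be used. This is a sketch; show_python_paths is a hypothetical helper name, and its optional argument selects the interpreter (default: python).

```shell
# Print the stdlib path and user base of the given interpreter.
# show_python_paths is a hypothetical helper; $1 defaults to "python".
show_python_paths() {
    "${1:-python}" -c 'import site, sysconfig
print("stdlib   =", sysconfig.get_path("stdlib"))
print("userbase =", site.getuserbase())'
}

# Only run the check when a python interpreter is actually on PATH.
if command -v python >/dev/null 2>&1; then
    show_python_paths
fi
```

Running it before and after module load tensorflow/2.16 gives a quick way to compare the two interpreters against the listing above.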


Running our TF Code

All packages installed with conda or pip into this Conda environment are available.

[salexan5@mbcpu-001 ~]$ python -c "import tensorflow as tf; print(\"TensorFlow Version: \" + str( tf.__version__))"
...
TensorFlow Version: 2.16.1
[salexan5@mbcpu-001 ~]$ exit
exit
salloc: Relinquishing job allocation 845851
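
Before starting a longer run, it can be useful to confirm that a package is visible to the interpreter without paying the cost of a full import. The sketch below uses importlib.util.find_spec from the Python standard library; has_py_package is a hypothetical helper name, and its optional second argument selects the interpreter (default: python).

```shell
# Succeed if the interpreter can locate the package named in $1.
# has_py_package is a hypothetical helper; $2 defaults to "python".
has_py_package() {
    "${2:-python}" -c 'import importlib.util, sys
sys.exit(0 if importlib.util.find_spec(sys.argv[1]) else 1)' "$1" 2>/dev/null
}

if has_py_package tensorflow; then
    echo "tensorflow is available in this environment"
else
    echo "tensorflow not found; check that the module is loaded"
fi
```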
