Using TensorFlow on Beartooth
Although ARCC will endeavor to keep this page up to date, TensorFlow is under continuous development and we might be playing catch-up. If you think this page is out of date, please notify ARCC via our portal, and please refer to the TensorFlow Install page to check for the latest approach.
Overview
TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries and community resources that lets researchers push the state-of-the-art in ML and developers easily build and deploy ML powered applications.
General Install process for Beartooth:
The basic process for setting up Conda, installing TensorFlow and running a script is:
Step through creating a basic Conda environment - CPU vs GPU.
Provide a template for a bash script to submit jobs using sbatch.
Provide a very simple script that tests TensorFlow has been imported and can identify the allocated GPU.
Note:
This is a short page and assumes some familiarity with Conda.
The installation of TensorFlow within the conda environment will also install related dependencies, but nothing else. Since you’re creating the conda environment, you can extend it and install other packages. You can view the installed conda packages by running conda list while in an active environment.
The bash script only uses a single node and a single core. It is up to the user to explore other configurations.
In the scripts and examples below, please remember to edit them to use your own account, email address, folder locations, etc.
Setting Up Conda Environment
The process below is an example. Please be aware that you’ll need to replace <project-name> with your project, and that <username> represents your username.
Bash Script to use with sbatch (with GPU)
Below is a basic template. You’ll need to insert your account, email and the path to the conda environment you created:
Simple Source Code Example
Below is some very simple source code that will test that your environment and GPU request are functioning properly.
It simply imports the tensorflow package, and then uses it to check that it can identify the allocated GPU(s). To work with the bash script above, save this file as tf_test.py
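A minimal sketch of such a test script, assuming TensorFlow 2.x, might look like the following. The function name check_tensorflow is hypothetical and only used for illustration. The check for a missing install is an addition so that the script reports the problem instead of failing with a traceback.

```python
import importlib.util


def check_tensorflow():
    """Return (version, gpu_names); version is None if TensorFlow is not installed."""
    # Look for the package first so a missing install gives a clear message.
    if importlib.util.find_spec("tensorflow") is None:
        return None, []
    import tensorflow as tf

    # Ask TensorFlow which physical GPUs are visible to this process.
    gpus = tf.config.list_physical_devices("GPU")
    return tf.__version__, [gpu.name for gpu in gpus]


if __name__ == "__main__":
    version, gpu_names = check_tensorflow()
    if version is None:
        print("TensorFlow is not installed in this environment.")
    else:
        print("TensorFlow version:", version)
        print("Num GPUs available:", len(gpu_names))
        for name in gpu_names:
            print("  ", name)
```

If the job was allocated a GPU and the environment is working, the reported number of GPUs should match the number requested. A count of zero on a GPU node usually points to an environment or version problem and not a scheduling one.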
Example Output
Issues
As mentioned, TensorFlow is under constant development and errors/bugs will creep in.
For example, when installing version 2.16.1, GPUs were not being detected (see the TensorFlow issue "TF 2.16.1 Fails to work with GPUs #63362"). This is why the example above explicitly installs version 2.15.1.
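One way to guard against this in your own jobs is to check the installed version before starting any real work. The sketch below is illustrative only: the names KNOWN_BAD_VERSIONS, installed_tensorflow_version and is_known_bad are hypothetical, and it assumes the package is registered under the distribution name tensorflow, which may differ depending on how it was installed.

```python
from importlib import metadata

# Versions reported as failing to detect GPUs (from issue #63362 mentioned above).
KNOWN_BAD_VERSIONS = {"2.16.1"}


def installed_tensorflow_version():
    """Return the installed TensorFlow version string, or None if it is not found."""
    try:
        # Assumption: the distribution is registered under the name "tensorflow".
        return metadata.version("tensorflow")
    except metadata.PackageNotFoundError:
        return None


def is_known_bad(version):
    """Return True if the version is in the list of known problematic releases."""
    return version in KNOWN_BAD_VERSIONS


if __name__ == "__main__":
    version = installed_tensorflow_version()
    if version is None:
        print("TensorFlow does not appear to be installed in this environment.")
    elif is_known_bad(version):
        print("Warning: TensorFlow", version, "has a reported GPU detection issue.")
    else:
        print("TensorFlow", version, "is not on the known-problem list.")
```

The list of versions is something you would maintain yourself. A version missing from it has not necessarily been verified to work.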