
A simple example batch job with Python on Alderaan

Prepare your Python code, say hello.py, with the line
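
The line itself is not shown here. A minimal stand-in, which is an assumption for illustration and not the original content, could be:

```python
# hello.py: hypothetical stand-in; the original one-line program is not shown.
message = "Hello from Alderaan"
print(message)
```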


Then prepare your batch script, say hello.sh, in the same directory:

 #!/bin/bash
 #SBATCH --job-name=hello
 #SBATCH --partition=math-alderaan
 #SBATCH --time=1:00:00            # Max wall-clock time 1 hour
 #SBATCH --ntasks=1                # number of tasks (one core each by default)
 python3 hello.py
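
The --time value follows Slurm's time syntax, which also accepts a leading day count (days-hours:minutes:seconds). As a rough sketch for checking how long a given value really is, the helper below converts such a string to seconds. The function name slurm_time_to_seconds is made up for this illustration and is not part of Slurm.

```python
def slurm_time_to_seconds(spec: str) -> int:
    """Convert a Slurm --time value to seconds (illustrative helper, not a Slurm API)."""
    days = 0
    if "-" in spec:
        # With a day count, the rest is hours[:minutes[:seconds]].
        day_part, rest = spec.split("-", 1)
        days = int(day_part)
        fields = [int(x) for x in rest.split(":")]
        fields += [0] * (3 - len(fields))
        hours, minutes, seconds = fields
    else:
        fields = [int(x) for x in spec.split(":")]
        if len(fields) == 1:        # minutes
            hours, minutes, seconds = 0, fields[0], 0
        elif len(fields) == 2:      # minutes:seconds
            hours, minutes, seconds = 0, fields[0], fields[1]
        else:                       # hours:minutes:seconds
            hours, minutes, seconds = fields
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


print(slurm_time_to_seconds("1:00:00"))    # one hour
print(slurm_time_to_seconds("1-1:00:00"))  # one day and one hour
```

Note that a bare number such as 30 means minutes, not seconds or hours.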

Then submit it:

 sbatch hello.sh
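
On success, sbatch prints a line of the form "Submitted batch job" followed by the job number. Unless the script sets --output, Slurm writes the job's output to slurm-<jobid>.out in the directory you submitted from. The sketch below extracts the job number from that message and builds the default output file name. The helper names parse_job_id and default_output_file are assumptions made for this example.

```python
import re


def parse_job_id(sbatch_output: str) -> int:
    """Extract the job number from the message printed by sbatch."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_output)
    if match is None:
        raise ValueError(f"no job id found in: {sbatch_output!r}")
    return int(match.group(1))


def default_output_file(job_id: int) -> str:
    """Default Slurm output file name when the script does not set --output."""
    return f"slurm-{job_id}.out"


# On the cluster, the message could be captured like this (not run here):
#   import subprocess
#   result = subprocess.run(["sbatch", "hello.sh"], capture_output=True, text=True)
#   job_id = parse_job_id(result.stdout)
job_id = parse_job_id("Submitted batch job 12345\n")
print(job_id, default_output_file(job_id))
```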

Batch job with TensorFlow on GPU using Singularity

Prepare two files as follows.

Example batch script alderaan_single_gpu.sh:

 #!/bin/bash
 #SBATCH --job-name=gpu
 #SBATCH --partition=math-alderaan-gpu
 #SBATCH --gres=gpu:a100:1                 # allocate one A100 GPU
 #SBATCH --time=1-1:00:00                  # Max wall-clock time 1 day 1 hour
 #SBATCH --ntasks=1                        # number of tasks (one core each by default)

 # run TensorFlow in the Singularity container
 # redirect stdout and stderr to a file so that it can be inspected before the end of the job
 singularity exec /storage/singularity/tensorflow.sif python3 gpucode.py >& gpucode.log
 # copy the output to the job output for reference
 cat gpucode.log
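
One caveat about inspecting the log while the job runs: when standard output is redirected to a file, Python buffers it, so printed lines may not appear in gpucode.log until the buffer fills or the program ends. A minimal sketch of one way around this is to flush on every print, as below. The helper name log is hypothetical. Running the script with python3 -u is an alternative that needs no code change.

```python
import sys


def log(message, stream=sys.stdout):
    """Print a message and flush at once so it reaches a redirected log file promptly."""
    print(message, file=stream, flush=True)


log("training started")
```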

Please do not use Alderaan GPUs without first allocating them with --gres, as above.
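
A job can check for itself whether it was given a GPU. Slurm commonly exports the environment variable CUDA_VISIBLE_DEVICES to jobs that were granted GPUs through --gres. Whether Alderaan is configured this way is an assumption here, so treat the following as a sketch and not a guaranteed test. The function name allocated_gpu_ids is made up for this example.

```python
import os


def allocated_gpu_ids(environ=None):
    """Return the GPU ids listed in CUDA_VISIBLE_DEVICES, or an empty list if unset."""
    env = os.environ if environ is None else environ
    value = env.get("CUDA_VISIBLE_DEVICES", "")
    return [item.strip() for item in value.split(",") if item.strip()]


gpus = allocated_gpu_ids()
if gpus:
    print("GPUs allocated to this job:", gpus)
else:
    print("No GPU appears to be allocated; request one with --gres")
```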

Example Python code gpucode.py:

 # list the devices (CPUs and GPUs) that TensorFlow can see
 print('gpus available to tensorflow:')
 from tensorflow.python.client import device_lib
 print(device_lib.list_local_devices())
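
As an alternative sketch, assuming the container provides TensorFlow 2.1 or later (the version in tensorflow.sif is not stated here), the public tf.config API can list only the GPUs. The function name visible_gpus is hypothetical, and it returns an empty list when TensorFlow cannot be imported, so the script also runs outside the container.

```python
def visible_gpus():
    """Return the names of the GPUs TensorFlow can see, or [] if TensorFlow is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        # Not inside the container, or TensorFlow is not installed.
        return []
    return [device.name for device in tf.config.list_physical_devices("GPU")]


print("GPUs visible to TensorFlow:", visible_gpus())
```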

Submit your batch job with:

 sbatch alderaan_single_gpu.sh

You can find the files above in the repository https://github.com/ccmucdenver/templates

Interactive job with TensorFlow on GPU

Check which host you are on. You should be on the head node.

 > hostname

Start an interactive job on a GPU node. All srun options, including --gres, must come before the command to run (bash -i):

 > srun -p math-alderaan-gpu --time=02:00:00 -N 1 -n 1 --gres=gpu:a100:1 --pty bash -i

Check where you are. If a GPU was available, you should now be on a high-memory/GPU node with a GPU allocated to you.

 > hostname

Start a shell in the container:

 > singularity shell /storage/singularity/tensorflow.sif

Run Python:

 Singularity> python3
 Python 3.8.10 (default, Nov 26 2021, 20:14:08)
 [GCC 9.3.0] on linux
 Type "help", "copyright", "credits" or "license" for more information.
 >>> from tensorflow.python.client import device_lib
 >>> print(device_lib.list_local_devices())

When you are done, exit Python, then the container shell, then the interactive job, so that someone else can use the GPU node:

 >>> exit()
 Singularity> exit
 > exit

MPI examples

MPI examples are available at https://github.com/ccmucdenver/templates.git (clone it to get your own copy):

 git clone https://github.com/ccmucdenver/templates.git

MPI is available on Alderaan only.