
Running containers with Singularity

Singularity is a tool that lets you run containers without root access. For an introduction to Singularity, please read the Singularity documentation found at https://www.sylabs.io/docs/.

On Artemis, we cannot let you build Singularity containers because we cannot grant you root access. Instead, install Singularity on your own computer, build your containers there, then transfer the built images to Artemis. Run "module avail singularity" to see which versions of Singularity are available on Artemis.
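For example, on Artemis (the version shown here is illustrative; load whichever version "module avail" lists):

# List the Singularity modules installed on Artemis, then load one
module avail singularity
module load singularity/2.6.1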

Singularity is under heavy development, with frequent updates; the versions on Artemis will not always be the latest, nor will they have every feature and capability enabled. Changes in newer Singularity releases may also prevent them from working on Artemis, so there is no guarantee that any given Singularity container will be compatible with Artemis. The versions of Singularity on Artemis are provided as-is, and while we will try to assist within these constraints, some problems may not be solvable.

Singularity gives you control over your software

Singularity is a powerful tool for creating reproducible, portable software environments. However, this control comes with the responsibility to maintain your own containers. Because there is an effectively unlimited number of containerised applications, we cannot troubleshoot or support applications running inside Singularity containers. If your containerised application is not working, we recommend contacting the container's developer for assistance.

If you believe there is an issue with the installation of Singularity on Artemis, or with the underlying Artemis hardware, open a High-Performance Computing request via the Services Portal. Tickets requesting support for applications running inside a container will be silently closed.

Building Artemis-compatible Singularity containers

We cannot enable OverlayFS for Singularity on Artemis, so you will need to make sure your containers contain both /project and /scratch directories before copying them to Artemis. If you're running GPU jobs, the file /usr/bin/nvidia-smi also needs to exist inside the container (it only needs to exist as a bind point; it doesn't need to be the real nvidia-smi binary).

This means the command "singularity pull docker://some/container:latest" will not work when run on Artemis. Instead, build containers on a development computer where you have Singularity installed and root access. You can add the /project and /scratch directories and the /usr/bin/nvidia-smi file in the %post section of a Singularity build script. For example:

BootStrap: docker
From: ubuntu:16.04

%post
# Create the directories Artemis binds at run time
mkdir /project /scratch
# Empty placeholder file; only needed if you will run GPU jobs
touch /usr/bin/nvidia-smi

If you save this script to a text file, say singularity.build, then you can build your container from DockerHub with the following command:

sudo singularity build MyContainer.simg singularity.build

This will create a Singularity image called MyContainer.simg.
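Before copying the image to Artemis, you can check that the required bind points exist. A quick sketch, run on the machine where you built the image:

# All three paths should be listed without errors
singularity exec MyContainer.simg ls -ld /project /scratch /usr/bin/nvidia-smi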

When your container runs, Artemis will bind the following Artemis directories inside your container. Anything stored at these paths inside your container will be hidden when it runs on Artemis; you will only see Artemis's files at these paths:

/home
/project
/scratch
/sys
/proc
/dev
/tmp
/etc/localtime
/etc/hosts

In particular, note that /home is replaced by your Artemis home directory, so anything stored in your home directory inside your container will not be accessible from Artemis.
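A quick way to see this in practice, assuming the MyContainer.simg image built above is in your current directory on Artemis (the Singularity version is illustrative):

module load singularity/2.6.1
# These commands list Artemis's files, not whatever was stored at these paths inside the container
singularity exec MyContainer.simg ls /project /scratch
singularity exec MyContainer.simg ls $HOME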

An example PBS script to run Singularity containers on Artemis is:

#!/bin/bash
#PBS -P PROJECT
#PBS -l select=1:ncpus=4:mem=4gb
#PBS -l walltime=1:00:00
 
module load singularity/2.6.1

cd "$PBS_O_WORKDIR"
singularity exec MyContainer.simg python3 myscript.py

As your environment is installed in the container, other libraries installed on Artemis won't be visible inside the container. The exceptions are MPI and GPU libraries, which are covered below.

GPU containers

GPU containers can be run on Artemis. You can bind Artemis's GPU libraries inside your Singularity container by using the --nv flag. This will ensure your container can use Artemis’s GPUs.

#!/bin/bash
#PBS -P PROJECT
#PBS -l select=1:ncpus=1:ngpus=1:mem=4gb
#PBS -l walltime=1:00:00

module load singularity/2.6.1

cd "$PBS_O_WORKDIR"
singularity exec --nv MyContainer.simg python3 myscript.py

MPI containers

If you have MPI programs and you want them to run on multiple nodes, then you need to bind Artemis's host MPI and network libraries inside your container, then tell your MPI program to use these libraries by editing the container's LD_LIBRARY_PATH environment variable. We have had moderate success using OpenMPI 3.0.1 on Artemis inside a CentOS 6.9 container, but we have not tested other MPI versions or container operating systems.

The solution below is not optimal, but it may serve as a good starting point for further optimisation. An example MPI job script, using openmpi-gcc/3.0.1 on Artemis with a container built from CentOS 6.9, is below:

#!/bin/bash
#PBS -P PROJECT
#PBS -l select=2:ncpus=12:mpiprocs=12:mem=16gb
#PBS -l walltime=1:00:00


module load singularity/2.5.2
module load openmpi-gcc/3.0.1
# Make Artemis's OpenMPI and GCC visible to programs running inside the container
export SINGULARITYENV_PREPEND_PATH=/usr/local/openmpi-gcc/3.0.1/bin
export SINGULARITYENV_LD_LIBRARY_PATH=/usr/local/openmpi-gcc/3.0.1/lib:/usr/local/gcc/4.9.3/lib

cd "$PBS_O_WORKDIR"
# Bind Artemis's OpenMPI, GCC and system library directories into the container at run time
mpirun -np 24 singularity exec -B "/usr/local/openmpi-gcc/3.0.1,/usr/local/gcc/4.9.3,/lib64,/usr/lib64" MyContainer.simg myMPIexecutable &> logfile.log

If you only want to run your MPI job on a single node, then you don't need to call mpirun before the singularity command; instead, run mpirun from inside the container. This uses only the MPI installed inside the container (not Artemis's MPI at all), and still gives performance equivalent to running "bare metal" on Artemis.
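For example, a single-node version of the job above might look like the following sketch; it assumes your container has its own mpirun on its PATH, that myMPIexecutable exists inside the container, and the Singularity version shown is illustrative:

#!/bin/bash
#PBS -P PROJECT
#PBS -l select=1:ncpus=12:mpiprocs=12:mem=16gb
#PBS -l walltime=1:00:00

module load singularity/2.6.1

cd "$PBS_O_WORKDIR"
# mpirun here is the container's own MPI; no host MPI module or library binds are needed
singularity exec MyContainer.simg mpirun -np 12 myMPIexecutable &> logfile.log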

Example to run a Dockerfile on Artemis with Singularity

Start with a Dockerfile, like this one, which builds an image containing the Trinity platform.


To run this on your computer where you have root access, you probably only need to:

#In the directory where you have saved the Dockerfile
docker build -t trinity .

#Then this particular Dockerfile can be interfaced/run like so
sudo docker run --rm -v`pwd`:`pwd` trinity Trinity --seqType fq --single `pwd`/reads.fq.gz --max_memory 1G --CPU 4 --output `pwd`/trinity_out_dir


But if we want to run it with Singularity we need to do a few more steps.

Firstly, we must convert the Dockerfile to a Singularity recipe. An easy way is to use the Python package Spython: https://vsoch.github.io/singularity-cli/recipes
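If Spython isn't already installed on the computer where you build your containers, it can be installed from PyPI, for example:

pip install --user spython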

Then, as per Spython's instructions, you can convert the Dockerfile like so:

spython recipe Dockerfile >> Singularity.trinity


Next, edit the newly created Singularity recipe file (Singularity.trinity) and add the Artemis-specific changes, i.e.

Bootstrap: docker
From: ubuntu:16.04
%files
Dockerfile $SRC/Dockerfile.$TRINITY_VERSION
%labels
MAINTAINER bhaas@broadinstitute.org
%post
mkdir /project /scratch
# ... the rest of the %post section generated by spython follows ...


Next, use Singularity to build the image (this took around 30 minutes on a local laptop).
The image will be around 2GB, as it contains Trinity, all of its dependencies, and a whole operating system. This is fairly standard for Singularity images.

sudo singularity build Singularity.trinity.build Singularity.trinity


Now move the built image to Artemis (with scp, FileZilla, etc).
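For example, with scp; this is only a sketch, and the Unikey, login host and destination path are placeholders that you need to replace with your own details:

# Copy the image into your Artemis project directory (all names here are placeholders)
scp Singularity.trinity.build <unikey>@<artemis-login-host>:/project/PROJECT/

Then run it on Artemis with a PBS script: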

#!/bin/bash
#PBS -P PROJECT
#PBS -l select=1:ncpus=4:mem=4gb
#PBS -l walltime=0:10:00
 
module load singularity/2.6.1

#Change into the directory you executed qsub from
cd "$PBS_O_WORKDIR"

#Run the singularity image, which contains the program Trinity
#The dataset "reads.fq.gz" should be in this directory, 
#This will create persistent output in a folder called "trinity_out_dir"
singularity exec Singularity.trinity.build Trinity --seqType fq --single `pwd`/reads.fq.gz --max_memory 4G --CPU $NCPUS --output `pwd`/trinity_out_dir