Data Transfer Queue (dtq)

Artemis has a queue, called dtq, whose nodes have access to the Research Data Store (mounted under /rds) and the internet. This queue is dedicated to data-moving jobs and other input/output (I/O) intensive work, such as compressing and archiving files, copying data from Artemis to the Research Data Store, or copying data to and from remote locations. These nodes have dedicated Ethernet and InfiniBand bandwidth to maximise file transfer speeds, so they deliver significantly better data transfer and I/O performance than the login nodes. No other compute nodes on Artemis have access to /rds or the internet. The login nodes can still be used for small I/O tasks, but heavy copying or I/O work should be submitted to dtq instead.

An example dtq job, which moves the mydata directory from Artemis to RDS, using an example project called PANDORA, is shown below:

#!/bin/bash
#PBS -P PANDORA
#PBS -q dtq
#PBS -l select=1:ncpus=1:mem=4gb
#PBS -l walltime=1:00:00

# copy the mydata directory from Artemis project storage to RDS,
# preserving modification times and showing transfer progress
rsync -rtvxP /project/PANDORA/mydata /rds/PRJ-PANDORA/

To run this script, save it to a file (for example my-copy-job.pbs) and submit it to the scheduler:

qsub my-copy-job.pbs

The data transfer queue is designed for running file manipulation commands. For example, you could use wget (for downloading data from the internet), scp, tar, cp, mv, rm, and more. This queue, however, is not intended for compute jobs. Compute jobs running in this queue will be terminated without notice. Since no real compute jobs are allowed to run in this queue, jobs run in dtq will not contribute to your project's fair share. However, your project's current fair share value will affect the priority of jobs submitted to this queue.
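For instance, tar is commonly used in a dtq job to compress a directory before archiving it to RDS. A minimal local sketch of the tar commands themselves, using throwaway temporary paths rather than real project directories:

```shell
# create a small example directory to stand in for job output
workdir=$(mktemp -d)
mkdir -p "$workdir/results"
echo "output data" > "$workdir/results/run1.txt"

# compress the directory into a gzipped tarball
# (-C changes into the parent directory so archive paths are relative)
tar -czf "$workdir/results.tar.gz" -C "$workdir" results

# list the archive contents to confirm what was stored
tar -tzf "$workdir/results.tar.gz"
```

In a real dtq job these commands would sit below the #PBS directives, as in the rsync example above, and operate on directories under /project or /rds.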

Resource limits for dtq can be found in the queue resource limits table.