
Transitioning from Artemis to NCI Gadi

Comprehensive training guide

See the training guide, which includes Sydney-specific information on using NCI systems and services:

https://sydney-informatics-hub.github.io/usyd-gadi-onboarding-guide/

Job Submission

NCI Gadi uses the same job scheduler as Artemis, PBS Pro, but a more recent version (PBSpro 2024.1.1 on Gadi versus PBSPro_13.1.0 on Artemis). Configuration and day-to-day usage are similar, with small differences in how resources are requested.

Gadi

#!/bin/bash
#PBS -P PANDORA
#PBS -l ncpus=1
#PBS -l mem=4GB
#PBS -l walltime=10:00:00

module load program
cd "$PBS_O_WORKDIR"
my_program

Artemis

#!/bin/bash
#PBS -P PANDORA
#PBS -l select=1:ncpus=1:mem=4GB
#PBS -l walltime=10:00:00

module load program
cd "$PBS_O_WORKDIR"
my_program
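The difference between the two resource requests can be expressed mechanically. The following is a minimal sketch, not an official migration tool: a hypothetical helper, convert_select, rewrites an Artemis-style single-chunk request (select=1:ncpus=N:mem=M) as the separate ncpus and mem directives used in the Gadi example. It assumes exactly that directive layout with select=1 and passes every other line through unchanged.

```shell
#!/bin/bash
# convert_select is a hypothetical helper name, not part of PBS or NCI tooling.
# It reads a job script on stdin and writes the converted script to stdout.
convert_select() {
    awk '
        # Match only the single-chunk form: select=1:ncpus=N:mem=M
        /^#PBS -l select=1:ncpus=[0-9]+:mem=[0-9]+[A-Za-z]+[ \t]*$/ {
            sub(/^#PBS -l select=1:/, "")   # leaves "ncpus=N:mem=M"
            sub(/[ \t]+$/, "")              # drop trailing whitespace
            n = split($0, parts, ":")
            for (i = 1; i <= n; i++) {
                print "#PBS -l " parts[i]   # one directive per resource
            }
            next
        }
        { print }                           # all other lines unchanged
    '
}

# Example (hypothetical file names):
# convert_select < artemis_job.pbs > gadi_job.pbs
```

Multi-chunk requests (select greater than 1) are deliberately left untouched, because they need a manual decision about the total resources to request.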

 

Storage

On Gadi, /scratch is essentially “unlimited” but has an aggressive deletion policy for unused data. You can request a larger /scratch quota by contacting help@nci.org.au. For more persistent storage, use the /g/data/<project> directory. Quota increases for it are handled by the Scheme Manager.

NCI Gadi

/scratch/<NCIproject>
/g/data/<NCIproject>

Artemis

/scratch/<RDSproject>
/project/<RDSproject>

Connect to Sydney Research Data Storage (RDS)

NCI Gadi

sftp <unikey>@research-data-ext.sydney.edu.au:/rds/PRJ-<project>

Artemis

/rds/PRJ-<project>

Walltime

All queues on Gadi have a maximum walltime of 48 hours, in contrast to 21 days on Artemis. The shorter limit makes it easier to share resources and reduces wasted compute time when a node fails or a job does not behave as expected. Tips for running jobs in a short-walltime environment:

  • Enable checkpointing in your software.

  • Break long running jobs into shorter chunks of work.

  • Make use of dependent compute jobs (-W depend=afterok:jobid).
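The dependency option in the last tip can be sketched as follows. This is a minimal illustration under stated assumptions, not a site-provided tool: submit_chain is a hypothetical helper, and the script names in the usage comment are placeholders. It relies only on qsub printing the new job ID and on the -W depend=afterok:jobid option mentioned above.

```shell
#!/bin/bash
# submit_chain is a hypothetical helper name.
# It submits each PBS script in order; every job after the first starts
# only if the previous job finished successfully (afterok).
submit_chain() {
    local prev_id="" job_id script
    for script in "$@"; do
        if [ -z "$prev_id" ]; then
            # First job in the chain has no dependency.
            job_id=$(qsub "$script") || return 1
        else
            job_id=$(qsub -W depend=afterok:"$prev_id" "$script") || return 1
        fi
        echo "Submitted $script as $job_id"
        prev_id="$job_id"
    done
}

# Example (hypothetical script names, each fitting within the walltime limit):
# submit_chain step1.pbs step2.pbs step3.pbs
```

Combined with checkpointing, each script in the chain can resume from the output of the previous one, so a long calculation runs as a sequence of shorter jobs.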

 

Internet Access

Compute nodes on Gadi do not have access to the internet. For tasks that need it, use one of the following:

  • Jobs submitted to the copyq queue.

  • ARE jobs.

 

Job Arrays

PBS job arrays (#PBS -J 1-10) are not permitted on Gadi, so other means of parallel task execution are required. One approach, using OpenMPI and the custom utility ‘nci-parallel’, is demonstrated in this example parallel job.
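A common first step when replacing a job array is to turn the array indices into an explicit task list with one command per line. The sketch below does only that. The helper name make_task_list, the file name tasks.txt, and the command my_program --index N are hypothetical placeholders for whatever each array element would have run.

```shell
#!/bin/bash
# make_task_list is a hypothetical helper name.
# It writes one command per line for indices first..last, which is the
# equivalent of the sub-jobs that "#PBS -J first-last" would have created.
make_task_list() {
    local first="$1" last="$2" outfile="$3" i
    : > "$outfile"                      # create or truncate the task file
    i="$first"
    while [ "$i" -le "$last" ]; do
        # "my_program --index" is a placeholder for the real per-task command.
        echo "my_program --index $i" >> "$outfile"
        i=$((i + 1))
    done
}

# Equivalent of the 1-10 array from the text above.
make_task_list 1 10 tasks.txt
```

The resulting file is the kind of input a task-farming utility such as nci-parallel is designed to consume inside a single multi-CPU job. How the file is passed to the utility, and the matching mpirun options, are not shown here and should be taken from NCI's documentation.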