Comprehensive training guide

See the training guide, which includes Sydney-specific information on using NCI systems and services:

https://sydney-informatics-hub.github.io/usyd-gadi-onboarding-guide/

Job Submission

NCI Gadi uses the same job scheduler as Artemis, but a more recent version (PBSpro 2024.1.1 versus PBSPro_13.1.0). Configuration and user options are broadly similar, with some small differences in how resources are requested.

Gadi

Code Block
#!/bin/bash
#PBS -P PANDORA
#PBS -l ncpus=1
#PBS -l mem=4GB
#PBS -l walltime=10:00:00

# Load the software, then run from the directory the job was submitted from.
module load program
cd "$PBS_O_WORKDIR"
my_program

Artemis

Code Block
#!/bin/bash
#PBS -P PANDORA
#PBS -l select=1:ncpus=1:mem=4GB
#PBS -l walltime=10:00:00

# Load the software, then run from the directory the job was submitted from.
module load program
cd "$PBS_O_WORKDIR"
my_program
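
The difference between the two resource requests can be illustrated with a small conversion helper. This is a minimal sketch, not an official tool: the function name artemis_to_gadi is hypothetical, and it assumes a single-chunk request (select=1) written in the same ncpus-then-mem order as the Artemis example above. Every other line is passed through unchanged.

```shell
#!/bin/bash
# Sketch only: rewrite a single-chunk Artemis resource request
#   #PBS -l select=1:ncpus=N:mem=X
# as the two separate requests used in the Gadi example
#   #PBS -l ncpus=N
#   #PBS -l mem=X
# Assumption: select=1, with ncpus before mem. Other lines are untouched.
artemis_to_gadi() {
    awk '
    /^#PBS -l select=1:ncpus=[0-9]+:mem=[0-9]+[A-Za-z]+[ \t]*$/ {
        # $3 is "select=1:ncpus=N:mem=X"; split it on ":".
        split($3, parts, ":")
        print "#PBS -l " parts[2]
        print "#PBS -l " parts[3]
        next
    }
    { print }
    ' "$@"
}
```

For example, `artemis_to_gadi artemis_job.pbs > gadi_job.pbs` (hypothetical file names) would write a converted copy. Review the result before submitting it, since multi-chunk requests are not handled.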

Storage

/scratch on Gadi is essentially “unlimited”, but unused data is deleted under an aggressive purge policy. To increase your /scratch quota, contact help@nci.org.au. For more persistent storage, use the /g/data/<project> directory, where quota increases are handled by the Scheme Manager.

NCI Gadi

Code Block
/scratch/<NCIproject>
/g/data/<NCIproject>

Artemis

Code Block
/scratch/<RDSproject>
/project/<RDSproject>

Connect to Sydney Research Data Storage (RDS)

NCI Gadi

Code Block
sftp <unikey>@research-data-ext.sydney.edu.au:/rds/PRJ-<project>

Artemis

Code Block
/rds/PRJ-<project>

Walltime

All queues on Gadi have a maximum walltime of 48 hours, in contrast to 21 days on Artemis. This is primarily to make resource sharing easier and to limit wasted compute time (for example, if a node fails or a job does not behave as expected). Tips for running jobs in a short-walltime environment:

  • Enable checkpointing in your software.

  • Break long running jobs into shorter chunks of work.

  • Make use of dependent compute jobs (-W depend=afterok:jobid).
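
The last two tips can be combined by splitting the work into several job scripts and submitting them as a chain. The following is a minimal sketch under stated assumptions: the function name submit_chain and the script names are hypothetical, and it relies on qsub printing the new job's identifier to standard output. Each job after the first is held until the previous one exits successfully (afterok).

```shell
#!/bin/bash
# Sketch only: submit job scripts as a chain, each one starting after
# the previous job has finished with exit status 0.
# Assumption: the job scripts are given as arguments, in run order.
submit_chain() {
    local prev_id="" job_id script
    for script in "$@"; do
        if [ -z "$prev_id" ]; then
            # The first job has nothing to wait for.
            job_id=$(qsub "$script") || return 1
        else
            # Later jobs wait for the previous job to succeed.
            job_id=$(qsub -W depend=afterok:"$prev_id" "$script") || return 1
        fi
        echo "Submitted $script as $job_id"
        prev_id="$job_id"
    done
}

# Example (hypothetical script names):
# submit_chain chunk_01.pbs chunk_02.pbs chunk_03.pbs
```

If one job in the chain fails, the jobs that depend on it will not run, so each chunk should write its results (or a checkpoint) to disk for the next one to pick up.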

Internet Access

Compute nodes on Gadi do not have access to the internet.

For tasks that need internet access:

  • Use the copyq queue.

  • Use ARE jobs.
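
As an illustration of the copyq option, the sketch below writes a small download job script to a file. The file name download_job.pbs, the walltime, and the URL are placeholders chosen for this example, not values from NCI documentation; the project code follows the job submission examples above.

```shell
#!/bin/bash
# Sketch only: create a job script that downloads a file on the copyq queue.
# Assumption: the URL and file names below are placeholders.
cat > download_job.pbs <<'EOF'
#!/bin/bash
#PBS -P PANDORA
#PBS -q copyq
#PBS -l ncpus=1
#PBS -l mem=4GB
#PBS -l walltime=02:00:00

# Run from the directory the job was submitted from.
cd "$PBS_O_WORKDIR"
wget https://example.org/dataset.tar.gz
EOF

# Submit it with:
# qsub download_job.pbs
```

A compute job that needs the downloaded data can then be submitted with a dependency on this job, so that it only starts once the download has succeeded.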