Sydney GPU Cluster
Coming Soon!
How to get access
Access model coming soon! Fill in the expression of interest form.
Contact sih.info@sydney.edu.au with any questions.
Inference endpoints
Coming soon!
Raw GPU Access
Machine Specs
Operating System
NVIDIA Base Command Manager with DGX OS 7
Installed on head nodes, Kubernetes master nodes, and DGX nodes
Includes built-in LDAP authentication
Job Scheduler & Cluster Management:
Three master nodes for managing the cluster
DGX nodes function as Kubernetes worker nodes
Installed on Kubernetes for workload scheduling and AI workload orchestration
Includes namespaces, secrets, backend, and API server configurations
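Because the access model has not been published yet, the following is only an illustrative sketch of how a GPU workload is commonly described on a Kubernetes cluster whose worker nodes run the NVIDIA device plugin. Everything specific in it is an assumption: the `nvidia.com/gpu` resource name (the device plugin's default, not confirmed for this cluster), the pod name, the `my-project` namespace, and the placeholder image are all hypothetical, and the scheduler described above may require a different submission route entirely.

```python
import json

# From the GPU node specs on this page: each DGX H200 node has 8 GPUs,
# so a single pod cannot be given more than 8.
GPUS_PER_NODE = 8


def gpu_pod_manifest(name: str, namespace: str, image: str, gpus: int) -> dict:
    """Build a Kubernetes Pod manifest (as a dict) that requests `gpus` GPUs."""
    if not 1 <= gpus <= GPUS_PER_NODE:
        raise ValueError(f"gpus must be between 1 and {GPUS_PER_NODE}")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "main",
                    "image": image,
                    # nvidia-smi lists the GPUs visible inside the container.
                    "command": ["nvidia-smi"],
                    # Assumption: GPUs are exposed under the NVIDIA device
                    # plugin's default resource name.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # All three identifiers below are hypothetical placeholders.
    manifest = gpu_pod_manifest(
        name="gpu-smoke-test",
        namespace="my-project",
        image="registry.example.org/cuda-base:latest",
        gpus=1,
    )
    print(json.dumps(manifest, indent=2))
```

The printed JSON is in the form that `kubectl apply -f` accepts, but whether users will have direct `kubectl` access to this cluster is not yet known.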
GPU nodes
3 x NVIDIA DGX H200 with:
8 x H200 per node.
1128 GB VRAM per node.
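The per-node figure can be checked with simple arithmetic. The sketch below assumes 141 GB of memory per H200, which comes from NVIDIA's published H200 specification rather than from this page. The cluster-wide totals are derived here, not quoted from the source.

```python
# GPU memory arithmetic for the DGX H200 nodes listed above.
H200_VRAM_GB = 141   # assumption: per-GPU capacity from NVIDIA's H200 spec
GPUS_PER_NODE = 8    # from this page
DGX_NODES = 3        # from this page

vram_per_node_gb = GPUS_PER_NODE * H200_VRAM_GB
total_gpus = DGX_NODES * GPUS_PER_NODE
total_vram_gb = DGX_NODES * vram_per_node_gb

print(f"VRAM per node: {vram_per_node_gb} GB")   # 1128 GB, matching the page
print(f"GPUs in cluster: {total_gpus}")          # 24
print(f"VRAM in cluster: {total_vram_gb} GB")    # 3384 GB
```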
CPU nodes
5 x Dell PowerEdge R760 Server with:
Primary and Secondary Head Nodes with High-Availability configuration for failover protection
Three master nodes for managing Kubernetes
2 x Intel Xeon Gold 5418Y, 2 GHz, 24 cores/48 threads per CPU
16 x 32GB RDIMM, 5600MT/s, Dual Rank
BOSS-N1 controller card with 2 x M.2 480GB (RAID 1)
6.4TB Enterprise NVMe Mixed Use AG Drive U.2 Gen4 with carrier
Broadcom 57416 Dual Port 10GbE BASE-T Adapter, OCP NIC 3.0
Mellanox ConnectX-6 DX Dual Port 100GbE QSFP56 Network Adapter
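For capacity planning, the per-server and combined CPU and memory totals follow from the figures above. This sketch assumes all five R760 servers share the listed configuration and that the core and thread counts are per CPU. The totals are derived, not quoted from the source.

```python
# Aggregate CPU and memory across the Dell PowerEdge R760 servers listed above.
SERVERS = 5            # assumption: all five share this configuration
CPUS_PER_SERVER = 2
CORES_PER_CPU = 24
THREADS_PER_CORE = 2   # 24 cores / 48 threads per CPU
DIMMS_PER_SERVER = 16
DIMM_SIZE_GB = 32

cores_per_server = CPUS_PER_SERVER * CORES_PER_CPU
threads_per_server = cores_per_server * THREADS_PER_CORE
ram_per_server_gb = DIMMS_PER_SERVER * DIMM_SIZE_GB

total_cores = SERVERS * cores_per_server
total_ram_gb = SERVERS * ram_per_server_gb

print(f"Per server: {cores_per_server} cores, "
      f"{threads_per_server} threads, {ram_per_server_gb} GB RAM")
# Per server: 48 cores, 96 threads, 512 GB RAM
print(f"All servers: {total_cores} cores, {total_ram_gb} GB RAM")
# All servers: 240 cores, 2560 GB RAM
```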
Storage
DDN EXAScaler Parallel File System (1 PB)
Provides high-performance shared storage to DGX nodes
Connected via 8x 200GbE Active Optical Cables
1 x Controller: DDN ES400NVX2-NDR200-SE with:
2 x SE2420-EBOD NVMe Expansion Enclosures with 24 NVMe drives each
Total 48 x 30.72TB QLC NVMe G4 4K SSD drives
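The storage figures above can be cross-checked with a short calculation. The sketch below assumes decimal units (1 PB = 1000 TB) and treats the 8 x 200GbE links as a theoretical line-rate ceiling, not a measured throughput. The page does not explain the gap between raw drive capacity and the quoted 1 PB. It is presumably data protection and file system overhead, but that is an assumption.

```python
# Raw capacity and link bandwidth for the DDN storage listed above.
ENCLOSURES = 2
DRIVES_PER_ENCLOSURE = 24
DRIVE_TB = 30.72
QUOTED_CAPACITY_TB = 1000.0   # assumption: 1 PB taken as 1000 TB (decimal)
LINKS = 8
LINK_GBIT_S = 200

drive_count = ENCLOSURES * DRIVES_PER_ENCLOSURE
raw_tb = drive_count * DRIVE_TB
quoted_fraction = QUOTED_CAPACITY_TB / raw_tb

aggregate_gbit_s = LINKS * LINK_GBIT_S
aggregate_gbyte_s = aggregate_gbit_s / 8   # 8 bits per byte

print(f"Drives: {drive_count}")                               # 48
print(f"Raw capacity: {raw_tb:.2f} TB")                       # 1474.56 TB
print(f"Quoted 1 PB as share of raw: {quoted_fraction:.1%}")  # 67.8%
print(f"Link ceiling: {aggregate_gbit_s} Gb/s "
      f"({aggregate_gbyte_s:.0f} GB/s)")                      # 1600 Gb/s (200 GB/s)
```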
Networking
Cumulus Linux
Runs on NVIDIA Spectrum-2 200GbE switches
Supports MLAG, VLANs, and routing configurations
InfiniBand Networking for high-speed GPU-to-GPU communication
Out-of-Band (OOB) Management