JET Research Cluster for Staff
Getting Started
Welcome to the HPC @IMG @UNIVIE. Please follow these steps to become a productive member of our department and make good use of the computing resources. Efficiency is key.
Steps:
- Getting Started
- Connect to Jet
- Load environment (libraries, compilers, interpreter, tools)
- Check out code, program, compile, test
- Submit jobs to the compute nodes using Slurm
System Information
Last Update: 16.07.2024
Node Setup:
- 2x Login Nodes (jet01, jet02)
- 7x Compute Nodes INTEL (jet03-jet09)
- 10x Compute Nodes AMD (jet10-jet19)
- 5x Storage Nodes
Example INTEL Node
Type | Detail |
---|---|
Product | ThinkSystem SR630 |
Processor | Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz |
Cores | 2 CPU, 20 physical cores per CPU, total 80 logical CPU units |
CPU Time | 350 kh |
Memory | 755 GB Total |
Memory/Core | 18.9 GB |
Network | 100 Gbit/s (Infiniband) |
Example AMD Node
Type | Detail |
---|---|
Product | ThinkSystem SR635 V3 |
Processor | AMD EPYC 9454P 48-Core Processor |
Cores | 1 CPU, 48 physical cores per CPU, total 96 logical CPU units |
CPU Time | 420 kh |
Memory | 1132 GB Total |
Memory/Core | 23.5 GB |
Network | 200 Gbit/s (Infiniband) |
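If you want to check the hardware of the node you have landed on, standard Linux tools report the same figures as the tables above (a quick sketch; the output differs between the INTEL and AMD nodes):

```bash
# CPU model, sockets, cores per socket and logical CPUs of the current node
lscpu | grep -E 'Model name|Socket|Core|Thread|^CPU\(s\)'

# Total and available memory on the node
free -h
```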
Storage
All nodes are connected to a global file system (GPFS) with about 3.5 PB (~3500 TB) of storage. There is no need to copy files to the compute nodes; your HOME and SCRATCH directories are available under the same paths as on the login nodes.
Paths:
/jetfs/home/[username]
/jetfs/scratch/[username]
/jetfs/shared-data
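To get a quick overview of how much space you are using under these paths, the usual file-system tools work on the GPFS mount as well (a small sketch; your username is substituted via $USER):

```bash
# Capacity and usage of the file system behind your home and scratch directories
df -h /jetfs/home/$USER /jetfs/scratch/$USER

# Total size of everything in your scratch directory
du -sh /jetfs/scratch/$USER
```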
Software
The typical installation of an Intel cluster has the INTEL compiler suite (intel-parallel-studio) and the open-source GNU compilers installed. Based on these two different compilers (intel, gnu), there are usually two versions of each scientific software package.
Major Libraries:
- OpenMPI (3.1.6, 4.0.5)
- HDF5
- NetCDF (C, Fortran)
- ECCODES from ECMWF
- Math libraries, e.g. intel-mkl, lapack, scalapack
- Interpreters: Python, Julia
- Tools: cdo, ncl, nco, ncview
These software libraries are usually handled by environment modules. Need another library? Send a mail to IT.
Currently installed modules
(Bash listing of the currently installed modules; not reproduced here.)
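A minimal sketch of how the environment modules are typically used (assuming the standard `module` command is available, as the section above implies; module names and versions are illustrative and must be taken from the actual listing):

```bash
# Show all modules that can be loaded on the cluster
module avail

# Load a compiler/MPI/netCDF combination (names are examples,
# take the exact names from the module avail output)
module load openmpi/4.0.5
module load netcdf-fortran

# Inspect and reset the currently loaded environment
module list
module purge
```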
Jupyterhub
The Jet Cluster serves a JupyterHub with a JupyterLab that launches on the JET cluster compute nodes and allows users to work directly on the cluster as well as to submit jobs.
Steps:
- Open https://jupyter.wolke.img.univie.ac.at from within the VPN or the UNI network.
- Login with your Jet Credentials
- Choose a job
- The JupyterLab will be launched and will be available to you until you log out or the walltime is exceeded (depends on the job you launch).
Please use the resources responsibly. We trust that you apply a fair-share policy and collaborate with your colleagues.
There are several kernels available as modules; how to use other kernels can be found here.
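As a rough sketch of one common approach for adding your own kernel (assuming you have a Python environment of your own with ipykernel installed; the environment name is just an example):

```bash
# Inside your own Python environment (virtualenv or conda), install ipykernel
pip install ipykernel

# Register the environment as a user-level kernel; "myenv" is an example name
python -m ipykernel install --user --name myenv --display-name "Python (myenv)"
```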
User Quotas and Restrictions
Currently there are no restrictions on the duration or the resources you can request. On JET the nodes can be shared between jobs, whereas on VSC nodes are job-exclusive. Please follow these rules of collaboration:
Jobs:
- Number of CPUs, keyword: ntasks, e.g. 1 node == 2x20 physical cores
- Memory, keyword: mem, e.g. each node has up to 754 GB
- Runtime, keyword: time, e.g. try to split jobs into shorter pieces
Consider the following example:
You can relatively easily use 1 node for more than 3 days with your jobs running, but do not use all nodes and block them for all other users for 3 days. If you need multiple nodes, split the jobs into shorter runtimes. In general it is better to have more, smaller jobs that are processed in a chain. Also try not to request resources that you do not use.
Have a look at the resources used by your jobs using the /usr/bin/time command, or look here.
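For example (a minimal sketch; ./my_program is a placeholder for your own executable):

```bash
# -v (verbose) reports wall time, CPU time and the maximum resident set size
/usr/bin/time -v ./my_program
```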
Sample Job
(Slurm example on JET; the original script is not reproduced here.)
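As a substitute, here is a minimal sketch of what a Slurm batch script on JET could look like (resource values and module names are illustrative and need to be adapted; see the keywords ntasks, mem and time above):

```bash
#!/bin/bash
#SBATCH --job-name=example          # name shown in the queue
#SBATCH --ntasks=40                 # number of CPUs (keyword ntasks)
#SBATCH --mem=50G                   # memory for the whole job (keyword mem)
#SBATCH --time=12:00:00             # walltime limit (keyword time)
#SBATCH --output=%x.%j.out          # output file: jobname.jobid.out

# Load the software environment the program needs (module names are examples)
module purge
module load openmpi/4.0.5

# Run the program on the requested number of tasks
srun ./my_model
```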
Storage limitations apply mainly to the HOME directory (default: 100 GB), but there are some general restrictions as well.
Login nodes
On the login nodes (jet01/jet02) processes can run without any queue. However, please make sure that other users are not affected too much when these nodes are used for processing.
On jet02 the JupyterHub is running, and on jet01 a VNC server can be launched for GUI applications.
For how to use a VNC server, go to VNC.
Network drives
Transferring files between SRV and JET is not necessary. The file system is mounted on the JET nodes jet01/jet02 and vice versa. These mounted drives transfer the data via the network, so latencies might be higher.
(Listing of the file systems mounted on the login nodes; not reproduced here.)
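A quick sketch of how to inspect which file systems are mounted on a node (standard Linux tools; the actual mount points and file-system types on JET may differ):

```bash
# Show network file systems (e.g. NFS) mounted on this node
findmnt -t nfs,nfs4

# Show usage of all mounted file systems, including their type
df -hT
```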
Slurm
The job manager is called Slurm and is used on numerous other HPC systems in the EU. There is plenty of online documentation to consult for guidance. Please have a look at the VSC tutorials or training courses.
There is some more information about how to use Slurm (the most common commands are sketched after the list below):
- Summary
- a more advanced Slurm Tutorial on Gitlab (🔒 staff only)
- VSC Slurm introduction
- VSC SLURM presentation
- Slurm Quick Start Guide - Manual Page
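As a quick reference, the most common Slurm commands (the job ID and script name are placeholders):

```bash
sbatch job.sh              # submit a batch script to the queue
squeue -u $USER            # list your pending and running jobs
scancel 123456             # cancel a job by its job ID
sinfo                      # show partitions and node states
scontrol show job 123456   # detailed information about a single job
```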
Job efficiency reports
Since 2024 there is a new feature that allows you to check how well your jobs ran and get information on the efficiency of the resources used. The report is available once the job has finished.
(Example job efficiency report; not reproduced here.)
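As a sketch, similar information can be queried from Slurm's own accounting tools (the job ID is a placeholder; the exact command that generates the JET efficiency report may differ):

```bash
# Accounting information for a finished job (123456 is a placeholder job ID)
sacct -j 123456 --format=JobID,JobName,Elapsed,TotalCPU,MaxRSS,State

# If the seff contrib tool is installed, it summarises CPU and memory efficiency
seff 123456
```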
Created: January 26, 2023