Clusters at CÉCI
The aim of the Consortium is to provide researchers with access to powerful computing equipment (clusters). Clusters are installed and managed locally at the different sites of the universities taking part in the Consortium, but they are accessible by all researchers from the member universities. A single login/passphrase is used to access all clusters through SSH.
All of them run Linux and use Slurm as the job manager. Basic parallel computing libraries (OpenMP, MPI, etc.) are installed, as well as optimized computing subroutine libraries (BLAS, LAPACK, etc.). Common interpreters such as R, Octave and Python are also installed. See each cluster's FAQ for more details.
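Once a CÉCI account has been created and the private key installed, connecting to any cluster follows the same pattern. The login name and gateway host below are placeholders; the exact front-end names are listed in the per-cluster sections further down.
# connect with the CÉCI private key; "myuser" is a placeholder login
ssh -i ~/.ssh/id_rsa.ceci myuser@nic5.uliege.be
# some front-ends are only reachable through a gateway; with OpenSSH this can be
# done with ProxyJump (the gateway name here is purely illustrative)
ssh -i ~/.ssh/id_rsa.ceci -J myuser@gateway.example.org myuser@lemaitre4.cism.ucl.ac.be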
Cluster | Host | CPU type | CPU count* | RAM/node | Network | Filesystem** | Accelerator | Max time | Preferred jobs*** |
---|---|---|---|---|---|---|---|---|---|
Lemaitre4 | UCLouvain | Genoa 2.4 GHz | 5120 (40 x 128) | 766GB | HDR Ib | BeeGFS 320 TB | None | 2 days | MPI |
NIC5 | ULiège | Rome 2.9 GHz | 4672 (73 x 64) | 256 GB..1 TB | HDR Ib | BeeGFS 520 TB | None | 2 days | MPI |
Hercules2 | UNamur | Naples 2 GHz | 1024 (28 x 32 + 2 x 64) | 256 GB..2 TB | 10 GbE | NFS 80 TB | 8x NVidia A40, 4x NVidia A6000 | 15 days | serial / SMP |
Dragon2 | UMons | SkyLake 2.60 GHz | 592 (17 x 32 + 2 x 24) | 192..384 GB | 10 GbE | RAID0 3.3 TB | 4x Volta V100 | 21 days | serial / SMP |
Dragon1 | UMons | SandyBridge 2.60 GHz | 416 (26 x 16) + 32 (2 x 16) | 128 GB | GbE | RAID0 1.1 TB | 4x Tesla C2075, 4x Tesla Kepler K20m | 41 days | serial / SMP |
Lemaitre3* | UCL | SkyLake 2.3 GHz / Haswell 2.6 GHz | 1872 (78 x 24) / 112 (4 x 28) | 95 GB / 64 GB | Omnipath | BeeGFS 440 TB | None | 2 days / 6 hours | MPI |
NIC4* | ULiège | SandyBridge 2.0 GHz / IvyBridge 2.0 GHz | 2048 (120 x 16 + 8 x 16) | 64 GB | QDR Ib | FHGFS 144 TB | None | 3 days | MPI |
Vega* | ULB | Bulldozer 2.1 GHz | 896 (14 x 64) | 256 GB | QDR Ib | GPFS 70 TB | None | 14 days | serial / SMP / MPI |
Hercules* | UNamur | SandyBridge 2.20 GHz | 512 (32 x 16) | 64..128 GB | GbE | NFS 20 TB | None | 63 days | serial / SMP |
Lemaitre2* | UCL | Westmere 2.53 GHz | 1380 (115 x 12) | 48 GB | QDR Ib | Lustre 120 TB | 3x Quadro Q4000 | 3 days | MPI |
Hmem* | UCL | MagnyCours 2.2 GHz | 816 (17 x 48) | 128..512 GB | QDR Ib | FHGFS 30 TB | None | 15 days | SMP |
Tier-1 Cluster
The Consortium also provides users with access to Tier-1 facilities, which are not operated by the universities. How to get access
Cluster | Host | CPU type | CPU count* | RAM/node | Network | Filesystem** | Accelerator | Max time | Preferred jobs*** |
---|---|---|---|---|---|---|---|---|---|
Lucia | Cenaero | Milan 2.45 GHz / Milan 2.6 GHz | 38400 (300 x 128) / 1600 (50 x 32) | 241 GB | HDR Ib | GPFS 3.2 PB | None / 200 (50 x 4) Tesla A100 | 48 hours | MPI / GPU |
Zenobe | Cenaero | Haswell 2.50 GHz / IvyBridge 2.7 GHz | 5760 (240 x 24) / 8208 (342 x 24) | 64..256 GB / 64 GB | QDR Ib / FDR Ib + QDR Ib | GPFS 350 TB | t.b.a. | 24 hours | MPI |
CÉCI clusters capabilities comparison
[Polar plots comparing the capabilities of Lemaitre4, Dragon2, NIC5 and Hercules2]
The CÉCI clusters have been designed to accommodate the large diversity of workloads and needs of the researchers from the five universities. The polar plots above (also known as spider plots) represent the capabilities of the CÉCI clusters. At one end of the spectrum is the sequential workload. That type of workload needs very fast CPUs, accelerators, and often a large maximum job time (several weeks, or months!), which requires limiting the number of jobs a user can run simultaneously to allow a fair sharing of the cluster. At the other end is the massively parallel workload. For such workloads, individual core performance is less crucial, as long as many cores are available. A job is allowed to use a very large number of CPUs, but only for a limited period of time (a few days at most) to ensure a fair sharing of the cluster. Parallel workloads, of course, also require a fast, low-latency network and a large parallel filesystem. Finally, some workloads need huge amounts of memory, be it RAM or local disk. Such workloads often also need many CPUs on the same node to take advantage of the large available memory (so-called "fat nodes").
The clusters have been installed gradually since early 2011, first at UCL, with HMEM being a proof of concept. At that time, the whole account infrastructure was designed and deployed so that every researcher from any member university was able to create an account and log in to HMEM. Then, LEMAITRE2 was set up as the first cluster entirely funded by the F.N.R.S. for the CÉCI. DRAGON1, HERCULES, VEGA and NIC4 followed, in that order, as shown in the timeline below.
Common storage
We provide a central storage solution which is visible from all the frontends and compute nodes of all CÉCI clusters. This system is deployed on a private, dedicated, fast (10 Gbps) network connecting all CÉCI sites. To move to your personal share on this common storage, simply run
cd $CECIHOME
from any of the CÉCI clusters. As that common share is mounted on all of them, each file you copy there will be accessible from any CÉCI cluster.
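For instance (the file name is illustrative), a file copied to the common storage from one cluster is immediately visible on the others:
# on NIC5: copy a result archive to the common CÉCI storage
cp results.tar.gz $CECIHOME/
# later, on Lemaitre4: the same file is available at the same path
ls -lh $CECIHOME/results.tar.gz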
Please take a careful look at the documentation to learn about the other shares for fast transfer of big files between clusters and for group projects.
Lemaitre4
Hosted at UCLouvain (CISM), this cluster consists of more than 5,000 AMD Epyc Genoa cores at 3.7 GHz. All the nodes are interconnected with a 100 Gbps Infiniband HDR network. The compute nodes have access to a 320 TB fast BeeGFS /scratch space.
Suitable for:
MPI Parallel jobs (several dozens of cores) with many communications and/or a lot of parallel disk I/O, and SMP/OpenMP parallel jobs; 2 days max.
Resources
- Home directory (100 GB quota per user)
- Global working directory /scratch ($GLOBALSCRATCH)
- Node local working directory $LOCALSCRATCH dynamically defined in jobs
- default batch queue*
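As an illustration, a minimal Slurm submission script for an MPI job using the resources above could look like the following sketch; the module name, memory request and executable are placeholders to be adapted.
#!/bin/bash
#SBATCH --job-name=mpi_case
#SBATCH --ntasks=128              # number of MPI ranks
#SBATCH --mem-per-cpu=4000M       # illustrative per-core memory request
#SBATCH --time=2-00:00:00         # 2 days, the maximum on this cluster

module load OpenMPI               # placeholder; check 'module avail' for the exact name
cd $GLOBALSCRATCH/my_case         # run from the parallel /scratch filesystem
srun ./my_mpi_program             # srun starts one process per allocated task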
Access/Support:
SSH to lemaitre4.cism.ucl.ac.be (port 22) through your university gateway, with the appropriate login and id_rsa.ceci file.
SUPPORT: CISM
Server SSH key fingerprint: (What's this?)
- ECDSA:
SHA256:krYWLlE32ygG0u8uYbXUNBRTpbxDoDVyCvg3B1zLvGQ
- ED25519:
SHA256:mWlgUkE+tBNbklXLgvrt7pL/3Ohn7uidqFfBUU0fSkQ
- RSA:
SHA256:NIhjzqQgxgkG7K1x4kqoFnNSGrbc9b8AUG8+JT68jg4
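On the first connection, the fingerprint displayed by the SSH client can be compared with the values above; it can also be queried explicitly with standard OpenSSH tools (assuming the front-end is reachable from where the command is run):
# display the ED25519 host key fingerprint of the front-end
ssh-keygen -lf <(ssh-keyscan -t ed25519 lemaitre4.cism.ucl.ac.be 2>/dev/null)
# the printed SHA256 value should match the ED25519 fingerprint listed above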


NIC5
Hosted at the University of Liège (SEGI), this cluster consists of 4672 cores spread across 73 compute nodes, each with two 32-core AMD Epyc Rome 7542 CPUs at 2.9 GHz. The default partition holds 70 nodes with 256 GB of RAM, and a second "hmem" partition with 3 nodes with 1 TB of RAM is also available. All the nodes are interconnected with 100 Gbps Infiniband HDR (blocking factor 1.2:1). The compute nodes have access to a 520 TB fast BeeGFS /scratch space.
Suitable for:
MPI Parallel jobs (several dozens of cores) with many communications and/or a lot of parallel disk I/O, and SMP/OpenMP parallel jobs; 2 days max.
Resources
- Home directory (100 GB quota per user)
- Global working directory /scratch ($GLOBALSCRATCH)
- Node local working directory $LOCALSCRATCH dynamically defined in jobs
- default batch queue* (Max 2 days, 256GB of RAM nodes)
- hmem queue* (Max 2 days, 1 TB RAM nodes, only for jobs that cannot run on the 256 GB nodes; see the example below)
- Max 320 cpus per user
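As a minimal sketch (memory value and executable are illustrative), a job that genuinely needs more than 256 GB on a node can request the hmem partition explicitly:
#!/bin/bash
#SBATCH --partition=hmem          # 1 TB nodes, only for jobs that cannot fit in 256 GB
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16
#SBATCH --mem=500G                # total memory requested on the node (illustrative)
#SBATCH --time=2-00:00:00         # 2 days maximum

./my_large_memory_program         # placeholder executable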
Access/Support:
SSH to nic5.uliege.be (port 22) through your university gateway, with the appropriate login and id_rsa.ceci file.
FAQ: https://www.campus.uliege.be/nic5
SUPPORT: CECI support form
Server SSH key fingerprint: (What's this?)
- ECDSA:
SHA256:xKYPziAtsf0FwtIYYa3NDL1ibZGbhUCf9B5A8p0MR30
- ED25519:
SHA256:27uhpA+zocCxLayg5g1ogej/6zJnx3kLNOftg1IOXpE
- RSA:
SHA256:oHCr1TlkQb+4Sjq/9wzBmsd8v2QfP9jJJRO+L2284gU


HERCULES2
Hosted at the University of Namur, this system currently consists of 1536 cores spread across 30 AMD Epyc Naples and 32 Intel Sandy Bridge compute nodes. The AMD group comprises 24 nodes with a single 32-core AMD Epyc 7551P CPU at 2.0 GHz and 256 GB of RAM, 4 nodes with the same CPU and 512 GB of RAM, and 2 nodes with dual 32-core AMD Epyc 7501 CPUs at 2.0 GHz and 2 TB of RAM. The Intel nodes have dual 8-core Xeon E5-2660 CPUs at 2.2 GHz and 64 or 128 GB of RAM (8 nodes). All the nodes are interconnected by a 10 Gigabit Ethernet network and have access to three NFS file systems with a total capacity of 100 TB.
Suitable for:
Shared-memory parallel jobs (OpenMP or Pthreads), or resource-intensive sequential jobs, especially those requiring a lot of memory.
Resources
- Home directory (200 GB quota per user)
- Working directory /workdir (400 GB per user) ($WORKDIR)
- Local working directory /scratch ($LOCALSCRATCH) dynamically defined in jobs
- Nodes have access to internet
- default queue* (Max 15 days)
- hmem queue* (at least 64GB per core, Max 15 days)
- Max 128 cpus/user on all partitions
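A minimal sketch of a shared-memory (OpenMP) submission script matching these limits, with a placeholder executable:
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=32        # one full 32-core AMD Epyc node
#SBATCH --mem-per-cpu=7G          # illustrative; stays within a 256 GB node
#SBATCH --time=15-00:00:00        # 15 days maximum

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
./my_openmp_program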
Access/Support:
SSH to hercules2.ptci.unamur.be (port 22) with the appropriate login and id_rsa.ceci file.
SUPPORT: ptci.support@unamur.be
Server SSH key fingerprint: (What's this?)
MD5:66:50:e1:67:91:d8:17:1e:b7:be:48:00:e2:2c:7a:9f
- ED25519:
SHA256:fHuc0Y+QuAZW2FrI9NXrfDt2CeDmVWD6wHeDW4I3ztw
- RSA:
SHA256:SyLaaBe7CuO7Dpa6vJa0vbAUxnYSpl30xaJo5yBF//c


DRAGON2
Hosted at the University of Mons, this cluster is made of 17 compute nodes, each with two 16-core Intel Skylake Xeon 6142 processors at 2.6 GHz; 15 nodes have 192 GB of RAM and 2 have 384 GB, all of them with 3.3 TB of local scratch disk space. Two additional nodes, each with two 12-core Intel Skylake Xeon 6126 processors at 2.6 GHz, host two high-end NVidia Tesla V100 GPUs each (5120 CUDA cores/16 GB HBM2/7.5 TFlops double precision). The compute nodes are interconnected with a 10 Gigabit Ethernet network.
Suitable for:
Long (max. 21 days) shared-memory parallel jobs (OpenMP or Pthreads), or resource-intensive (cpu speed and memory) sequential jobs.
Resources
- Home directory (40GB quota per user)
- Local working directory $LOCALSCRATCH (/scratch)
- Global working directory $GLOBALSCRATCH (/globalscratch)
- No internet access from nodes
- long queue* (Max 21 days, 48 cpus/user)
- gpu queue* (Max 5 days, 24 cpus/user, 1 gpu/user; see the example script below)
- debug queue* (Max 30 minutes, 48 cpus/user)
- Generic resource*: gpu
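A minimal sketch of a submission script for the gpu queue; the partition and module names are placeholders to be checked against the cluster documentation:
#!/bin/bash
#SBATCH --partition=gpu           # the gpu queue mentioned above (name assumed)
#SBATCH --gres=gpu:1              # at most one GPU per user
#SBATCH --cpus-per-task=12
#SBATCH --time=5-00:00:00         # 5 days maximum on the gpu queue

module load CUDA                  # placeholder; check 'module avail'
./my_cuda_program                 # placeholder executable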
Access/Support:
SSH to dragon2.umons.ac.be (port 22) with the appropriate login and id_rsa.ceci file.
SUPPORT: CECI Support form
Server SSH key fingerprint: (What's this?)
MD5:0e:a7:21:df:a5:a0:27:6c:47:ba:61:57:76:d0:82:ad
SHA256:LEX1JwKes2Sg1P+95Ymf+uwwrVyZaEjUMts5xejtW9A


LEMAITRE3
This cluster was decommissioned in 2024.
Lemaitre3 was installed to replace Lemaitre2. It is hosted at Université catholique de Louvain (CISM). It features 78 compute nodes with two 12-core Intel SkyLake 5118 processors at 2.3 GHz and 95 GB of RAM (3970 MB/core), interconnected with an OmniPath network (OPA-56Gbps), and having exclusive access to a fast 440 TB BeeGFS parallel filesystem.
Suitable for:
Massively parallel jobs (MPI, several dozens of cores) with many communications and/or a lot of parallel disk I/O, 2 days max.
Resources
- Home directory (100G quota per user)
- Working directory /scratch ($GLOBALSCRATCH)
- Nodes have access to internet
- Max 100 running jobs per user
- Default queue* (max 2 days walltime per job, SkyLake processors) and debug queue (max 6 hours, Haswell processors)
Access/Support:
SSH to lemaitre3.cism.ucl.ac.be (port 22) with the appropriate login and id_rsa.ceci file.
SUPPORT: egs-cism@listes.uclouvain.be
Server SSH key fingerprints: (What's this?)
- ECDSA:
SHA256:1Z6M2WISLylvdH9gD8vHqJ9Z7bCDdJ03avlEXO9BKsc
- ED25519:
SHA256:63mf1cm89YoPvZnpVnUXn4JjNiIpafSCfuXG+Z/LzrI
- RSA:
SHA256:eWHb7N10/Wn+sdG2ED8NqudyZ2kcWTiR33BCq2PKD7Y


NIC4
New CÉCI accounts are no longer created on NIC4, and existing accounts are no longer automatically renewed. Existing users are strongly encouraged to back up their important data (/home and /scratch), delete unneeded files, and migrate to NIC5.
Hosted at the University of Liège (SEGI facility), it features 128 compute nodes with two 8-core Intel E5-2650 processors at 2.0 GHz and 64 GB of RAM (4 GB/core), interconnected with a QDR Infiniband network, and having exclusive access to a fast 144 TB FHGFS parallel filesystem.
Suitable for:
Massively parallel jobs (MPI, several dozens of cores) with many communications and/or a lot of parallel disk I/O, 3 days max.
Resources
- Home directory (20 GB quota per user)
- Working directory /scratch ($GLOBALSCRATCH)
- Nodes have access to internet
- Default queue* (3 days, 448 cores max per user, 64 jobs max per user, among which max 32 running, 256 CPUs max per job)
Access/Support:
SSH to login-nic4.segi.ulg.ac.be (port 22) from your CECI gateway with the appropriate login and id_rsa.ceci file.
FAQ: https://www.campus.uliege.be/nic4
SUPPORT: CECI support form
Server SSH key fingerprint: (What's this?)
MD5:94:6c:d6:cc:f8:ca:b2:d0:79:38:3c:e9:d3:e3:a7:6f
SHA256:5mQYQTjeW1XVYDFhIfMaGyFEJiTen56r2Kyz5ocj72I


VEGA
This cluster was decommissioned in October 2020.
Hosted at the University of Brussels, it features 14 fat compute nodes with 64 cores (four 16-core AMD Bulldozer 6272 processors at 2.1 GHz) and 256 GB of RAM, interconnected with a QDR Infiniband network, and 70 TB of high-performance GPFS storage.
Suitable for:
Many-cores (SMP and MPI) and many single core jobs, 14 days max.
Resources
- Home/Working directory /home ($GLOBALSCRATCH=$HOME, 200GB quota)
- Nodes have access to internet
- Def queue* (Max 14 days, 400 cpus/user, 350 running jobs/user, 1000 jobs in queue per user)


HERCULES
This cluster was decommissioned in August 2019.
Hosted at the University of Namur, this system currently consists of 512 cores spread across 32 Intel Sandy Bridge compute nodes, each with two 8-core E5-2660 processors at 2.2 GHz and 64 or 128 GB of RAM (8 nodes). All the nodes are interconnected by a Gigabit Ethernet network and have access to three NFS file systems for a total capacity of 100 TB.
Suitable for:
Long (max. 63 days) shared-memory parallel jobs (OpenMP or Pthreads), or resource-intensive sequential jobs.
Resources
- Home directory (200 GB quota per user)
- Working directory /workdir (400 GB per user) ($WORKDIR)
- Local working directory /scratch ($TMPDIR) dynamically defined in jobs
- No internet access from nodes
- cpu queue* (Max 63 days, 48 cpus/user)


DRAGON1
Hosted at the University of Mons, this cluster is made of 28 compute nodes: 26 nodes with two 8-core Intel Sandy Bridge E5-2670 processors at 2.6 GHz and 2 nodes with two 8-core Intel Sandy Bridge E5-2650 processors at 2.0 GHz, each node with 128 GB of RAM and 1.1 TB of local scratch disk space. The compute nodes are interconnected with a Gigabit Ethernet network (10 Gigabit for the 36 TB NFS file server). Two of the compute nodes have 2 Tesla M2075 GPUs each (512 GFlops float64) and two others have 2 Tesla Kepler K20m GPUs each (1.1 TFlops float64).
Suitable for:
Long (max. 41 days) shared-memory parallel jobs (OpenMP or Pthreads), or resource-intensive (cpu speed and memory) sequential jobs.
Resources
- Home directory (20GB quota per user)
- Local working directory /scratch ($LOCALSCRATCH)
- No internet access from nodes
- Long queue*: long (Max 41 days, 40 cpus/user, 500 jobs/user)
- Def queue*: batch (Max 5 days, 40 cpus/user, 500 jobs/user)
- Generic resource*: gpu (Max 15 days, gres=gpu:kepler:1 or gres=gpu:tesla:1)
- Generic resource*: lgpu (Max 21 days gres=gpu:1)
Access/Support:
SSH to dragon1.umons.ac.be (port 22) with the appropriate login and id_rsa.ceci file.
FAQ: http://dragon1.umons.ac.be/
SUPPORT: CECI Support form
Server SSH key fingerprint: (What's this?)
MD5: 2e:98:38:cf:99:68:89:2c:1f:6a:0e:19:fb:3b:02:d1
SHA256: dbPE5/40W2M7mF7B+pc4pSo00/bqYwuv4QycU5yv+IQ


LEMAITRE2
This cluster was decommissioned in July 2018.
Hosted at Université catholique de Louvain, it comprises 112 compute nodes with two 6-core Intel E5649 processors at 2.53 GHz and 48 GB of RAM (4 GB/core). The cluster has exclusive access to a fast 120 TB Lustre parallel filesystem. All compute nodes and management (NFS, Lustre, frontend, etc.) are interconnected with a fast QDR Infiniband network.
Suitable for:
Massively parallel jobs (MPI, several dozens of cores) with many communications and/or a lot of parallel disk I/O, 3 days max.
Resources


HMEM
This cluster was decommissioned in July 2020.
Hosted at the Université catholique de Louvain, it mainly comprises 12 fat nodes with 48 cores (four 12-core AMD Opteron 6174 processors at 2.2 GHz). 2 nodes have 512 GB of RAM, 7 nodes have 256 GB and 3 nodes have 128 GB. All the nodes are interconnected with a fast Infiniband QDR network and have a 1.7 TB fast RAID setup for scratch disk space. All the local disks are furthermore gathered in a global 12 TB BeeGFS filesystem.
Suitable for:
Large shared-memory jobs (100+GB of RAM and 24+ cores), 15 days max.
Resources


LUCIA
Hosted at, and operated by, Cenaero, it features a total of 38,400 cores (AMD Milan) with up to 512 GB of RAM and 200 NVidia Tesla A100 GPUs, interconnected with an HDR Infiniband network, and having access to a fast 2.5 PB GPFS (Spectrum Scale) parallel filesystem.
Suitable for:
Massively parallel jobs (MPI, several hundred cores) with many communications and/or a lot of parallel disk I/O, 2 days max.
Resources
- Home directory (200 GB quota per user)
- Working directory /gpfs/scratch
- Project directory /gpfs/projects
- Batch queue + GPU queue (whole node allocation)
Access/Support:
SSH to frontal.lucia.cenaero.be (port 22), from a CÉCI SSH gateway, with the appropriate login and id_rsa.ceci file.
ABOUT: tier1.cenaero.be
DOC: https://doc.lucia.cenaero.be/overview/
GETTING ACCESS: FAQ
CREATE A TIER-1 PROJECT: How to create a Tier-1 project
SUPPORT: https://support.lucia.cenaero.be
Server SSH key fingerprint: (What's this?)
ED25519: SHA256:iO2HH1V1uHUGMEEj2yvSx2TfVUNhUwqdtqdIi31jxEA
ECDSA: SHA256:a5Zv6m0RJsJR4CLDmva2RrUWQea+aUC3/RWyeLYJPdg


ZENOBE
Hosted at, and operated by, Cenaero, it features a total of 13,536 cores (Haswell and IvyBridge) with up to 64 GB of RAM, interconnected with a mixed QDR/FDR Infiniband network, and having access to a fast 350 TB GPFS parallel filesystem.
Suitable for:
Massively parallel jobs (MPI, several hundred cores) with many communications and/or a lot of parallel disk I/O, 1 day max.
Resources
- Home directory (50 GB quota per user)
- Working directory /SCRATCH
- Project directory /projects
- Large queue (1 day max walltime, 96 CPUs minimum and 4320 CPUs maximum per job, whole node allocation)
- Default queue (no time limit but jobs must be restartable)
Access/Support:
SSH to zenobe.hpc.cenaero.be (port 22) with the appropriate login and id_rsa.ceci file.
QUICKSTART: www.ceci-hpc.be/zenobe.html
DOC: tier1.cenaero.be/en/faq-page
ABOUT: tier1.cenaero.be
SUPPORT: it@cenaero.be
Server SSH key fingerprint: (What's this?)
MD5: 47:b1:ab:3a:f7:76:48:05:44:d9:15:f7:2b:42:b7:30
SHA256: 8shVbcnKHt861M4Duwcxpgug6l8mjj+KZu/lmYyYgpY

