Maguire

Maguire is an HPC cluster available to selected Trinity researchers. Funding was provided by Research Ireland.

Maguire contains 3 compute nodes, each with the following characteristics.

  • 8 NVIDIA H200 GPUs.
  • 96 CPU cores (2 × 48-core Intel Xeon Platinum 8468 sockets).
  • 2 TB of RAM.
  • 8 × 3.5 TB NVMe scratch disks.
  • 8 Mellanox ConnectX-7 InfiniBand adapters.

To log in to Maguire, use your Trinity computer account credentials.

To request access to Maguire, please contact Research IT Support: rit-support@tcd.ie. Research IT will request approval for access requests from the authorising parties.

Maguire is accessible from the College network, including via the VPN. To log in, connect to maguire01.tchpc.tcd.ie using the usual SSH instructions, with your Trinity computer account username in the format username@COLLEGE.TCD.IE and your Trinity computer account password.

ssh -l username@COLLEGE.TCD.IE maguire01.tchpc.tcd.ie

Be sure to replace username with your Trinity computer account username.

Software is installed with our usual modules system. You can view the available software with module avail and load it with the module load command.
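
For example, to list and load a module (the module name here is one that appears later in this document; any name shown by module avail works the same way):

                $ module avail
                $ module load python/3.12.12-gcc-11.5.0-6kjbece
                $ module list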

Running jobs must be done via the Slurm scheduler.

Example batch job parameters:

#!/bin/bash
#SBATCH -N 1                  # one node
#SBATCH --gpus=1              # one GPU
#SBATCH --cpus-per-task=8     # eight CPU cores per task
#SBATCH --mem=32GB            # 32 GB of RAM
#SBATCH --time=01:00:00       # one hour wall time
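
Save the parameters above in a job script, add the commands you want to run, and submit it with sbatch (myjob.sh is a placeholder name); you can then check its state in the queue:

                $ sbatch myjob.sh
                $ squeue -u $USER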

Interactive allocation example:

salloc -N 1 --gpus=1 --cpus-per-task=8 --mem=32G --time=01:00:00
srun --jobid=$SLURM_JOBID --pty bash
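
Once the interactive shell starts on the compute node, you can confirm that the requested GPU is visible with:

                $ nvidia-smi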

See the HPC clusters usage documentation for further instructions.

Caveats and Warnings

Maguire is still in development and is not yet a stable service. It may be changed or taken offline, and access may be revoked, for all users or for individuals, without notice. Running jobs may be cancelled at any time without warning.

The file system is not backed up and should not be relied upon to preserve data. Data saved to Maguire may be lost and should also be saved elsewhere.

Not all services that the cluster is expected to offer are available yet.

Software Stack

Multiple CUDA versions

Status: Tested

The GPU nodes now have CUDA 13.0 installed by default (as of 21-11-2025); it is not yet clear whether this is the base CUDA SDK or the HPC Toolkit.

The CUDA SDK version 12.9 can also be used on the GPU nodes with the usual setup, either on the command line or in your .bashrc:

                export PATH=$PATH:/usr/local/cuda-12.9/bin
                export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-12.9/lib64
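
Note that because the export above appends to PATH, the default CUDA 13.0 nvcc may still take precedence if it appears earlier in PATH. A quick check of which toolkit is active:

                $ which nvcc
                $ nvcc --version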

Status: Installed but untested

Other versions can easily be installed with Spack (py-torch already installs 12.9.1).

CUDA 12.2 runtime

Status: Installed but untested

Installed as requested. To use it, either run:

                $ module load cuda/12.2.2-none-none-soeglmb

Or:

                $ . /lustre/disk/home/support/spack/1.1.0/share/spack/setup-env.sh
                $ spack load cuda@12.2.2

CUDA 12.8 runtime

Status: Installed but untested

Installed as requested. To use it, either run:

                $ module load cuda/12.8.1-none-none-bgo3i4m

Or:

                $ . /lustre/disk/home/support/spack/1.1.0/share/spack/setup-env.sh
                $ spack load cuda@12.8.1

CUDA 12.9 runtime

Status: Installed but untested

Also installed. To use it, either run:

                $ module load cuda/12.9.1-none-none-hhy4nk4

Or:

                $ . /lustre/disk/home/support/spack/1.1.0/share/spack/setup-env.sh
                $ spack load cuda@12.9.1

nvidia-smi

Status: Tested

nvidia-smi is not part of a CUDA runtime; it comes with the latest system-wide driver install: Driver Version 580.82.07, CUDA Version 13.0.

Intel Compilers

Status: Installed but untested

We have installed Intel oneAPI 2025.2.1. To use it, run:


                $ . /lustre/disk/home/support/intel/oneapi-2025.2.1.044/setvars.sh
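
After sourcing setvars.sh the compilers should be on your PATH; a quick check (assuming this oneAPI release ships the LLVM-based icx and ifx compilers):

                $ icx --version
                $ ifx --version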

Python

Status: Tested

The system comes with python3.11 and python3.9. In addition:

Python 3.12.12

Status: Installed but untested

Python 3.12.12 has been installed as requested by Yuelai Xin. To use it, either run:


                $ module load python/3.12.12-gcc-11.5.0-6kjbece

Or:


                $ . /lustre/disk/home/support/spack/1.1.0/share/spack/setup-env.sh
                $ spack load python@3.12.12

Python 3.8

Status: Installed but untested

Python 3.8 is difficult to set up because it is fairly old and modern compilers refuse to compile it. The best approach was to create a conda environment with it. First load miniconda:

                $ module load miniconda3/25.5.1-none-none-kg3fpfc

Or:

                $ . /lustre/disk/home/support/spack/1.1.0/share/spack/setup-env.sh
                $ spack load miniconda3

Then:


                $ conda create -n python38-test-env python=3.8
                $ conda activate python38-test-env
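
Inside the activated environment, you can verify the interpreter version:

                $ python --version   # should report Python 3.8.x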

R

Status: Installed but untested

R 4.5.1 is installed; if earlier versions are needed, we should be able to install them as well. To use it, either run:


                $ module load r/4.5.1-gcc-11.5.0-n5cpcbq

Or:


                $ . /lustre/disk/home/support/spack/1.1.0/share/spack/setup-env.sh
                $ spack load r@4.5.1
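
Once loaded, a quick check that R is available:

                $ Rscript -e 'R.version.string'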

PyTorch

Status: Installed but untested

The recommended way of installing PyTorch is with conda; it is in fact already installed as a dependency of torchvision (see below). To load miniconda:


                $ module load miniconda3/25.5.1-none-none-kg3fpfc

Or:


                $ . /lustre/disk/home/support/spack/1.1.0/share/spack/setup-env.sh
                $ spack load miniconda3

Status: Installed but untested

We have also installed PyTorch 2.9.0 system-wide. It can be loaded in the following way:


                $ module load py-torch/2.9.0-gcc-11.5.0-onbtsjm

Or:


                $ . /lustre/disk/home/support/spack/1.1.0/share/spack/setup-env.sh
                $ spack load py-torch@2.9.0
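
A quick check that the system-wide build can see the GPUs (run this inside a Slurm allocation that includes a GPU):

                $ python -c "import torch; print(torch.__version__, torch.cuda.is_available())"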

There is a potential problem with this install at the moment: by default it uses OpenMPI 5.0.9, which needs Slurm built with PMIx support to run across multiple nodes, and the current Slurm build does not provide it. Once HPE upgrades Slurm from 24.11.6 to 25.05.4 and adds PMIx, we should be able to run PyTorch across multiple nodes without problems.

Tensorflow

Status: Installed but untested

We have installed TensorFlow 2.20.0 in miniconda; this seems to be the approach recommended in the TensorFlow documentation.

                $ module load miniconda3/25.5.1-none-none-kg3fpfc

Or:

                $ . /lustre/disk/home/support/spack/1.1.0/share/spack/setup-env.sh
                $ spack load miniconda3
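
With the conda environment active, a quick check that TensorFlow can see the GPUs (again, run inside a Slurm allocation that includes a GPU):

                $ python -c "import tensorflow as tf; print(tf.__version__, tf.config.list_physical_devices('GPU'))"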

We are still working on a system-wide install; Spack seems to freeze when we try to install it with CUDA support.

The recommended version is 2.20.0, but installing earlier versions with Spack should be easy once we get it working.

There are a few other packages that we could also easily install if needed: py-tensorflow-datasets, py-tensorflow-estimator, py-tensorflow-hub, py-tensorflow-metadata and py-tensorflow-probability.

llama.cpp

CPU version

Status: Installed but untested

We downloaded the source code and compiled it. A copy is in:


                    /lustre/disk/home/support/apps/llama.cpp/

The binaries are in:


                    /lustre/disk/home/support/apps/llama.cpp/build/bin/
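
As a sketch of how a build might be invoked (binary names vary between llama.cpp releases, and the model path is a placeholder; you need to supply your own GGUF model file):

                    $ /lustre/disk/home/support/apps/llama.cpp/build/bin/llama-cli \
                          -m /path/to/model.gguf -p "Hello" -n 32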

Once the user has confirmed that it is working properly, we can build a module for easier usage.

Surprisingly enough, the default build of llama.cpp does not seem to use the GPU, so we will try to compile additional versions so that users can run either. The GPU version will probably be the fastest, but the option of running on the CPU while the GPUs are busy could be useful. We can keep multiple builds in separate folders under /lustre/disk/home/support/apps/llama.cpp-<version>/.

GPU version

Status: Installed but untested

We downloaded the source code and compiled it with CUDA support. A copy is in:


                    /lustre/disk/home/support/apps/llama.cpp_-_CUDA/

Docker

Status: Installed but untested

Seán has installed Apptainer instead; see the Apptainer usage documentation.

Conda

Status: Tested

The system-wide install of miniconda worked; see below.


                $ module load miniconda3/25.5.1-none-none-kg3fpfc

Or:


                $ . /lustre/disk/home/support/spack/1.1.0/share/spack/setup-env.sh
                $ spack load miniconda3

To be installed:

We could also set up a system-wide install of Anaconda that all users get by default (i.e. automatically configured in their .bashrc).

There could be an issue with Anaconda's Terms of Service: "Use of Anaconda's offerings at an organization of more than 200 employees/contractors requires a paid business license unless your organization is eligible for discounted or free use."

Conda packages

We installed a system-wide miniconda and tested that the packages in the sections below can be installed with pip or conda. To load it:


                $ . /lustre/disk/home/support/spack/1.1.0/share/spack/setup-env.sh
                $ spack load miniconda3

The list of installed packages can be seen with:


                $ conda list

Jupyter Notebooks

Status: Installed but untested

Users should be able to install their preferred version via pip:


                $ pip install --user jupyterlab

Then:

1) Run Jupyter on maguire01:

                $ jupyter notebook --ip 0.0.0.0 --port=8888

2) Port forward; from your own computer run:

                $ ssh -N -L 8080:maguire01.tchpc.tcd.ie:8888 USERNAME@maguire01.tchpc.tcd.ie

Note: be sure to replace USERNAME with your username.

3) Point your browser to:

                http://127.0.0.1:8080/

To quit Jupyter Notebooks: 1) on maguire01, press Ctrl-C in the terminal running it; 2) close your browser; 3) quit the port-forwarding SSH session (Ctrl-C in that terminal, or closing it, should do).

PyTorch Lightning

Status: Installed but untested

Users should be able to install their preferred version via pip:


                $ pip install --user lightning

Torchvision

Status: Installed but untested

Users should be able to install their preferred version via pip:


                $ pip install --user torchvision

Hugging Face Libraries

Status: Installed but untested

Users should be able to install their preferred version via pip, including the CLI and PyTorch extras:


                $ pip install --user 'huggingface_hub[cli,torch]'
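
As a sketch of typical CLI usage after installing (gpt2 is just an example repository; the login step expects a Hugging Face access token, and the download subcommand requires a reasonably recent huggingface_hub):

                $ huggingface-cli login
                $ huggingface-cli download gpt2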

NumPy

Status: Installed but untested

Users should be able to install their preferred version via pip:


                $ pip install --user numpy

SciPy

Status: Installed but untested

Users should be able to install their preferred version via pip:


                $ pip install --user scipy

Pandas

Status: Installed but untested

Users should be able to install their preferred version via pip:


                $ pip install --user pandas

xgboost

Status: Installed but untested

Users should be able to install their preferred xgboost version by running:


                $ pip install --user xgboost

cuML / cuDF

Status: Installed but untested

Users should be able to install their preferred version via pip:


                $ pip install --user "cudf-cu13==25.10.*" "cuml-cu13==25.10.*"

Braindecode

Status: Installed but untested

Users should be able to install their preferred version via pip:


                $ pip install --user braindecode

Braindecode can also be installed alongside MOABB to download open datasets:


                $ pip install --user moabb

Skorch

Status: Installed but untested

Users should be able to install their preferred version via pip:


                $ pip install --user -U skorch

It turned out skorch was already installed by a previous package; it seems to be a dependency of Braindecode.

Captum

Status: Not Installed

Not installed; this is left to users to install in their own environments. They should be able to install their preferred version via pip:


                $ pip install --user captum

Captum's dependencies differentiate it from other packages: it requires numpy 1.26.4 while other packages use numpy 2.2.6. Hence, we recommend installing it in a different conda environment from the other packages, to prevent NumPy incompatibilities.

MNE

Status: Installed but untested

It is already installed; it came in with the NumPy install:

                  >>> import numpy as np
                  >>> import mne

scikit-learn

Status: Installed but untested


            $ pip3 install --user -U scikit-learn

pyedflib

Status: Installed but untested

Users should be able to install their preferred version via pip or conda:


                $ pip install --user pyEDFlib

Matplotlib

Status: Installed but untested

Users should be able to install their preferred version via pip:


                $ pip install --user matplotlib

When I tried, it turned out that it was already installed by another package.

Seaborn

Status: Installed but untested

Users should be able to install their preferred version via pip:


                $ pip install --user seaborn

When I tried, it turned out that it was already installed by another package.

Joblib

Status: Installed but untested

Users should be able to install their preferred version via pip:


                $ pip install --user joblib

When I tried, it turned out that it was already installed by another package.

MLflow

Status: Installed but untested

Users should be able to install their preferred version via pip:


                $ pip install --user mlflow

When I tried, it turned out that it was already installed by another package.

HDF5 - python

Status: Installed but untested

Users should be able to install their preferred version via pip:


                $ pip install --user h5py

When I tried, it turned out that it was already installed by another package!

Ollama

Status: Installed but untested

Users should be able to install their preferred version via pip (note that the ollama package on PyPI is the Python client library; it talks to a separately installed Ollama server):


                $ pip install --user ollama

TO DO

VS Code

Status: To be installed

There are deb, rpm, tar and CLI downloads; this looks like it will have to be handled by the sysadmins.