Building a Raspberry Pi 5 HPC Cluster with Slurm

Created: 2025-03-22 17:08:38 | Last updated: 2025-03-22 17:08:38 | Status: Public

This tutorial guides you through setting up a small High-Performance Computing (HPC) cluster using 3 Raspberry Pi 5 devices with 8GB RAM each, running Debian Bookworm, and connected via a Gigabit switch.

Table of Contents

- Prerequisites
- Cluster Architecture
- Initial Setup
- Network Configuration
- Shared Storage with NFS
- User Management
- Installing Slurm
- Slurm Configuration
- Testing Your Cluster
- Advanced Configuration
- Troubleshooting

Prerequisites

Hardware Requirements:
- 3× Raspberry Pi 5 (8GB RAM)
- 3× Power supplies (30W USB-C recommended)
- 3× microSD cards (32GB+ recommended)
- 1× Gigabit Ethernet switch
- Ethernet cables
- Optional: USB SSD for shared storage

Software Requirements:
- Debian Bookworm OS
- SSH enabled on all nodes
- Basic Linux knowledge

Cluster Architecture

Our cluster will consist of:
- 1 head node (controller + compute capability)
- 2 compute nodes

graph TD
    Switch[Gigabit Switch]
    HeadNode["Head Node<br/>pi-head<br/>192.168.1.20"]
    Compute1["Compute Node 1<br/>pi-compute01<br/>192.168.1.21"]
    Compute2["Compute Node 2<br/>pi-compute02<br/>192.168.1.22"]
    HeadNode --- Switch
    Compute1 --- Switch
    Compute2 --- Switch
    HeadNode -->|slurmctld<br/>NFS server| HeadNode
    Compute1 -->|slurmd<br/>NFS client| Compute1
    Compute2 -->|slurmd<br/>NFS client| Compute2

Initial Setup

1. Prepare the OS

For each Raspberry Pi:

  1. Flash Debian Bookworm to each microSD card
  2. Boot each Pi and complete initial setup
  3. Update the system:
sudo apt update
sudo apt upgrade -y
  4. Install essential packages:
sudo apt install -y vim git htop ntp build-essential

2. Configure Hostname and Hosts Files

For the head node:

sudo hostnamectl set-hostname pi-head

For compute nodes:

# On first compute node
sudo hostnamectl set-hostname pi-compute01

# On second compute node
sudo hostnamectl set-hostname pi-compute02

Edit /etc/hosts on each node to include all nodes:

sudo nano /etc/hosts

Add the following lines:

192.168.1.20 pi-head
192.168.1.21 pi-compute01
192.168.1.22 pi-compute02
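
To confirm name resolution works, ping each node by hostname from every machine:

for node in pi-head pi-compute01 pi-compute02; do
    ping -c 1 "$node" && echo "$node OK"
done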

Network Configuration

1. Set Static IP Addresses

Note: We set static IPs on our OpenWRT router beforehand. If you’re using a different approach, ensure all nodes have the correct static IPs (192.168.1.20 for head node, 192.168.1.21 for compute01, 192.168.1.22 for compute02).
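
If you would rather configure the address on each Pi directly, here is a minimal sketch using NetworkManager's nmcli; the connection name "Wired connection 1", the gateway, and DNS at 192.168.1.1 are assumptions, so adjust them for your network (and note that some Debian installs use ifupdown instead of NetworkManager):

# Example for the head node; repeat on each node with its own IP
sudo nmcli con mod "Wired connection 1" \
    ipv4.method manual \
    ipv4.addresses 192.168.1.20/24 \
    ipv4.gateway 192.168.1.1 \
    ipv4.dns 192.168.1.1
sudo nmcli con up "Wired connection 1"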

2. Configure SSH Key Authentication

On the head node, generate SSH keys:

ssh-keygen -t ed25519 -C "cluster-key"

Copy the key to each compute node:

ssh-copy-id pi@pi-compute01
ssh-copy-id pi@pi-compute02
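
Verify passwordless login works before proceeding; each command should print the node's hostname without asking for a password:

for node in pi-compute01 pi-compute02; do
    ssh "$node" hostname
done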

Shared Storage with NFS

1. Set Up NFS Server (Head Node)

Install NFS server:

sudo apt install -y nfs-kernel-server

Create a shared directory:

sudo mkdir -p /shared
sudo chmod 777 /shared  # For tutorial purposes; use proper permissions in production

Configure exports:

sudo nano /etc/exports

Add the following:

/shared 192.168.1.0/24(rw,sync,no_subtree_check,no_root_squash)
/home   192.168.1.0/24(rw,sync,no_subtree_check)

Apply the configuration:

sudo exportfs -a
sudo systemctl restart nfs-kernel-server
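
Confirm the shares are actually exported; both /shared and /home should appear with the 192.168.1.0/24 options you configured:

sudo exportfs -v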

2. Set Up NFS Clients (Compute Nodes)

On each compute node:

sudo apt install -y nfs-common
sudo mkdir -p /shared

# Add mount entries to fstab
sudo nano /etc/fstab

Add these lines:

pi-head:/shared /shared nfs defaults 0 0
pi-head:/home   /home   nfs defaults 0 0

Note: mounting pi-head:/home over /home hides each compute node's local home directories (including the default pi user's), so confirm you can still log in, e.g. with the shared SSH key, before rebooting.

After modifying fstab, reload the daemon and mount the shares:

sudo systemctl daemon-reload
sudo mount -a
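
Verify the mounts; both /shared and /home should list pi-head as their source:

findmnt -t nfs,nfs4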

User Management

1. Create Cluster User

Slurm and NFS identify users by numeric UID/GID, so the account must exist on every node with the same IDs; the NFS mount shares the home directory, not the account itself. Create the user with an explicit UID so it matches cluster-wide:

# On every node, using the same UID each time (2001 is an arbitrary example)
sudo adduser --uid 2001 hpcuser
sudo usermod -aG sudo hpcuser
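
Check that the UID and GID match everywhere; mismatched IDs cause NFS permission errors and failed Slurm jobs:

for node in pi-head pi-compute01 pi-compute02; do
    ssh "$node" id hpcuser
done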

2. Test User Accessibility on Compute Nodes

To test that the hpcuser is accessible on compute nodes after NFS home is mounted:

# On the head node, create a test file in hpcuser's home directory
sudo -u hpcuser touch /home/hpcuser/test_file

# SSH to a compute node
ssh pi-compute01

# Check if the test file exists and is accessible
ls -la /home/hpcuser/test_file

# Try to switch to the hpcuser account
su - hpcuser

# Verify you can create files as this user
touch ~/test_from_compute
exit

# Return to head node and verify the file is visible
ssh pi-head
ls -la /home/hpcuser/test_from_compute

If all tests pass, your NFS home directory and user setup are working correctly.

Installing Slurm

1. Install Dependencies (All Nodes)

On all nodes:

sudo apt install -y slurmd slurm-client munge libmunge-dev

On the head node also install:

sudo apt install -y slurmctld slurm-wlm-basic-plugins

2. Configure Munge Authentication (All Nodes)

On the head node:

# Create a munge key
sudo /usr/sbin/create-munge-key -r
sudo systemctl enable munge
sudo systemctl start munge

# Copy the key to a location accessible via NFS
sudo cp /etc/munge/munge.key /shared/
sudo chmod 400 /shared/munge.key

On compute nodes:

# Stop munge if running
sudo systemctl stop munge

# Copy the key
sudo cp /shared/munge.key /etc/munge/
sudo chown munge:munge /etc/munge/munge.key
sudo chmod 400 /etc/munge/munge.key

# Restart munge
sudo systemctl enable munge
sudo systemctl start munge

Test munge on all nodes:

munge -n | unmunge
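
A successful result prints a decoded credential ending with STATUS: Success (0). To verify authentication works across nodes, run the same test over SSH from the head node:

munge -n | ssh pi-compute01 unmunge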

Slurm Configuration

1. Create Slurm Configuration File

On the head node, create the configuration:

sudo nano /etc/slurm/slurm.conf

Use this base configuration (adjust as needed):

# slurm.conf
ClusterName=pi-cluster
SlurmctldHost=pi-head

# Authentication/security
AuthType=auth/munge
CryptoType=crypto/munge
MpiDefault=none

# Scheduling
SchedulerType=sched/backfill
SelectType=select/cons_tres
SelectTypeParameters=CR_Core

# Logging and accounting
SlurmctldDebug=info
SlurmdDebug=info
JobAcctGatherType=jobacct_gather/none
SlurmctldLogFile=/var/log/slurm/slurmctld.log
SlurmdLogFile=/var/log/slurm/slurmd.log

# Process management
ProctrackType=proctrack/linuxproc
TaskPlugin=task/none

# Node configurations
NodeName=pi-head CPUs=4 RealMemory=7000 State=UNKNOWN
NodeName=pi-compute01 CPUs=4 RealMemory=7000 State=UNKNOWN
NodeName=pi-compute02 CPUs=4 RealMemory=7000 State=UNKNOWN

# Partition configuration
PartitionName=main Nodes=pi-head,pi-compute01,pi-compute02 Default=YES MaxTime=INFINITE State=UP

Create log directories:

sudo mkdir -p /var/log/slurm
sudo chown slurm:slurm /var/log/slurm

2. Distribute Configuration

Copy to all nodes:

sudo cp /etc/slurm/slurm.conf /shared/

On compute nodes:

sudo cp /shared/slurm.conf /etc/slurm/

3. Start Slurm Services

On the head node (it is also listed in the partition, so it needs slurmd alongside the controller):

sudo systemctl enable slurmctld
sudo systemctl start slurmctld
sudo systemctl enable slurmd
sudo systemctl start slurmd

On compute nodes:

sudo systemctl enable slurmd
sudo systemctl start slurmd

Testing Your Cluster

1. Check Cluster Status

On the head node:

sinfo

You should see all nodes in your cluster listed.
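
The output should look roughly like this (exact columns vary by Slurm version):

PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
main*        up   infinite      3   idle pi-head,pi-compute[01-02]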

2. Run a Test Job

Create a test job script:

nano ~/test_job.sh

Add the following:

#!/bin/bash
#SBATCH --job-name=test
#SBATCH --output=test_%j.out
#SBATCH --error=test_%j.err
#SBATCH --ntasks=4
#SBATCH --time=00:05:00

hostname
sleep 10
echo "This is a test job running on $(hostname)"
srun hostname

Make it executable:

chmod +x ~/test_job.sh

Submit the job:

sbatch ~/test_job.sh

Check job status:

squeue
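
Once the job finishes, it disappears from the queue and its output lands in your home directory (shared over NFS, so visible from every node). You should see the batch node's hostname once, then one line per task from srun:

cat test_*.out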

3. Run a Simple MPI Job

Install MPI:

sudo apt install -y openmpi-bin libopenmpi-dev

Create an MPI test program:

nano ~/mpi_hello.c

Add the following:

#include <mpi.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char** argv) {
    int world_size, world_rank;
    char hostname[256];

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    gethostname(hostname, sizeof(hostname));

    printf("Hello from processor %s, rank %d out of %d processors\n",
           hostname, world_rank, world_size);

    MPI_Finalize();
    return 0;
}

Compile:

mpicc -o mpi_hello mpi_hello.c
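
Before involving Slurm, you can smoke-test the binary locally on the head node; all four ranks should report the head node's hostname:

mpirun -np 4 ./mpi_hello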

Create a submission script:

nano ~/mpi_job.sh

Add the following:

#!/bin/bash
#SBATCH --job-name=mpi_test
#SBATCH --output=mpi_%j.out
#SBATCH --error=mpi_%j.err
#SBATCH --nodes=2
#SBATCH --ntasks=8
#SBATCH --time=00:05:00

module load mpi/openmpi  # If using environment modules

# PMIx plugin names vary by build; list what's available with: srun --mpi=list
srun --mpi=pmix ./mpi_hello

Submit the job:

sbatch ~/mpi_job.sh

Advanced Configuration

1. Setting Up Environment Modules

Install environment modules:

sudo apt install -y environment-modules

Create module files in /shared/modulefiles.
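
Modulefiles are Tcl scripts. Here is a minimal sketch for a hypothetical package installed under /shared/apps/mytool/1.0 (the tool name and paths are placeholders), saved as /shared/modulefiles/mytool/1.0:

#%Module1.0
## Hypothetical example module: mytool 1.0 on the shared NFS export
module-whatis "mytool 1.0 (example package)"
prepend-path PATH /shared/apps/mytool/1.0/bin
prepend-path LD_LIBRARY_PATH /shared/apps/mytool/1.0/lib

Make the directory visible to the module command on each node (or add it permanently via /etc/environment-modules/modulespath):

module use /shared/modulefiles
module load mytool/1.0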

2. Job Accounting

For simple job accounting:

# On head node
sudo apt install -y slurm-wlm-basic-plugins

Edit slurm.conf to enable:

JobAcctGatherType=jobacct_gather/linux
JobCompType=jobcomp/filetxt
JobCompLoc=/var/log/slurm/jobcomp.log

Note: the old accounting_storage/filetxt plugin has been removed from current Slurm releases; full sacct-style accounting requires slurmdbd with a MySQL/MariaDB backend, which is beyond the scope of this tutorial.
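
After editing the configuration, restart the controller and, once a job has run, inspect the completion log:

sudo systemctl restart slurmctld
tail /var/log/slurm/jobcomp.log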

3. Creating Administrative Scripts

Example script to check node status (~/bin/check_nodes.sh):

#!/bin/bash
echo "============== CLUSTER STATUS =============="
sinfo
echo
echo "============== NODE DETAILS ================"
scontrol show nodes
echo
echo "============== QUEUE STATUS ================"
squeue
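
Make it executable and run it from any node:

chmod +x ~/bin/check_nodes.sh
~/bin/check_nodes.sh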

Troubleshooting

Common Issues and Solutions

  1. Nodes showing DOWN state:
   # Check slurmd logs
   sudo systemctl status slurmd
   cat /var/log/slurm/slurmd.log

   # Restart slurmd
   sudo systemctl restart slurmd

   # If the node stays DOWN, return it to service from the head node
   sudo scontrol update NodeName=pi-compute01 State=RESUME
  2. Jobs stuck in PENDING state:
   # Check reason
   scontrol show job <job_id>

   # Check configuration
   scontrol show partition
  3. Authentication failures:
   # Test munge
   munge -n | ssh pi-compute01 unmunge

   # Restart munge on all nodes
   sudo systemctl restart munge
  4. NFS issues:
   # Check mounts
   df -h

   # Remount if needed
   sudo systemctl daemon-reload
   sudo mount -a

This tutorial provides the foundation for your Raspberry Pi 5 HPC cluster. From here, you can expand by adding more nodes, configuring GPU resources if available, implementing more sophisticated job scheduling policies, or adding monitoring tools like Ganglia or Prometheus.

Happy cluster computing!