Tutorial: Setting Up a 3-Node Raspberry Pi 5 SLURM Cluster (Rev3)

Created: 2025-04-13 18:59:37 | Last updated: 2025-04-13 18:59:37 | Status: Public

This tutorial guides you through setting up a small High-Performance Computing (HPC) cluster using three Raspberry Pi 5 devices, SLURM Workload Manager, and a specific network configuration involving both Wi-Fi and a private Ethernet network.

Cluster Configuration:

  • Nodes: 3 x Raspberry Pi 5 (8GB RAM recommended)
  • OS: Raspberry Pi OS Bookworm (64-bit recommended)
  • Boot: From SSDs
  • Cluster User: cuser
  • Networking:
    • pi-head:
      • WLAN (wlan0): Connects to your main router via Wi-Fi, gets 192.168.1.20 via DHCP reservation (Gateway: 192.168.1.1). Provides internet access.
      • Ethernet (eth0): Connects to private switch, static IP 10.0.0.1/24.
    • pi-c01:
      • Ethernet (eth0): Connects to private switch, static IP 10.0.0.2/24. Gateway via pi-head (10.0.0.1).
    • pi-c02:
      • Ethernet (eth0): Connects to private switch, static IP 10.0.0.3/24. Gateway via pi-head (10.0.0.1).
  • SLURM: Basic setup (slurmctld, slurmd, munge).

Prerequisites

  1. Hardware:
    • 3 x Raspberry Pi 5 (8GB RAM)
    • 3 x NVMe SSDs (or SATA SSDs with appropriate adapters) compatible with RPi 5 boot.
    • 3 x Reliable Power Supplies for RPi 5 (5V/5A recommended).
    • 1 x Gigabit Ethernet Switch (unmanaged is fine).
    • 3 x Ethernet Cables.
    • Access to your existing Wi-Fi network and router admin interface (for DHCP reservation).
  2. Software:
    • Raspberry Pi Imager tool.
    • Raspberry Pi OS Bookworm (64-bit recommended) flashed onto each SSD.
  3. Initial Setup:
    • Ensure each Pi boots correctly from its SSD.
    • Complete the initial Raspberry Pi OS setup wizard: create the initial admin user (this is NOT cuser yet), and set locale, keyboard, etc.
    • Enable SSH on each Pi: sudo raspi-config -> Interface Options -> SSH -> Enable.
    • Connect pi-head to your Wi-Fi network.
    • Configure the DHCP reservation on your OpenWRT router to assign 192.168.1.20 to pi-head’s WLAN MAC address. Verify pi-head gets this IP (ip a show wlan0).
    • Physically connect all three Pis to the Gigabit switch using Ethernet cables.

Phase 1: Basic OS Configuration & Hostnames

(Perform these steps on each Pi, adjusting hostnames accordingly. You’ll need SSH access.)

  1. Login: SSH into each Pi using the initial user you created during setup.
  2. Set Hostnames:
    • On the first Pi (intended as head node):
        sudo hostnamectl set-hostname pi-head
*   On the second Pi (compute node 1):
        sudo hostnamectl set-hostname pi-c01
*   On the third Pi (compute node 2):
        sudo hostnamectl set-hostname pi-c02
*   Reboot each Pi (`sudo reboot`) or log out and log back in for the change to take effect in your shell prompt and network identity.
  3. Update System (pi-head only for now):
    • Ensure pi-head has internet via Wi-Fi.
    • SSH into pi-head:
        sudo apt update
        sudo apt full-upgrade -y
        sudo apt install -y vim git build-essential # Essential tools
*   *Note: We will update `pi-c01` and `pi-c02` after setting up network routing.*

Phase 2: Network Configuration (Revised)

Goal: Configure network interfaces on all three Raspberry Pis. pi-head will connect to your home network/internet via Wi-Fi (wlan0) and to the private cluster network via Ethernet (eth0). pi-c01 and pi-c02 will connect only to the private cluster network via Ethernet (eth0) and use pi-head as their gateway to reach the internet. We will use nmcli for interface configuration and nftables for firewall/NAT on pi-head.

Recap of Target Configuration:

  • pi-head:
    • wlan0: 192.168.1.20 (via DHCP reservation), Gateway 192.168.1.1 (Internet Access)
    • eth0: 10.0.0.1/24 (Static, Private Network)
  • pi-c01:
    • eth0: 10.0.0.2/24 (Static, Private Network), Gateway 10.0.0.1
    • wlan0: Disabled
  • pi-c02:
    • eth0: 10.0.0.3/24 (Static, Private Network), Gateway 10.0.0.1
    • wlan0: Disabled

Steps:

  1. Verify pi-head WLAN Connection:
    • SSH into pi-head.
    • Confirm it received the correct IP address from your router’s DHCP reservation and has a default route via your main gateway:
        ip addr show wlan0
        # Look for 'inet 192.168.1.20/XX ...' (XX is the prefix length, often 24)

        ip route show default
        # Should show 'default via 192.168.1.1 dev wlan0 ...'
*   If the IP or route is incorrect, double-check your router's DHCP reservation settings and ensure `pi-head`'s Wi-Fi is connected to the correct network.
  2. Configure pi-head Ethernet (eth0 - Private Network):
    • Still on pi-head.
    • Identify the Ethernet interface name (usually eth0): ip a
    • Add a NetworkManager connection profile for eth0 with the static private IP. We explicitly set no gateway and mark it as never the default route:
        # Replace 'eth0' if your interface name is different
        sudo nmcli connection add type ethernet con-name 'static-eth0' ifname eth0 ip4 10.0.0.1/24

        # Critical: Prevent this interface from ever becoming the default route
        sudo nmcli connection modify 'static-eth0' ipv4.gateway '' # Ensure no gateway is set
        sudo nmcli connection modify 'static-eth0' ipv4.never-default yes # Prevent it from being the default route
        sudo nmcli connection modify 'static-eth0' connection.autoconnect yes # Connect automatically

        # Bring the connection up (may happen automatically)
        sudo nmcli connection up 'static-eth0'
*   **Verify `pi-head` Network State:**
        ip addr show eth0
        # Should show 'inet 10.0.0.1/24 ...'

        ip route show default
        # Should STILL show 'default via 192.168.1.1 dev wlan0 ...'
  3. Configure pi-c01 Ethernet (eth0 - Private Network):
    • SSH into pi-c01. (Use temporary keyboard/monitor or connect eth0 temporarily to main network if needed for first access).
    • Identify the Ethernet interface name (usually eth0): ip a
    • Add the static IP configuration, setting pi-head (10.0.0.1) as the gateway and providing DNS servers:
        # Replace 'eth0' if needed
        sudo nmcli connection add type ethernet con-name 'static-eth0' ifname eth0 ip4 10.0.0.2/24 gw4 10.0.0.1

        # Set DNS servers (e.g., Google DNS and Cloudflare DNS)
        # These requests will be routed via pi-head
        sudo nmcli connection modify 'static-eth0' ipv4.dns "8.8.8.8 1.1.1.1"
        sudo nmcli connection modify 'static-eth0' ipv4.ignore-auto-dns yes # Use only the specified DNS
        sudo nmcli connection modify 'static-eth0' connection.autoconnect yes

        # Bring the connection up
        sudo nmcli connection up 'static-eth0'
*   **Verify `pi-c01` Network State:**
        ip addr show eth0
        # Should show 'inet 10.0.0.2/24 ...'

        ip route show default
        # Should show 'default via 10.0.0.1 dev eth0 ...'

        cat /etc/resolv.conf
        # Should show 'nameserver 8.8.8.8' and 'nameserver 1.1.1.1'
  4. Configure pi-c02 Ethernet (eth0 - Private Network):
    • SSH into pi-c02.
    • Identify the Ethernet interface name (usually eth0): ip a
    • Add the static IP configuration, similar to pi-c01:
        # Replace 'eth0' if needed
        sudo nmcli connection add type ethernet con-name 'static-eth0' ifname eth0 ip4 10.0.0.3/24 gw4 10.0.0.1

        # Set DNS servers
        sudo nmcli connection modify 'static-eth0' ipv4.dns "8.8.8.8 1.1.1.1"
        sudo nmcli connection modify 'static-eth0' ipv4.ignore-auto-dns yes
        sudo nmcli connection modify 'static-eth0' connection.autoconnect yes

        # Bring the connection up
        sudo nmcli connection up 'static-eth0'
*   **Verify `pi-c02` Network State:**
        ip addr show eth0
        # Should show 'inet 10.0.0.3/24 ...'

        ip route show default
        # Should show 'default via 10.0.0.1 dev eth0 ...'

        cat /etc/resolv.conf
        # Should show nameservers 8.8.8.8 and 1.1.1.1
  5. Enable IP Forwarding and Configure nftables NAT/Firewall on pi-head:
    • SSH back into pi-head.
    • Enable kernel IP forwarding:
        echo 'net.ipv4.ip_forward=1' | sudo tee /etc/sysctl.d/99-ip_forward.conf
        sudo sysctl -p /etc/sysctl.d/99-ip_forward.conf # Apply immediately
        sudo sysctl net.ipv4.ip_forward # Verify output is '= 1'
*   **Install `nftables` (if not already present):**
        sudo apt update
        sudo apt install -y nftables
*   **Create `nftables` Configuration:**
    *   Backup the default config: `sudo cp /etc/nftables.conf /etc/nftables.conf.bak`
    *   Edit the configuration file: `sudo vim /etc/nftables.conf`
    *   **Replace the entire content** with this ruleset (adjust `wlan0`/`eth0` if needed):
            #!/usr/sbin/nft -f

            # Flush the entire previous ruleset
            flush ruleset

            # Table for IPv4 NAT
            table ip nat {
                chain postrouting {
                    type nat hook postrouting priority 100; policy accept;
                    # Masquerade traffic from private network (eth0) going OUT via wlan0
                    oifname "wlan0" ip saddr 10.0.0.0/24 masquerade comment "NAT cluster traffic to WAN"
                }
                # Optional: Add prerouting rules here if needed for port forwarding INTO the cluster
            }

            # Table for IPv4/IPv6 Filtering
            table inet filter {
                chain input {
                    type filter hook input priority 0; policy accept;
                    # Basic stateful firewall for input traffic to pi-head
                    ct state established,related accept
                    # Allow loopback traffic
                    iifname "lo" accept
                    # Allow SSH (port 22) - Recommended! Add source IP ranges if possible
                    tcp dport 22 accept
                    # Allow ICMP (ping)
                    icmp type echo-request accept
                    # Allow traffic from cluster nodes (optional, if needed for services hosted on pi-head)
                    iifname "eth0" ip saddr 10.0.0.0/24 accept comment "Allow traffic from cluster nodes"

                    # Uncomment below to drop other traffic instead of accept-all policy
                    # drop
                }

                chain forward {
                    type filter hook forward priority 0; policy drop; # Default: Drop forwarded traffic

                    # Allow established/related connections coming back IN from WAN (wlan0) to LAN (eth0)
                    iifname "wlan0" oifname "eth0" ct state related,established accept comment "Allow established WAN to LAN"

                    # Allow NEW and established connections going OUT from LAN (eth0) to WAN (wlan0)
                    iifname "eth0" oifname "wlan0" ip saddr 10.0.0.0/24 accept comment "Allow LAN to WAN"
                }

                chain output {
                    type filter hook output priority 0; policy accept;
                    # Basic stateful firewall for output traffic from pi-head
                    ct state established,related accept
                    # Allow loopback traffic
                    oifname "lo" accept

                    # Uncomment below to drop other traffic instead of accept-all policy
                    # drop
                }
            }
*   **Apply and Persist `nftables` Rules:**
        sudo nft -f /etc/nftables.conf # Apply the ruleset, check for errors
        sudo systemctl enable nftables.service # Make rules persistent on boot
        sudo systemctl restart nftables.service # Restart service to load rules definitively
        sudo systemctl status nftables.service # Check service status
        sudo nft list ruleset # Review the active ruleset
  6. Troubleshoot SSH Slowness on pi-head (Potential Fix):
    • Slow SSH logins are often caused by the SSH server trying to perform a reverse DNS lookup on the connecting client’s IP address, which can time out if not configured correctly.
    • On pi-head:
        # Edit the SSH server configuration file
        sudo vim /etc/ssh/sshd_config
*   Find the `UseDNS` line (it may be commented out or absent). Uncomment or add it, and set it to `no`:
        UseDNS no
*   Save the file and restart the SSH service:
        sudo systemctl restart ssh # The OpenSSH service is named 'ssh' on Raspberry Pi OS/Debian
*   Try SSHing into `pi-head` again from your workstation. If the login is now significantly faster, this was likely the cause.
  7. Test Basic Network Connectivity:
    • From pi-head:
        ping -c 2 10.0.0.2 # Ping pi-c01
        ping -c 2 10.0.0.3 # Ping pi-c02
*   From `pi-c01`:
        ping -c 2 10.0.0.1 # Ping pi-head
        ping -c 2 10.0.0.3 # Ping pi-c02
*   From `pi-c02`:
        ping -c 2 10.0.0.1 # Ping pi-head
        ping -c 2 10.0.0.2 # Ping pi-c01
*   All these pings over the `10.0.0.x` network should work.
  8. Verify Compute Node Internet Access (Initial Test):
    • From pi-c01:
        ping -c 3 8.8.8.8      # Test internet IP reachability
        ping -c 3 google.com   # Test DNS resolution + internet reachability
*   From `pi-c02`:
        ping -c 3 1.1.1.1      # Test internet IP reachability (different target)
        ping -c 3 cloudflare.com # Test DNS resolution + internet reachability
*   These tests should now succeed, routing through `pi-head`. If not, re-check `nftables` rules (`sudo nft list ruleset`), IP forwarding (`sudo sysctl net.ipv4.ip_forward`), and routing tables (`ip route`) on all nodes.
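*   If either test fails, `ip route get` on the compute node is a quick way to see which gateway and interface would be used for a given destination (a small diagnostic, not part of the required setup):
        # On pi-c01 or pi-c02
        ip route get 8.8.8.8
        # Expected output similar to: '8.8.8.8 via 10.0.0.1 dev eth0 src 10.0.0.2 ...'
        # A different gateway or device points to a problem in the nmcli profile from the earlier steps.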
  9. Disable Wi-Fi on Compute Nodes (pi-c01, pi-c02):
    • This confirms they rely solely on eth0 for all traffic.
    • On pi-c01:
        sudo nmcli radio wifi off
        nmcli radio wifi # Verify output shows 'disabled'
        ip a show wlan0 # Verify interface is DOWN or has no IP
*   **On `pi-c02`:**
        sudo nmcli radio wifi off
        nmcli radio wifi # Verify output shows 'disabled'
        ip a show wlan0 # Verify interface is DOWN or has no IP
  10. Final Connectivity Test (Compute Nodes via eth0 only):
    • Repeat the internet connectivity tests from Step 8 on both pi-c01 and pi-c02:
        # On pi-c01
        ping -c 3 8.8.8.8
        ping -c 3 google.com

        # On pi-c02
        ping -c 3 1.1.1.1
        ping -c 3 cloudflare.com
*   If these tests still succeed with Wi-Fi disabled, your network routing is configured correctly.
  11. Update Compute Nodes:
    • Now that pi-c01 and pi-c02 have verified internet access via pi-head, ensure they are fully updated:
        # On pi-c01 AND pi-c02
        sudo apt update
        sudo apt full-upgrade -y
        # Install common tools if you haven't already
        sudo apt install -y vim git build-essential

Phase 2 Completion: At this point, your network should be fully configured according to the requirements. pi-head acts as the gateway, and pi-c01/pi-c02 rely solely on their Ethernet connection to the private network for all communication, including internet access routed through pi-head. You can now proceed to Phase 3: Common Cluster Environment Setup.


Phase 3: Common Cluster Environment Setup

(Perform steps on all nodes unless specified)

  1. Configure Hostname Resolution (/etc/hosts):
    • Edit the hosts file on all three nodes: sudo vim /etc/hosts
    • Ensure the following lines exist (add them if missing, below the 127.0.0.1 localhost line):
        127.0.1.1       <current_hostname> # Usually already present; keep the node's own hostname here

        # Cluster Nodes
        10.0.0.1    pi-head
        10.0.0.2    pi-c01
        10.0.0.3    pi-c02
*   **Test:** From any node, ping the others by hostname (e.g., `ping -c 1 pi-c01` from `pi-head`).
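*   A short loop can check all three names at once from whichever node you are on (a minimal convenience sketch):
        for h in pi-head pi-c01 pi-c02; do
            ping -c 1 "$h" > /dev/null && echo "$h: OK" || echo "$h: FAILED"
        done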
  2. Create Common Cluster User (cuser):
    • Crucially, cuser must have the same User ID (UID) and Group ID (GID) on all nodes.
    • On pi-head first:
        sudo adduser cuser
        # Follow prompts to set password etc.
        # Note the UID and GID displayed (e.g., uid=1001(cuser) gid=1001(cuser) groups=...)
        # Optional: Add cuser to the sudo group if needed for administration tasks
        # sudo usermod -aG sudo cuser
*   **On `pi-c01` and `pi-c02`:**
    *   Get the UID and GID from `pi-head`. Use `id cuser` on `pi-head`. Let's assume it was `1001` for both UID and GID. **Replace `1001` below if yours is different.**
        # Create the group first with the specific GID
        sudo groupadd -g 1001 cuser
        # Create the user with the specific UID and GID
        sudo useradd -u 1001 -g 1001 -m -s /bin/bash cuser
        # Set the password for the new user
        sudo passwd cuser
        # Optional: Add to sudo group (use the same groups as on pi-head if needed)
        # sudo usermod -aG sudo cuser
*   **Verify:** Run `id cuser` on **all three** nodes. Ensure the UID and GID match exactly.
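*   From `pi-head`, a short loop can compare the output on all three nodes in one go (run as your administrative user; expect password prompts, since passwordless SSH is not set up yet, and adjust the username if yours differs from `piadmin`):
        for h in pi-head pi-c01 pi-c02; do
            echo -n "$h: "; ssh piadmin@"$h" id cuser
        done
        # All three lines must report identical uid= and gid= values.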
  3. Setup Passwordless SSH for cuser:
    • Log in as cuser on pi-head. You can use su - cuser if logged in as another user, or SSH directly: ssh cuser@pi-head.
    • Generate SSH key pair (run as cuser):
        # Accept default file location (~/.ssh/id_rsa), press Enter for empty passphrase
        ssh-keygen -t rsa -b 4096
*   **Copy the public key to all nodes (including `pi-head` itself):**
        # Run as cuser from pi-head
        ssh-copy-id cuser@pi-head
        ssh-copy-id cuser@pi-c01
        ssh-copy-id cuser@pi-c02
        # Enter the password for 'cuser' when prompted for each node
*   **Test:** Still as `cuser` on `pi-head`, try SSHing to each node without a password:
        ssh pi-head date
        ssh pi-c01 date
        ssh pi-c02 date
        # The first time connecting to each might ask "Are you sure you want to continue connecting (yes/no/[fingerprint])?". Type 'yes'.
        # If it prompts for a password after the first connection, the key setup failed. Check permissions in ~/.ssh directories.
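*   If you are still prompted for a password, the usual culprit is overly permissive file modes, which sshd refuses to trust. The expected permissions (set as `cuser` on each node) are:
        chmod 700 ~/.ssh
        chmod 600 ~/.ssh/authorized_keys
        # On pi-head only (where the key pair lives):
        chmod 600 ~/.ssh/id_rsa
        chmod 644 ~/.ssh/id_rsa.pub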
  4. Install and Configure NFS (Shared Filesystem):
    • We’ll share /clusterfs from pi-head with all nodes. Do this as your administrative user, not cuser.
    • On pi-head (NFS Server):
        sudo apt update
        sudo apt install -y nfs-kernel-server
        sudo mkdir -p /clusterfs
        # Option 1: Allow anyone to write (simple for cluster user)
        sudo chown nobody:nogroup /clusterfs
        sudo chmod 777 /clusterfs
        # Option 2: Restrict to cuser (better security, requires consistent UID/GID)
        # sudo chown cuser:cuser /clusterfs
        # sudo chmod 770 /clusterfs # Or 750 if group members only need read
        # Edit the NFS exports file
        sudo nano /etc/exports
        # Add this line to allow access from the private 10.0.0.x network:
        # Use 'no_root_squash' carefully if you need root access over NFS
        /clusterfs    10.0.0.0/24(rw,sync,no_subtree_check)
        # Activate the exports
        sudo exportfs -ra
        # Restart and enable the NFS server service
        sudo systemctl restart nfs-kernel-server
        sudo systemctl enable nfs-kernel-server
*   **On `pi-c01` and `pi-c02` (NFS Clients):**
        sudo apt update
        sudo apt install -y nfs-common
        sudo mkdir -p /clusterfs
        # Add the mount to /etc/fstab for automatic mounting on boot
        sudo nano /etc/fstab
        # Add this line at the end:
        pi-head:/clusterfs    /clusterfs   nfs    defaults,auto,nofail    0    0
        # Mount all filesystems defined in fstab (including the new one)
        sudo mount -a
        # Verify the mount was successful
        df -h | grep /clusterfs
        # Check mount options (optional)
        mount | grep /clusterfs
    *   From `pi-head` as `cuser`: `touch /clusterfs/test_head.txt`
    *   From `pi-c01` as `cuser`: `ls /clusterfs` (should see `test_head.txt`)
    *   From `pi-c02` as `cuser`: `touch /clusterfs/test_c02.txt`
    *   From `pi-head` as `cuser`: `ls /clusterfs` (should see both files)
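*   Because NFS matches users by numeric UID/GID, it is worth confirming the ownership numbers line up everywhere (a quick sanity check):
        # On any node
        ls -ln /clusterfs
        # The test files should show cuser's numeric UID/GID (e.g. 1001 1001);
        # 'ls -l /clusterfs' should resolve them back to 'cuser cuser' on every node.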
  5. Install and Configure NTP (Time Synchronization): Accurate time is essential for SLURM.
    • Install chrony on all nodes:
        sudo apt update
        sudo apt install -y chrony
*   Ensure it's enabled and running on **all nodes**:
        sudo systemctl enable --now chrony
*   `chrony` will automatically use internet time sources. Since all nodes now have internet (directly or via `pi-head`), this should work.
*   **Verify sync status** (might take a minute or two after starting):
        # Run on all nodes
        chronyc sources
        # Look for lines starting with '^*' (currently synced source) or '^+' (acceptable combined source).
        timedatectl status | grep "NTP service"
        # Should show 'active'.
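*   For a closer look at how far off the clock currently is, `chronyc tracking` reports the measured offset from the selected time source:
        chronyc tracking
        # 'System time' should show an offset of a few milliseconds or less once synced.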

Phase 4: Install and Configure SLURM & Munge (Revised for RPi)

(Perform steps on nodes as indicated. Ensure you are logged in as your administrative user with sudo privileges; these examples call that user piadmin, the initial user you created during setup.)

  1. Install Munge (Authentication Service): Munge provides secure authentication between SLURM daemons. Install on all nodes.
    # Run on pi-head, pi-c01, and pi-c02
    sudo apt update
    sudo apt install -y munge libmunge-dev libmunge2
  2. Create Munge Key: A shared secret key must be generated on one node.
    • On pi-head ONLY:
        # STILL ON PI-HEAD, logged in as piadmin
        sudo systemctl stop munge # Stop service if running
        # Create the key
        sudo dd if=/dev/urandom of=/etc/munge/munge.key bs=1 count=1024
        # Set correct ownership and permissions
        sudo chown munge:munge /etc/munge/munge.key
        sudo chmod 400 /etc/munge/munge.key
        echo "Munge key created on pi-head."
  3. Securely Distribute Munge Key: Copy the key from pi-head to the compute nodes via a temporary location, then move it into place with sudo on each node.
    • Run these command blocks from pi-head, logged in as piadmin:

      • For pi-c01:
            # ON PI-HEAD, as 'piadmin'
            echo "Copying munge.key to pi-c01:/tmp/..."
            sudo scp /etc/munge/munge.key piadmin@pi-c01:/tmp/munge.key.tmp
            # Enter piadmin's password for pi-c01 if prompted by scp

            echo "Connecting to pi-c01 to move munge.key and set permissions..."
            ssh -t piadmin@pi-c01 << EOF
            sudo systemctl stop munge # Ensure service is stopped before replacing key
            sudo mv /tmp/munge.key.tmp /etc/munge/munge.key
            sudo chown munge:munge /etc/munge/munge.key
            sudo chmod 400 /etc/munge/munge.key
            echo "--- Verification on pi-c01 ---"
            sudo ls -l /etc/munge/munge.key # Needs sudo to view details
            echo "--- Done on pi-c01 ---"
            EOF
            # You will likely be prompted for piadmin's password for pi-c01 here by sudo
    *   **For `pi-c02`:**
            # ON PI-HEAD, as 'piadmin'
            echo "Copying munge.key to pi-c02:/tmp/..."
            sudo scp /etc/munge/munge.key piadmin@pi-c02:/tmp/munge.key.tmp
            # Enter piadmin's password for pi-c02 if prompted by scp

            echo "Connecting to pi-c02 to move munge.key and set permissions..."
            ssh -t piadmin@pi-c02 << EOF
            sudo systemctl stop munge # Ensure service is stopped before replacing key
            sudo mv /tmp/munge.key.tmp /etc/munge/munge.key
            sudo chown munge:munge /etc/munge/munge.key
            sudo chmod 400 /etc/munge/munge.key
            echo "--- Verification on pi-c02 ---"
            sudo ls -l /etc/munge/munge.key # Needs sudo to view details
            echo "--- Done on pi-c02 ---"
            EOF
            # You will likely be prompted for piadmin's password for pi-c02 here by sudo
  4. Start and Enable Munge Service: On all nodes:
    # Run on pi-head, pi-c01, and pi-c02
    sudo systemctl start munge
    sudo systemctl enable munge
    # Verify status
    sudo systemctl status munge
    # Check the status is active (running) on all three nodes.
  5. Test Munge Communication:
    • From pi-head (as piadmin or cuser):
        # Test local encoding/decoding
        munge -n | unmunge
        # Test head -> c01 (may need passwordless SSH for cuser setup first, or run as piadmin)
        munge -n | ssh pi-c01 unmunge
        # Test head -> c02
        munge -n | ssh pi-c02 unmunge
        # Test c01 -> head (round trip)
        ssh pi-c01 munge -n | unmunge
*   All tests should return a `STATUS: Success (...)` line. If not, double-check `munge.key` consistency (e.g., file size), permissions (`sudo ls -l /etc/munge/munge.key` on all nodes), and `munged` service status (`sudo systemctl status munge` on all nodes). Also check `/var/log/munge/munged.log`.
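*   A direct way to confirm the key is identical everywhere is to compare its checksum on each node (root access is required because of the 400 permissions):
        # Run on pi-head, pi-c01, and pi-c02
        sudo sha256sum /etc/munge/munge.key
        # The hash must be byte-for-byte identical on all three nodes.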
  6. Install SLURM: Install the SLURM workload manager packages on all nodes.
    # Run on pi-head, pi-c01, and pi-c02
    sudo apt update
    sudo apt install -y slurm-wlm slurm-wlm-doc # slurm-wlm pulls in slurmd, slurmctld etc.
  7. Configure SLURM (slurm.conf):
    • Create the configuration file on pi-head first.
    • Edit the main config file using nano: sudo nano /etc/slurm/slurm.conf
    • Replace the entire content with the following.
      • Adjust RealMemory: Use free -m to see total memory in MiB. Leave some (~200-300MB) for the OS. For an 8GB Pi (approx 7850MB usable), 7600 is a reasonable starting point.
      • CPUs: RPi 5 has 4 cores.
        # /etc/slurm/slurm.conf
        # Basic SLURM configuration for pi-cluster
        ClusterName=pi-cluster
        SlurmctldHost=pi-head #(Or use IP 10.0.0.1)
        # SlurmctldHost=pi-head(10.0.0.1) # Optional: Specify both
        MpiDefault=none
        ProctrackType=proctrack/cgroup
        ReturnToService=1
        SlurmctldPidFile=/run/slurmctld.pid
        SlurmdPidFile=/run/slurmd.pid
        SlurmctldPort=6817
        SlurmdPort=6818
        AuthType=auth/munge
        StateSaveLocation=/var/spool/slurmctld
        SlurmdSpoolDir=/var/spool/slurmd
        SwitchType=switch/none
        TaskPlugin=task/cgroup
        # LOGGING
        SlurmctldLogFile=/var/log/slurm/slurmctld.log
        SlurmdLogFile=/var/log/slurm/slurmd.log
        JobCompType=jobcomp/none # No job completion logging for basic setup
        # TIMERS
        SlurmctldTimeout=120
        SlurmdTimeout=300
        InactiveLimit=0
        MinJobAge=300
        KillWait=30
        Waittime=0
        # SCHEDULING
        SchedulerType=sched/backfill
        SelectType=select/cons_tres # Use cons_tres for memory tracking
        SelectTypeParameters=CR_Core_Memory # Track Cores and Memory
        # NODES - Adjust RealMemory based on your Pi 5 8GB (~7600 is conservative)
        NodeName=pi-head NodeAddr=10.0.0.1 CPUs=4 RealMemory=7600 State=UNKNOWN
        NodeName=pi-c01 NodeAddr=10.0.0.2 CPUs=4 RealMemory=7600 State=UNKNOWN
        NodeName=pi-c02 NodeAddr=10.0.0.3 CPUs=4 RealMemory=7600 State=UNKNOWN
        # PARTITION
        PartitionName=rpi_part Nodes=pi-head,pi-c01,pi-c02 Default=YES MaxTime=INFINITE State=UP Oversubscribe=NO
    *   Save the file in `nano` (Ctrl+O, Enter) and exit (Ctrl+X).
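*   To sanity-check the `CPUs` and `RealMemory` values, `slurmd -C` prints the node configuration SLURM detects on the local hardware (run it on pi-head for now; exact numbers will differ):
        slurmd -C
        # Prints a line similar to:
        # NodeName=pi-head CPUs=4 Boards=1 SocketsPerBoard=1 CoresPerSocket=4 ThreadsPerCore=1 RealMemory=7945
        # RealMemory in slurm.conf must not exceed the value reported here.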
*   Create the SLURM log and spool directories **on all nodes**:
        # Run on pi-head, pi-c01, and pi-c02
        sudo mkdir -p /var/log/slurm /var/spool/slurmctld /var/spool/slurmd
        # Verify slurm user/group exists (created by package install)
        id slurm
        # Set ownership to the 'slurm' user/group
        sudo chown slurm:slurm /var/log/slurm /var/spool/slurmctld /var/spool/slurmd
        sudo chmod 755 /var/log/slurm /var/spool/slurmctld /var/spool/slurmd
*   Copy the `slurm.conf` file from `pi-head` to the compute nodes using the two-step method:
    *   **Run these command blocks *from* `pi-head`, logged in as `piadmin`:**

        *   **For `pi-c01`:**
                # ON PI-HEAD, as 'piadmin'
                echo "Copying slurm.conf to pi-c01:/tmp/..."
                sudo scp /etc/slurm/slurm.conf piadmin@pi-c01:/tmp/slurm.conf
                # Enter piadmin's password for pi-c01 if prompted by scp

                echo "Connecting to pi-c01 to move slurm.conf and set permissions..."
                ssh -t piadmin@pi-c01 << EOF
                sudo mv /tmp/slurm.conf /etc/slurm/slurm.conf
                sudo chown root:root /etc/slurm/slurm.conf # slurm.conf owned by root
                sudo chmod 644 /etc/slurm/slurm.conf     # Read access for all
                echo "--- Verification on pi-c01 ---"
                ls -l /etc/slurm/slurm.conf
                echo "--- Done on pi-c01 ---"
                EOF
                # You will likely be prompted for piadmin's password for pi-c01 here by sudo
        *   **For `pi-c02`:**
                # ON PI-HEAD, as 'piadmin'
                echo "Copying slurm.conf to pi-c02:/tmp/..."
                sudo scp /etc/slurm/slurm.conf piadmin@pi-c02:/tmp/slurm.conf
                # Enter piadmin's password for pi-c02 if prompted by scp

                echo "Connecting to pi-c02 to move slurm.conf and set permissions..."
                ssh -t piadmin@pi-c02 << EOF
                sudo mv /tmp/slurm.conf /etc/slurm/slurm.conf
                sudo chown root:root /etc/slurm/slurm.conf # slurm.conf owned by root
                sudo chmod 644 /etc/slurm/slurm.conf     # Read access for all
                echo "--- Verification on pi-c02 ---"
                ls -l /etc/slurm/slurm.conf
                echo "--- Done on pi-c02 ---"
                EOF
                # You will likely be prompted for piadmin's password for pi-c02 here by sudo
  8. Configure Cgroup Plugin (cgroup.conf): Needed to enforce the resource constraints configured above (ProctrackType=proctrack/cgroup, TaskPlugin=task/cgroup, SelectType=select/cons_tres).
    • Create /etc/slurm/cgroup.conf on pi-head first: sudo nano /etc/slurm/cgroup.conf
    • Add the following content:
        # /etc/slurm/cgroup.conf
        CgroupAutomount=yes
        CgroupReleaseAgentDir="/etc/slurm/cgroup"
        ConstrainCores=yes
        ConstrainDevices=yes
        ConstrainRAMSpace=yes
        # If using systemd (default on RPi OS Bookworm), TaskAffinity should generally be no
        TaskAffinity=no
    *   Save the file (Ctrl+O, Enter) and exit (Ctrl+X).
*   Create the `CgroupReleaseAgentDir` **on all nodes**:
        # Run on pi-head, pi-c01, and pi-c02
        sudo mkdir -p /etc/slurm/cgroup
        # Ownership might vary, slurm:slurm or root:root can work. Start with slurm:slurm.
        sudo chown slurm:slurm /etc/slurm/cgroup
*   Copy `cgroup.conf` from `pi-head` to compute nodes using the two-step method:
    *   **Run these command blocks *from* `pi-head`, logged in as `piadmin`:**

        *   **For `pi-c01`:**
                # ON PI-HEAD, as 'piadmin'
                echo "Copying cgroup.conf to pi-c01:/tmp/..."
                sudo scp /etc/slurm/cgroup.conf piadmin@pi-c01:/tmp/cgroup.conf
                # Enter piadmin's password for pi-c01 if prompted by scp

                echo "Connecting to pi-c01 to move cgroup.conf and set permissions..."
                ssh -t piadmin@pi-c01 << EOF
                sudo mv /tmp/cgroup.conf /etc/slurm/cgroup.conf
                sudo chown root:root /etc/slurm/cgroup.conf # cgroup.conf owned by root
                sudo chmod 644 /etc/slurm/cgroup.conf     # Read access for all
                echo "--- Verification on pi-c01 ---"
                ls -l /etc/slurm/cgroup.conf
                echo "--- Done on pi-c01 ---"
                EOF
                # You will likely be prompted for piadmin's password for pi-c01 here by sudo
        *   **For `pi-c02`:**
                # ON PI-HEAD, as 'piadmin'
                echo "Copying cgroup.conf to pi-c02:/tmp/..."
                sudo scp /etc/slurm/cgroup.conf piadmin@pi-c02:/tmp/cgroup.conf
                # Enter piadmin's password for pi-c02 if prompted by scp

                echo "Connecting to pi-c02 to move cgroup.conf and set permissions..."
                ssh -t piadmin@pi-c02 << EOF
                sudo mv /tmp/cgroup.conf /etc/slurm/cgroup.conf
                sudo chown root:root /etc/slurm/cgroup.conf # cgroup.conf owned by root
                sudo chmod 644 /etc/slurm/cgroup.conf     # Read access for all
                echo "--- Verification on pi-c02 ---"
                ls -l /etc/slurm/cgroup.conf
                echo "--- Done on pi-c02 ---"
                EOF
                # You will likely be prompted for piadmin's password for pi-c02 here by sudo
  9. Start SLURM Services:
    • On pi-head (Controller):
        # ON PI-HEAD, as piadmin
        sudo systemctl enable slurmctld.service
        sudo systemctl start slurmctld.service
        # Check status immediately
        sudo systemctl status slurmctld.service
        # Check logs if status isn't active (running)
        journalctl -u slurmctld.service | tail -n 30
        tail -n 30 /var/log/slurm/slurmctld.log
*   **On ALL nodes (Compute Daemons - including `pi-head`):**
        # Run ON pi-head, pi-c01, AND pi-c02, as piadmin
        sudo systemctl enable slurmd.service
        sudo systemctl start slurmd.service
        # Check status on each node
        sudo systemctl status slurmd.service
        # Check logs on each node if status isn't active (running)
        tail -n 30 /var/log/slurm/slurmd.log
  10. Verify SLURM Cluster Status:
    • Wait ~10-20 seconds for nodes to register. Run on pi-head (as piadmin or cuser):
        sinfo
        # Expected output (might take a moment to show 'idle'):
        # PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
        # rpi_part*    up   infinite      3   idle pi-head,pi-c0[1-2]
        # (State might be 'unk', 'down', 'drain' initially, or 'mix' if nodes are registering/alloc)

        scontrol show node
        # Check details for each node. Look for 'State=IDLE'. If 'State=DOWN' or 'State=DRAINED(Reason=...)', check logs:
        # - /var/log/slurm/slurmctld.log on pi-head (essential for controller decisions)
        # - /var/log/slurm/slurmd.log on the affected node(s) (essential for why a node isn't running)

        # If nodes are down/drained due to initial errors that are now fixed:
        # sudo scontrol update nodename=pi-head,pi-c01,pi-c02 state=resume
        # Then check 'sinfo' again after a few seconds.
*   **Common Issues & Log Checks:**
    *   **Time Sync:** `slurmctld` log might show "Time skew detected". Ensure `chrony` is running and synced on all nodes (`chronyc sources`).
    *   **Munge Auth:** `slurmctld` or `slurmd` logs might show "Invalid credential" or "Munge authentication failed". Re-verify Munge setup (key, permissions, service status, `munge -n | ssh ... unmunge` tests).
    *   **Network/Ports:** `slurmctld` log might show "Unable to contact slurmd" or `slurmd` log might show "Unable to contact slurmctld". Check firewall rules (if any beyond the `nftables` NAT setup) and ensure nodes can ping each other by hostname and IP (`10.0.0.x`). Ensure `slurm.conf` has correct `SlurmctldHost`, `NodeName`, `NodeAddr`.
    *   **Config Errors:** Logs might report invalid parameters in `slurm.conf` or `cgroup.conf`.
    *   **Spool/Log Dirs:** Logs might show permission errors writing to `/var/spool/slurm*` or `/var/log/slurm`. Verify ownership (`slurm:slurm`) and permissions (`755`).
    *   **Cgroup Issues:** `slurmd` logs might show errors related to cgroups. Ensure `TaskPlugin=task/cgroup` and `ProctrackType=proctrack/cgroup` are set, `cgroup.conf` is present and correct, and the `CgroupReleaseAgentDir` exists.
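    *   **Controller Reachability:** From any node, `scontrol ping` asks whether the controller defined in `slurm.conf` is up, which helps separate network/config problems from daemon problems:
            # Run on each node
            scontrol ping
            # Should report that the primary slurmctld at pi-head is UP.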

Phase 5: Testing the SLURM Cluster

(Run these commands as cuser on pi-head)

  1. Login as cuser:
    su - cuser
    # Or: ssh cuser@pi-head
    cd /clusterfs # Work in the shared filesystem if desired
  2. Run a Simple Command Interactively:
    srun hostname
    # Runs 'hostname' on one available node in the default partition.
  3. Run Command on Specific Number of Nodes:
    # Run hostname on 2 different nodes, 1 task per node
    srun --nodes=2 --ntasks-per-node=1 hostname | sort
    # Should show two different hostnames (e.g., pi-c01, pi-c02 or pi-head, pi-c01)
  4. Submit a Simple Batch Job:
    • Create a job script file, e.g., /clusterfs/cuser/hello.sh (ensure /clusterfs/cuser exists and is writable by cuser):
        #!/bin/bash
        #SBATCH --job-name=hello       # Job name
        #SBATCH --output=hello_job_%j.out # Standard output file (%j = job ID)
        #SBATCH --error=hello_job_%j.err  # Standard error file
        #SBATCH --nodes=3                 # Request all 3 nodes
        #SBATCH --ntasks-per-node=2       # Request 2 tasks (processes) per node (total 6)
        #SBATCH --cpus-per-task=1         # Request 1 CPU core per task
        #SBATCH --partition=rpi_part      # Specify partition (optional if default)
        #SBATCH --time=00:05:00           # Time limit (5 minutes)

        echo "Job running on nodes:"
        srun hostname | sort # Use srun within sbatch to launch parallel tasks

        echo "Tasks started at: $(date)"
        sleep 20 # Simulate some work
        echo "Tasks finished at: $(date)"
*   Make the script executable: `chmod +x /clusterfs/cuser/hello.sh`
*   Submit the job from the directory containing the script:
        sbatch hello.sh
        # Should print: Submitted batch job <JOB_ID>
*   Check the queue:
        squeue
        # Shows running or pending jobs
        watch squeue # Monitor queue updates
        sinfo
        # Should show nodes in 'alloc' or 'mix' state.
*   Once the job finishes (disappears from `squeue`), check the output files (`hello_job_<JOB_ID>.out` and `.err`) in the submission directory:
        cat hello_job_<JOB_ID>.out
        # Should list six hostnames in total (3 nodes x 2 tasks per node):
        # pi-head, pi-c01, and pi-c02, each appearing twice.
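*   While a job is still pending or running, `scontrol show job` gives a detailed view of what was requested and allocated:
        scontrol show job <JOB_ID>
        # Shows the job state plus the allocated nodes, CPUs, and memory.
        # With this basic configuration (no accounting database), completed jobs only remain
        # visible to scontrol/squeue for MinJobAge=300 seconds.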

Congratulations!

You should now have a functional 3-node Raspberry Pi 5 SLURM cluster. The compute nodes (pi-c01, pi-c02) use the head node (pi-head) as a gateway for internet access, while all cluster communication happens over the private 10.0.0.x network.

Next Steps & Considerations

  • Install MPI: Install OpenMPI or MPICH (sudo apt install -y openmpi-bin libopenmpi-dev on all nodes) to run parallel MPI applications. Update SLURM’s MpiDefault (e.g., to pmix) or otherwise configure MPI integration as needed.
  • Shared Software Stack: Install compilers, libraries, and applications needed for your HPC tasks onto the shared NFS filesystem (/clusterfs) so they are accessible from all nodes without needing installation everywhere. Modules systems like Lmod can help manage this.
  • Monitoring: Set up monitoring tools like htop, glances, or more comprehensive systems like Prometheus + Grafana or Ganglia to observe cluster load and resource usage.
  • SLURM Tuning: Explore more advanced slurm.conf options: resource limits (memory, cores per job/user), Quality of Service (QoS), fair-share scheduling, and job arrays (a minimal job-array example is sketched after this list).
  • SLURM Accounting: For tracking resource usage over time, set up the SLURM accounting database (slurmdbd) which requires installing and configuring a database (like MariaDB/MySQL).
  • Security: Review your nftables rules, harden SSH (/etc/ssh/sshd_config), and consider user permissions carefully.
  • Backup: Back up your slurm.conf, munge.key, and important data on /clusterfs.
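
As a small taste of the job-array feature mentioned under SLURM Tuning, the sketch below submits six independent array tasks; the script name, time limit, and output pattern are illustrative only:

        #!/bin/bash
        #SBATCH --job-name=array_demo
        #SBATCH --output=array_%A_%a.out   # %A = array job ID, %a = array task index
        #SBATCH --array=0-5                # Six tasks, indices 0..5
        #SBATCH --ntasks=1
        #SBATCH --cpus-per-task=1
        #SBATCH --time=00:02:00
        #SBATCH --partition=rpi_part

        echo "Array task ${SLURM_ARRAY_TASK_ID} running on $(hostname)"

Submit it with sbatch (e.g., `sbatch array_demo.sh`) and watch the individual array tasks appear in `squeue`.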

Enjoy your mini HPC cluster!