Deploy Slurm on Railway
Play with a real Slurm scheduler on Railway
Slurm on Railway
Why? Why not. A proof of concept (POC) that runs a Slurm controller plus container-local worker nodes on Railway. The only local dependencies are the Railway CLI and Docker.
Deployment
- Deploy with the Railway template. Be patient: the first build takes about 10 minutes, but subsequent deployments will be faster.
- Create a railway.env file with your project info:
# After deploying the template, you can get these from the URL:
# https://railway.com/project/$PROJECT_ID/service/$SERVICE_ID?environmentId=$ENVIRONMENT_ID
export RAILWAY_PROJECT_ID=xxx
export RAILWAY_ENVIRONMENT_ID=xxx
export RAILWAY_SERVICE_ID=xxx
- From the Settings tab of your project, get your public domain and port, e.g. interchange.proxy.rlwy.net:59019
- Auth your Railway CLI with railway login
- Run commands using the client.sh wrapper:
chmod +x client.sh
./client.sh
# example
./client.sh interchange.proxy.rlwy.net:59019 scontrol ping -vvvv
Using Project: xxx
Using Environment: xxx
Using Service: xxx
--- Building Local Slurm Image ---
sha256:639351c1520234413d42a3df8b0230e3a04e317af4a1e305bbc35e775e750759
--- Syncing with Railway ---
Warning: Received unknown message type: stand_by
Remote hostname detected: 8f524205061f
Warning: Received unknown message type: stand_by
--- Launching Client Container ---
scontrol: debug2: _sack_connect: connected to /run/slurm/sack.socket
Slurmctld(primary) at 8f524205061f is UP
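The railway.env file from the steps above can be written by hand, but the three IDs can also be pulled out of the dashboard URL mechanically. The following is a minimal sketch, assuming the URL has exactly the shape shown in the railway.env comment. The helper name railway_env_from_url is hypothetical and is not part of the repository.

```shell
# Hypothetical helper (not part of slurm-on-railway): print railway.env
# contents derived from a Railway dashboard URL of the form
# https://railway.com/project/<id>/service/<id>?environmentId=<id>
railway_env_from_url() {
  url=$1
  # Reject anything that does not match the expected URL shape.
  case "$url" in
    https://railway.com/project/*/service/*\?environmentId=*) ;;
    *) echo "unrecognized Railway URL: $url" >&2; return 1 ;;
  esac
  # Peel the URL apart with POSIX parameter expansion.
  rest=${url#https://railway.com/project/}
  project_id=${rest%%/*}
  rest=${rest#*/service/}
  service_id=${rest%%\?*}
  environment_id=${rest#*environmentId=}
  environment_id=${environment_id%%\&*}   # drop any further query parameters
  printf 'export RAILWAY_PROJECT_ID=%s\n' "$project_id"
  printf 'export RAILWAY_ENVIRONMENT_ID=%s\n' "$environment_id"
  printf 'export RAILWAY_SERVICE_ID=%s\n' "$service_id"
}
```

For example, with placeholder IDs: railway_env_from_url 'https://railway.com/project/p123/service/s456?environmentId=e789' > railway.env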
Client script
- Builds a local Docker image, slurm-railway.
- Fetches the authentication key (created at build time) and the hostname (set by Railway at runtime) from Railway.
- Launches a background sackd (Slurm Auth and Cred Kiosk) daemon inside the container to handle the auth/slurm handshake.
- Runs the given command, or drops into bash if none is given.
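The argument handling implied by the steps above (a domain:port endpoint first, then an optional command) could look roughly like the sketch below. This is an illustration only: split_endpoint, client_command, SLURM_HOST and SLURM_PORT are hypothetical names, and the real client.sh may be structured differently.

```shell
# Hypothetical helpers (the real client.sh may differ).

# Split "<domain>:<port>" into SLURM_HOST and SLURM_PORT.
split_endpoint() {
  endpoint=$1
  case "$endpoint" in
    *:*) ;;
    *) echo "expected <domain>:<port>, got: $endpoint" >&2; return 1 ;;
  esac
  SLURM_HOST=${endpoint%:*}    # everything before the last colon
  SLURM_PORT=${endpoint##*:}   # everything after the last colon
}

# Print the command to run in the client container: the arguments after
# the endpoint if any were given, otherwise an interactive bash shell.
client_command() {
  if [ "$#" -gt 0 ]; then
    shift   # drop the endpoint argument
  fi
  if [ "$#" -eq 0 ]; then
    printf '%s\n' bash
  else
    printf '%s\n' "$*"
  fi
}
```

With the example from the Deployment section, split_endpoint interchange.proxy.rlwy.net:59019 sets SLURM_HOST to interchange.proxy.rlwy.net and SLURM_PORT to 59019, and client_command prints bash when no command follows the endpoint.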
Deploy and Host
One-click deploy.
About Hosting
One-click deploy
Why Deploy
Play with Slurm :)
Common Use Cases
Play with Slurm :)
Dependencies for
None
Deployment Dependencies
None
Template Content
slurm-on-railway
Antvirf/slurm-on-railway
