Deploy Promptfoo | Open-Source LLM Testing Platform

Self-host Promptfoo on Railway — evaluate, compare, and red-team LLMs.


Deploy and Host Promptfoo on Railway

Deploy Promptfoo on Railway to get a self-hosted LLM evaluation and red-teaming platform running in minutes. Promptfoo is the open-source framework used by 25% of Fortune 500 companies to benchmark prompts, compare models, and catch AI vulnerabilities before they reach production.

Self-host Promptfoo on Railway and get a pre-configured instance with persistent SQLite storage, health monitoring, and a public URL — no infrastructure management required. The template deploys the official ghcr.io/promptfoo/promptfoo Docker image with a volume for durable eval storage.

Getting Started with Promptfoo on Railway

After deployment completes, open your Railway-provided URL to access the Promptfoo web UI. The dashboard displays your evaluation history, allowing you to browse, compare, and share results across your team.

To run your first evaluation, install the Promptfoo CLI locally with npm install -g promptfoo and point it at your self-hosted instance. Set your remote endpoint:

promptfoo config set remote_api_base_url https://your-railway-url
promptfoo config set remote_app_base_url https://your-railway-url

Create a basic evaluation config file named promptfooconfig.yaml:

providers:
  - openai:gpt-4o
  - anthropic:messages:claude-sonnet-4-20250514
prompts:
  - "Summarize this text: {{input}}"
tests:
  - vars:
      input: "Railway is a deployment platform for developers."
    assert:
      - type: contains
        value: "Railway"

Run promptfoo eval to execute the tests locally, then promptfoo share to push results to your Railway-hosted dashboard for team review.

About Hosting Promptfoo

Promptfoo is an MIT-licensed LLM evaluation framework that lets developers systematically test prompts, compare model outputs, and run security audits against AI applications. It supports 50+ LLM providers out of the box — including OpenAI, Anthropic, Google, AWS Bedrock, Ollama, and custom HTTP endpoints.

Key features:

  • Side-by-side model comparison with YAML-configured test matrices
  • Red teaming and security scanning aligned with OWASP and NIST AI standards
  • RAG pipeline evaluation measuring factuality, relevance, and faithfulness
  • CI/CD integration via GitHub Actions, Jest, and Vitest for automated quality gates
  • Local-first architecture — API calls go directly to providers, never through intermediaries
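
As one way to wire Promptfoo into CI, a minimal GitHub Actions workflow might look like the sketch below. The workflow name, trigger, and config path are illustrative assumptions (Promptfoo also ships an official GitHub Action you may prefer):

```yaml
# Hypothetical CI quality gate: fail the build if any eval assertion fails.
name: llm-quality-gate
on: [pull_request]
jobs:
  eval:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      # `promptfoo eval` exits non-zero when assertions fail,
      # which fails this job and blocks the merge.
      - run: npx promptfoo@latest eval -c promptfooconfig.yaml
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
```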

Why Deploy Promptfoo on Railway

Railway eliminates the Docker and infrastructure setup so you can focus on evaluation:

  • One-click deploy with persistent storage for eval history
  • Automatic HTTPS and public URL for team sharing
  • Volume-backed SQLite ensures data survives redeploys
  • Scale resources on demand for large evaluation runs
  • No vendor lock-in — MIT-licensed, fully open source

Common Use Cases for Self-Hosted Promptfoo

  • Prompt engineering workflows — compare output quality across GPT-4o, Claude, Gemini, and open-source models with consistent test cases
  • Pre-production security audits — scan LLM apps for prompt injection, jailbreaks, and data poisoning using automated red team probes
  • RAG quality measurement — evaluate retrieval accuracy, answer faithfulness, and context relevance across your knowledge base pipeline
  • CI/CD quality gates — block deployments when LLM outputs regress below defined thresholds in your GitHub Actions workflow
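
For the RAG case, a config can combine deterministic checks with Promptfoo's model-graded context assertions. The prompt, variable names, and threshold below are illustrative sketches; verify assertion-type availability against your installed version:

```yaml
# Sketch of a RAG evaluation config (variable names are examples).
prompts:
  - "Answer using only this context:\n{{context}}\n\nQuestion: {{query}}"
providers:
  - openai:gpt-4o-mini
tests:
  - vars:
      query: "What is Railway?"
      context: "Railway is a deployment platform for developers."
    assert:
      # Deterministic string check
      - type: contains
        value: "deployment"
      # Model-graded faithfulness to the provided context
      - type: context-faithfulness
        threshold: 0.8
```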

Dependencies for Promptfoo on Railway

  • Promptfoo — ghcr.io/promptfoo/promptfoo:latest (Node.js 24 + Python 3.12 on Alpine)
  • SQLite — embedded, no external database service required
  • Volume — /home/promptfoo/.promptfoo for persistent eval storage

Environment Variables Reference for Promptfoo

  • PORT: 3000 (HTTP server listening port)
  • PROMPTFOO_REMOTE_API_BASE_URL: https://${{RAILWAY_PUBLIC_DOMAIN}} (CLI remote API endpoint)
  • PROMPTFOO_REMOTE_APP_BASE_URL: https://${{RAILWAY_PUBLIC_DOMAIN}} (CLI remote app endpoint)
  • PROMPTFOO_DISABLE_TELEMETRY: 1 (disable usage telemetry)
  • OPENAI_API_KEY: user-provided (OpenAI API key for evals)
  • ANTHROPIC_API_KEY: user-provided (Anthropic API key for evals)

Hardware Requirements for Self-Hosting Promptfoo

  • CPU: 1 core minimum, 4+ cores recommended
  • RAM: 2 GB minimum, 8 GB recommended
  • Storage: 5 GB minimum, 100 GB SSD recommended
  • Runtime: Node.js 20+ minimum, Node.js 24 recommended

Promptfoo itself is lightweight — most compute happens on the LLM provider side. Storage scales with evaluation history and externalized media (images, audio blobs).

Self-Hosting Promptfoo with Docker

Run Promptfoo locally with Docker using a single command:

docker run -d \
  --name promptfoo \
  -p 3000:3000 \
  -v promptfoo-data:/home/promptfoo/.promptfoo \
  ghcr.io/promptfoo/promptfoo:latest

For a docker-compose setup:

services:
  promptfoo:
    image: ghcr.io/promptfoo/promptfoo:latest
    ports:
      - "3000:3000"
    volumes:
      - promptfoo-data:/home/promptfoo/.promptfoo
    environment:
      - PROMPTFOO_DISABLE_TELEMETRY=1
      - PROMPTFOO_DISABLE_UPDATE=1
volumes:
  promptfoo-data:

Access the web UI at http://localhost:3000 after startup.

How Much Does Promptfoo Cost to Self-Host?

Promptfoo is fully open-source under the MIT license — the software itself is free. Self-hosting on Railway means you only pay for infrastructure (compute, storage, bandwidth). A basic Promptfoo instance on Railway runs comfortably on the Hobby plan.

LLM API costs are separate and depend on your evaluation volume and chosen providers. Running 100 test cases against GPT-4o typically costs $0.50–$2.00. Using local models via Ollama eliminates API costs entirely.
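
A back-of-envelope estimate shows where a range like that comes from. The token counts and per-token prices below are assumptions for illustration, not quoted rates:

```shell
# Rough cost model: 100 test cases, ~1,500 input and 500 output tokens each,
# at an assumed $2.50 per 1M input tokens and $10.00 per 1M output tokens.
awk 'BEGIN {
  tests = 100; in_tok = 1500; out_tok = 500
  in_price = 2.50; out_price = 10.00        # USD per 1M tokens (assumed)
  cost = tests * (in_tok * in_price + out_tok * out_price) / 1000000
  printf "estimated cost: $%.2f\n", cost
}'
```

Adjust the token counts to match your prompts; long contexts or chain-of-thought outputs can push costs well above this sketch.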

Promptfoo vs LangSmith for LLM Evaluation

  • License: Promptfoo is MIT (open source); LangSmith is proprietary
  • Architecture: Promptfoo is CLI-first with local execution; LangSmith is a cloud-first platform
  • Red teaming: Promptfoo has a built-in OWASP/NIST-aligned scanner; LangSmith's support is limited
  • Provider support: Promptfoo supports 50+ providers; LangSmith centers on the LangChain ecosystem
  • Self-hosting: Promptfoo has full feature parity; LangSmith offers it for enterprise only
  • Pricing: Promptfoo is free (self-hosted); LangSmith has a free tier plus paid plans

Promptfoo is the stronger choice for teams that want local-first evaluation with deep security testing. LangSmith excels when your stack is already built on LangChain and you want integrated observability.

FAQ

What is Promptfoo and why should I self-host it? Promptfoo is an open-source LLM evaluation framework for testing prompts, comparing models, and running security audits. Self-hosting gives you full control over your evaluation data, eliminates cloud dependencies, and lets you run evaluations against internal or private models without sending data to third parties.

What does this Railway template deploy for Promptfoo? This template deploys a single Promptfoo server instance with the official Docker image (ghcr.io/promptfoo/promptfoo:latest), a persistent volume for SQLite storage at /home/promptfoo/.promptfoo, and a public HTTPS URL. No external database is required — Promptfoo uses embedded SQLite.

Why does Promptfoo use SQLite instead of PostgreSQL? Promptfoo is designed as a lightweight, single-user or small-team tool where SQLite provides sufficient performance without the overhead of managing a separate database service. The trade-off is that horizontal scaling across multiple replicas is not supported.

How do I connect the Promptfoo CLI to my self-hosted Railway instance? Install the CLI with npm install -g promptfoo, then run promptfoo config set remote_api_base_url https://your-railway-url and promptfoo config set remote_app_base_url https://your-railway-url. After configuration, use promptfoo share to push evaluation results to your Railway-hosted dashboard.

Does self-hosted Promptfoo on Railway support red teaming and security scanning? Yes. Run promptfoo redteam init to scaffold a red team project with OWASP and NIST presets. The scanner tests for prompt injection, jailbreaks, data poisoning, and other LLM vulnerabilities. Results are viewable in the self-hosted web UI.

How do I add LLM API keys to Promptfoo on Railway? Add provider API keys (e.g., OPENAI_API_KEY, ANTHROPIC_API_KEY) as environment variables in your Railway service settings. The Promptfoo server reads these at runtime for server-side evaluations. For CLI-only evaluations, set the keys in your local environment instead.
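
For the local-CLI path, exporting keys in your shell before invoking promptfoo is enough. The key values below are placeholders:

```shell
# Placeholder values: substitute your real provider credentials.
export OPENAI_API_KEY="sk-your-openai-key"
export ANTHROPIC_API_KEY="sk-ant-your-anthropic-key"
# The promptfoo CLI reads these from the environment at eval time.
```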

