# Deploy LLM Stack

LiteLLM with Redis for Production
Production-ready LiteLLM proxy with PostgreSQL and Redis. Unified API for 100+ LLM providers.
## Overview
This stack provides a streamlined, production-ready LiteLLM deployment with persistent storage and caching:
**Core Services:**
- LiteLLM - Unified proxy for 100+ LLM providers (OpenAI, Anthropic, Azure, Google, and more)
- PostgreSQL with pgvector - Managed database for caching, logging, and vector storage
- Redis - High-performance cache for rate limiting, session management, and job queues
**Primary Use Case:** Deploy a production-grade LLM proxy to Railway with minimal configuration. Route requests to multiple providers through a single, unified API endpoint with built-in caching and persistence.

**Alternative:** Run locally using Minikube + Skaffold (see the Local Development section).
## 🚀 Quick Start: Railway Deployment

### ✨ Recommended Method: One-Click Template Deployment
Deploy the stack to Railway in under 5 minutes:
**🚀 Deploy to Railway 🚀**
**What Happens Automatically:**
- ✅ LiteLLM, PostgreSQL, and Redis services are deployed
- ✅ PostgreSQL and Redis plugins are added and configured
- ✅ Service-to-service networking is set up
- ✅ Environment variables are pre-configured with Railway references
**What You Need to Provide:**
- `LITELLM_MASTER_KEY` - Generate a secure key: `openssl rand -base64 32`
- (Optional) LLM provider API keys (OpenAI, Anthropic, etc.)
- (Optional) Customize `services/litellm/config.yaml` for specific LLM models
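If `openssl` isn't available, a key of the same strength can be generated with Python's standard library. This is a sketch; any sufficiently random secret works as the master key:

```python
import secrets

# 32 random bytes, URL-safe base64 encoded (43 characters, no padding).
# Use the printed value as your LITELLM_MASTER_KEY.
master_key = secrets.token_urlsafe(32)
print(master_key)
```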
**Deployment Steps:**
1. Click the "Deploy to Railway" button above
2. Railway will prompt you for required environment variables
3. Click "Deploy" and wait 3-5 minutes
4. Generate a public domain for the `litellm` service
5. Access your LiteLLM proxy at the generated URL!
📖 **Detailed Guide:** See `QUICK_START_RAILWAY.md` for step-by-step instructions with screenshots.

💡 **Optional:** After deployment, you can detach services from the template and customize them independently.
## What Gets Deployed
| Service | Port | Description | Documentation |
|---|---|---|---|
| LiteLLM | 4000 | OpenAI-compatible proxy for 100+ LLM providers. Handles API key management, load balancing, caching, and fallbacks. | services/litellm/README.md |
| PostgreSQL | - | Managed database with pgvector extension for caching, logging, and vector storage. | services/postgres-pgvector/README.md |
| Redis | - | Managed cache for rate limiting, session management, and distributed caching. | Railway Plugin |
**Service Communication:**
- All services communicate via Railway's internal private network (`*.railway.internal`)
- PostgreSQL and Redis connection details are automatically injected as environment variables
- No manual networking configuration required
## Configuration

### Environment Variables
The Railway template pre-configures most variables automatically. You only need to provide:
**Required:**
- `LITELLM_MASTER_KEY` - Authentication key for your LiteLLM proxy

**Optional (for LLM access):**
- `OPENAI_API_KEY` - OpenAI models (GPT-4, GPT-3.5, etc.)
- `ANTHROPIC_API_KEY` - Anthropic models (Claude)
- `AZURE_API_KEY` / `AZURE_API_BASE` - Azure OpenAI
- `GOOGLE_APPLICATION_CREDENTIALS` - Google Vertex AI
- Additional provider keys as needed
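When running the stack outside Railway, a small startup check can catch missing variables early. A minimal sketch, assuming the variable names listed above:

```python
import os

# Variables the proxy cannot start without; provider keys are optional.
REQUIRED = ["LITELLM_MASTER_KEY"]

def check_env():
    """Return the names of required variables that are not set."""
    return [name for name in REQUIRED if not os.environ.get(name)]

missing = check_env()
if missing:
    print(f"Missing required variables: {', '.join(missing)}")
```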
📖 **Complete Reference:** See `ENV_VARIABLES_GUIDE.md` for all available configuration options.
### LiteLLM Configuration

Customize which LLM models are available by editing `services/litellm/config.yaml`:
```yaml
model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4
      api_key: os.environ/OPENAI_API_KEY
  - model_name: claude-3-opus
    litellm_params:
      model: anthropic/claude-3-opus-20240229
      api_key: os.environ/ANTHROPIC_API_KEY
```
After modifying the config, push changes to your repository and Railway will automatically redeploy.
## Usage Examples

### Basic API Call
```bash
curl -X POST http://litellm.railway.internal:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
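The same call can be made from Python without any SDK, using only the standard library. A sketch; the request is built here but only sent when you call `urllib.request.urlopen(req)`, and the internal hostname is only reachable from services inside the same Railway project:

```python
import json
import urllib.request

def chat_request(base_url, api_key, model, content):
    """Build (but do not send) a chat-completions request for the proxy."""
    payload = {"model": model, "messages": [{"role": "user", "content": content}]}
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

req = chat_request("http://litellm.railway.internal:4000",
                   "your-litellm-master-key", "gpt-4", "Hello!")
# Send with: urllib.request.urlopen(req)
```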
### List Available Models

```bash
curl http://litellm.railway.internal:4000/v1/models \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY"
```

### Health Check

```bash
curl http://litellm.railway.internal:4000/health
```
### Using with OpenAI SDK

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://litellm.railway.internal:4000",
    api_key="your-litellm-master-key"
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
```
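LiteLLM can handle fallbacks server-side (see the service table above), but the pattern is easy to sketch client-side too. Illustrative only; `call_model` stands in for any function that sends one request and raises on failure:

```python
def complete_with_fallback(call_model, models, prompt):
    """Try each model in order, returning the first successful response."""
    last_error = None
    for model in models:
        try:
            return call_model(model, prompt)
        except Exception as exc:  # in practice, catch your client's specific errors
            last_error = exc
    raise RuntimeError(f"All models failed: {models}") from last_error

# Demo with a stub that fails for the first model:
def call_model(model, prompt):
    if model == "gpt-4":
        raise TimeoutError("provider timeout")
    return f"{model}: response to {prompt!r}"

print(complete_with_fallback(call_model, ["gpt-4", "claude-3-opus"], "Hello!"))
```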
## Local Development

### Alternative: Run Locally with Minikube

For local development and testing, you can run the stack on your machine using Kubernetes.
**Prerequisites:**
- Minikube
- kubectl
- (Optional) Skaffold, for the hot-reload workflow

**Quick Start:**
```bash
# Start a Minikube cluster
minikube start --cpus=4 --memory=8192

# Deploy all services
kubectl apply -f k8s/manifests.yaml

# Access LiteLLM via port-forwarding
kubectl port-forward svc/litellm 4000:4000
```
📚 **Comprehensive Guides:**
- `docs/local-dev/MINIKUBE_DEV_SETUP.md` - Complete setup and deployment guide
- `docs/local-dev/SKAFFOLD_QUICKSTART.md` - Hot-reload development workflow
- `docs/local-dev/MINIKUBE_QUICK_REFERENCE.md` - Common commands and troubleshooting
- `docs/local-dev/KUBERNETES_DEPLOYMENT_OVERVIEW.md` - Architecture deep-dive
Note: Local development requires more setup and resources than Railway deployment. Railway is recommended for most users.
## Architecture
The stack uses a streamlined microservices architecture:
```
┌─────────────────────────────────────────────────────────┐
│                    Railway Platform                     │
├─────────────────────────────────────────────────────────┤
│                                                         │
│                     ┌──────────────┐                    │
│  External Clients ─▶│   LiteLLM    │                    │
│                     │    :4000     │                    │
│                     └──────┬───────┘                    │
│                            │                            │
│           ┌────────────────┼────────────────┐           │
│           ▼                ▼                ▼           │
│    ┌────────────┐   ┌────────────┐   ┌────────────┐    │
│    │ PostgreSQL │   │   Redis    │   │  External  │    │
│    │  (Plugin)  │   │  (Plugin)  │   │  LLM APIs  │    │
│    └────────────┘   └────────────┘   └────────────┘    │
│                                                         │
└─────────────────────────────────────────────────────────┘
```
**Key Communication Paths:**
- Clients → LiteLLM: External clients connect to the LiteLLM proxy (port 4000)
- LiteLLM → LLM APIs: LiteLLM routes requests to configured LLM providers
- LiteLLM → PostgreSQL: Caching, logging, and request tracking
- LiteLLM → Redis: Distributed caching and rate limiting

**Internal DNS:** All services use Railway's private networking (`service-name.railway.internal`) for secure, low-latency communication.
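To illustrate the Redis caching path above: a response cache typically keys on a hash of the model plus the normalized request, so identical requests hit the same Redis entry. A sketch of the idea, not LiteLLM's exact scheme:

```python
import hashlib
import json

def cache_key(model, messages):
    """Derive a deterministic cache key from a chat request."""
    # sort_keys makes the key independent of dict insertion order
    body = json.dumps({"model": model, "messages": messages}, sort_keys=True)
    return "llm-cache:" + hashlib.sha256(body.encode("utf-8")).hexdigest()

k1 = cache_key("gpt-4", [{"role": "user", "content": "Hello!"}])
k2 = cache_key("gpt-4", [{"content": "Hello!", "role": "user"}])
assert k1 == k2  # identical requests map to the same Redis key
```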
📖 **Architecture Deep-Dive:** See `docs/architecture/OVERVIEW.md` for detailed information.
## Troubleshooting

### Common Issues
**LiteLLM Won't Start**
- Check service logs in the Railway dashboard → select the service → "Logs" tab
- Verify `LITELLM_MASTER_KEY` is set
- Ensure the PostgreSQL and Redis plugins show "Running" status
**LLM API Errors**
- Verify your LLM provider API keys are valid and have sufficient credits
- Check `services/litellm/config.yaml` for model configuration
- Review LiteLLM service logs for authentication errors
**Database Connection Issues**
- Confirm the PostgreSQL plugin is added and running
- Verify the `${{Postgres.*}}` variables are correctly referenced
- Check LiteLLM service logs for connection errors
**Redis Connection Issues**
- Confirm the Redis plugin is added and running
- Verify the `${{Redis.REDIS_URL}}` variable is correctly referenced
- Check LiteLLM service logs for Redis connection errors
### Need More Help?

- Check the individual service READMEs in the `services/` directories
- Review the detailed local dev guides in `docs/local-dev/`
- Open an issue on GitHub with logs and configuration details
## Template Content

| Service | Source |
|---|---|
| litellm | `nanocreek/llm-stack` (requires `LITELLM_MASTER_KEY`) |
| Redis | `redis:8.2.1` |