Deploy Apache Kafka — Open Source RabbitMQ Alternative for High-Throughput Streaming
Self-host Kafka with KRaft mode, pub/sub, log replay, Flink/Spark-ready
Deploy and Host Apache Kafka
Apache Kafka is a distributed event streaming platform built for high-throughput, fault-tolerant, real-time data pipelines. It lets applications publish, subscribe to, store, and replay streams of records at massive scale — from microservice event buses to analytics pipelines handling trillions of events per day. This Railway template deploys a single-node Kafka broker in KRaft mode (no ZooKeeper required), using the official `apache/kafka` Docker image, with a persistent volume at `/var/lib/kafka/data` and pre-configured dual listeners for both private Railway networking and external TCP access via Railway's TCP proxy.
Getting Started with Apache Kafka on Railway
Once your Kafka service is live, grab the TCP proxy domain and port from the Railway service's "Networking" tab — this is your external bootstrap_servers value. To verify the deployment, install kafka-python and run the test script below. If messages are produced and consumed successfully, your broker is fully operational. For services running inside the same Railway project, connect via the private domain on port 9092 using the RAILWAY_PRIVATE_DOMAIN variable — no TCP proxy needed, and no extra cost.
Verifying Your Deployment
```python
from kafka import KafkaProducer, KafkaConsumer

# Use your Railway TCP proxy address, e.g. "metro.proxy.rlwy.net:33476"
BROKER = "<your-proxy-domain>:<your-proxy-port>"

producer = KafkaProducer(bootstrap_servers=BROKER)
producer.send("test-topic", b"hello from railway!")
producer.flush()
print("Message sent!")

consumer = KafkaConsumer(
    "test-topic",
    bootstrap_servers=BROKER,
    auto_offset_reset="earliest",
    consumer_timeout_ms=5000,
)
for msg in consumer:
    print("Received:", msg.value.decode())
```
If you see `Message sent!` followed by `Received: hello from railway!`, Kafka is running correctly.
About Hosting Apache Kafka
Kafka is an open-source Apache Software Foundation project, implemented in Scala and Java. It uses a distributed commit-log architecture where producers write to partitioned topics stored on brokers, and consumers pull messages sequentially — retaining them indefinitely (by policy) for replay. Since Kafka 3.3, KRaft mode removes the ZooKeeper dependency, making single-node deployments dramatically simpler.
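The commit-log model is easy to picture in code. Here is a toy, in-memory sketch (an illustration of the concept, not Kafka's actual implementation): producers append records to a partition chosen by key, each record gets a monotonically increasing offset, and consumers read sequentially from an offset they control, so replay is just re-reading from an earlier offset.

```python
# Toy sketch of Kafka's partitioned commit-log model (illustration only).
from collections import defaultdict

class ToyTopic:
    def __init__(self, partitions=2):
        self.logs = defaultdict(list)  # partition -> append-only record list
        self.partitions = partitions

    def produce(self, key: bytes, value: bytes) -> int:
        # Records with the same key always land in the same partition,
        # which is what gives Kafka its per-key ordering guarantee.
        partition = hash(key) % self.partitions
        self.logs[partition].append(value)
        return len(self.logs[partition]) - 1  # the record's offset

    def consume(self, partition: int, offset: int):
        # Consumers pull sequentially from an offset they track themselves;
        # "replay" is simply re-reading from an earlier offset.
        return self.logs[partition][offset:]

topic = ToyTopic()
p = hash(b"order-1") % 2
topic.produce(b"order-1", b"OrderPlaced")
topic.produce(b"order-1", b"OrderShipped")
print(topic.consume(p, 0))  # both records, in order, from the beginning
```

Real Kafka adds replication, batching, and disk persistence on top of this model, but the offset-based replay semantics are the same.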
Key features:
- Dual-listener architecture — internal listener (`INTERNAL://:9092`) for private Railway service-to-service traffic; external listener (`EXTERNAL://:9094`) exposed via TCP proxy for outside clients
- KRaft combined mode — broker and controller roles run in a single process (`KAFKA_PROCESS_ROLES=broker,controller`); no ZooKeeper sidecar needed
- Persistent storage — topic data survives redeploys via a mounted volume at `/var/lib/kafka/data`
- Topic replay — consumers can rewind and re-consume any message within the retention window
- Ecosystem breadth — native integrations with Flink, Spark, ksqlDB, Kafka Connect, and most major databases via connectors
Why Deploy Apache Kafka on Railway
Railway handles the operational complexity of running Kafka so you don't have to:
- No manual listener wiring or TCP proxy configuration — it's pre-built into the template
- Private networking between Railway services at zero egress cost (port 9092 via `RAILWAY_PRIVATE_DOMAIN`)
- Persistent volume management with zero Docker volume flags
- One-click redeploys and automatic rollback on failure
Common Use Cases
- Microservice event bus — decouple services by publishing domain events (`OrderPlaced`, `UserRegistered`) to topics that downstream consumers process independently
- Real-time analytics pipelines — stream clickstream, IoT sensor, or log data into Kafka and consume into Flink, Spark, or ClickHouse for live dashboards
- Change data capture (CDC) — use Kafka Connect with Debezium to stream database row changes from Postgres or MySQL into downstream systems
- Async job queues — replace Redis queues with durable, replayable Kafka topics for background processing with guaranteed delivery and ordering per partition
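For the event-bus pattern above, events are typically serialized as JSON with the aggregate ID as the message key, so all events for one entity stay ordered within a single partition. A minimal encoding helper might look like this (stdlib only; the event shape and field names are illustrative assumptions, not a Kafka-mandated format):

```python
import json
import time

def encode_event(event_type: str, aggregate_id: str, payload: dict):
    """Build a (key, value) byte pair for a domain event.

    Keying by aggregate_id keeps all events for one entity in the same
    partition, preserving their order for consumers.
    """
    key = aggregate_id.encode("utf-8")
    value = json.dumps({
        "type": event_type,
        "aggregate_id": aggregate_id,
        "timestamp": time.time(),
        "payload": payload,
    }).encode("utf-8")
    return key, value

key, value = encode_event("OrderPlaced", "order-42", {"total": 19.99})
# With kafka-python you would then publish the pair:
# producer.send("orders", key=key, value=value)
```

Consumers can then deserialize with `json.loads(msg.value)` and dispatch on the `type` field.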
Dependencies for Apache Kafka
- `apache/kafka:latest` — official Docker image from the Apache Software Foundation (Docker Hub)
- Volume: `/var/lib/kafka/data` — persistent storage for topic logs and metadata
- No ZooKeeper dependency — KRaft mode is enabled via `KAFKA_PROCESS_ROLES=broker,controller`
Environment Variables Reference
| Variable | Description | Required |
|---|---|---|
| `KAFKA_ADVERTISED_LISTENERS` | Hostnames/ports advertised to clients — pre-wired to Railway's private domain and TCP proxy | Yes |
| `KAFKA_LISTENERS` | Actual bind addresses inside the container for INTERNAL (9092), EXTERNAL (9094), and CONTROLLER (9093) | Yes |
| `KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR` | Set to 1 for single-node; increase for multi-broker clusters | Yes |
| `KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR` | Set to 1 for single-node deployments | Yes |
| `KAFKA_LOG_DIRS` | Path to on-disk topic storage — matches the mounted volume | Yes |
Internal KRaft wiring variables (`KAFKA_CONTROLLER_QUORUM_VOTERS`, `KAFKA_PROCESS_ROLES`, `KAFKA_INTER_BROKER_LISTENER_NAME`, `KAFKA_CONTROLLER_LISTENER_NAMES`, `KAFKA_LISTENER_SECURITY_PROTOCOL_MAP`) are pre-configured and do not need changes for single-node use.
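To see how the two listener variables relate, here is how their values are typically composed. The host and port below are stand-ins for Railway-provided values (the private domain name and the TCP proxy address from the Networking tab are assumptions for illustration):

```python
# Hypothetical values standing in for what Railway provides at deploy time.
railway_private_domain = "kafka.railway.internal"
proxy_host, proxy_port = "metro.proxy.rlwy.net", 33476  # from the Networking tab

# Bind addresses inside the container (KAFKA_LISTENERS):
listeners = "INTERNAL://:9092,EXTERNAL://:9094,CONTROLLER://:9093"

# Addresses advertised to clients (KAFKA_ADVERTISED_LISTENERS):
# internal clients are told the private domain, external clients the TCP proxy.
advertised = (
    f"INTERNAL://{railway_private_domain}:9092,"
    f"EXTERNAL://{proxy_host}:{proxy_port}"
)
print(advertised)
```

The key point is that `KAFKA_LISTENERS` is where the broker binds, while `KAFKA_ADVERTISED_LISTENERS` is what it tells clients to connect back to — a mismatch between the two is the most common cause of "connection refused after metadata fetch" errors.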
Deployment Dependencies
- Docker image: `apache/kafka:latest`
- Official docs: kafka.apache.org
Minimum Hardware Requirements for Apache Kafka
| Resource | Minimum (dev/test) | Recommended (production) |
|---|---|---|
| CPU | 1 vCPU | 4+ vCPU |
| RAM | 1 GB | 8–16 GB |
| Storage | 10 GB SSD | 500 GB+ SSD |
| JVM Heap | 512 MB | 4–8 GB |
For the Railway Starter plan, a single-node Kafka instance is suitable for development and low-throughput workloads. Bump to a Pro plan instance for sustained production traffic.
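On a small instance you can cap the JVM heap explicitly with the standard `KAFKA_HEAP_OPTS` variable honored by Kafka's launch scripts. A sketch for a 1 GB instance (the sizing below is an example, not a template default):

```yaml
# Service environment variable (example sizing; adjust to your plan):
KAFKA_HEAP_OPTS: "-Xms512m -Xmx512m"
```

Keeping `-Xms` and `-Xmx` equal avoids heap resizing pauses; leave the remaining memory to the OS page cache, which Kafka relies on heavily for log reads.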
Apache Kafka vs RabbitMQ vs Pulsar
| Feature | Apache Kafka | RabbitMQ | Apache Pulsar |
|---|---|---|---|
| Open source | ✅ Apache 2.0 | ✅ MPL 2.0 | ✅ Apache 2.0 |
| Architecture | Distributed log | AMQP broker | Compute + BookKeeper storage |
| Peak throughput | ~605 MB/s | ~38 MB/s | ~305 MB/s |
| Message replay | ✅ Configurable retention | ❌ Deleted on ACK | ✅ Tiered storage |
| ZooKeeper-free | ✅ KRaft (3.3+) | ✅ No deps | ❌ Needs ZooKeeper + BookKeeper |
| Self-hostable | ✅ | ✅ | ✅ |
| Best for | High-throughput streaming, analytics | Task queues, low-latency routing | Multi-tenant, cloud-native streaming |
Kafka is the default choice for event streaming at scale. RabbitMQ is simpler for classic task queues with complex routing. Pulsar is compelling for multi-tenant or geo-distributed setups but adds significant operational complexity.
Self-Hosting Apache Kafka
Docker (single node, KRaft mode):
```yaml
# docker-compose.yml
services:
  kafka:
    image: apache/kafka:latest
    ports:
      - "9092:9092"
    environment:
      KAFKA_NODE_ID: 1
      KAFKA_PROCESS_ROLES: broker,controller
      KAFKA_LISTENERS: INTERNAL://:9092,CONTROLLER://:9093
      KAFKA_ADVERTISED_LISTENERS: INTERNAL://localhost:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,CONTROLLER:PLAINTEXT
      KAFKA_CONTROLLER_LISTENER_NAMES: CONTROLLER
      KAFKA_CONTROLLER_QUORUM_VOTERS: 1@localhost:9093
      KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_LOG_DIRS: /var/lib/kafka/data
    volumes:
      - kafka-data:/var/lib/kafka/data
volumes:
  kafka-data:
```

Run with `docker compose up -d`. Connect producers/consumers to `localhost:9092`.
Is Apache Kafka Free?
Apache Kafka is fully open source under the Apache 2.0 license — free to self-host with no licensing cost. On Railway, you pay only for the compute and storage your Kafka instance consumes. Confluent offers a managed cloud version (Confluent Cloud) with a free tier of $400 in credits, after which it's metered by usage. For most self-hosted workloads, the Railway template covers everything you need at infrastructure cost only.
FAQ
What is Apache Kafka?
Apache Kafka is a distributed event streaming platform originally built at LinkedIn and donated to the Apache Software Foundation. It stores messages in partitioned, replicated logs on disk, supports indefinite retention, and lets consumers replay any past message — making it the standard choice for real-time data pipelines, event sourcing, and stream processing.
Do I need ZooKeeper with this template?
No. This template uses KRaft mode (introduced in Kafka 3.3), which replaces ZooKeeper with Kafka's own Raft-based metadata management. The `KAFKA_PROCESS_ROLES=broker,controller` setting enables this combined single-process mode.
How do I connect from another Railway service?
Use `$RAILWAY_PRIVATE_DOMAIN:9092` as your `bootstrap_servers` value. Private networking is free and stays within Railway's internal network. No TCP proxy required.
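In code, that means reading Railway's injected variable at runtime. A sketch (assumes `RAILWAY_PRIVATE_DOMAIN` is present in the consuming service's environment; the `localhost` fallback is only for local testing):

```python
import os

# Railway injects RAILWAY_PRIVATE_DOMAIN into services in the same project;
# fall back to localhost so the same code runs in local development.
bootstrap = f"{os.environ.get('RAILWAY_PRIVATE_DOMAIN', 'localhost')}:9092"
print(bootstrap)

# Then pass it to your client, e.g. with kafka-python:
# producer = KafkaProducer(bootstrap_servers=bootstrap)
```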
Can I use this in production?
This single-node template is appropriate for development, staging, and low-to-medium production traffic. For high-availability production use, you'd want a multi-broker cluster with replication factors greater than 1. Railway's single-service deployment is best suited for use cases where message loss on instance restart is acceptable, or where the volume-backed persistence is sufficient.
Why does the start command remove lost+found?
Railway mounts volumes as root, which creates a `lost+found` directory that Kafka refuses to start with. The custom start command (`rm -rf /var/lib/kafka/data/lost+found`) cleans this up before Kafka initialises, preventing a startup error on first deploy.
Template Content
- Kafka (`apache/kafka:latest`)