Deploy ClickHouse Cluster
3-shard ClickHouse cluster with 2x replication
Deploy and Host ClickHouse Cluster on Railway
ClickHouse is a fast open-source column-oriented database management system that provides high-performance analytics and real-time data processing capabilities. It is designed for online analytical processing (OLAP) workloads and is widely used for data warehousing, business intelligence, and real-time analytics applications that require processing large volumes of data.
About Hosting ClickHouse Cluster
Hosting a ClickHouse cluster gives you a distributed analytical database that can serve massive numbers of concurrent queries, persist terabyte-scale datasets, and stay highly available across multiple nodes. This template provides a pre-configured cluster with 3 shards and 2 replicas per shard, with efficient columnar storage and zstd compression enabled by default. The database excels at real-time analytics, complex aggregation queries, and distributed query processing across cluster nodes. ClickHouse cluster deployments benefit from scalable CPU, RAM, and storage resources, and node-to-node traffic can be kept on Railway's private network. Railway also provides automated backups and comprehensive logging to support your distributed database operations.
Common Use Cases
- Real-time Analytics and Business Intelligence: Powering dashboards, reporting systems, and data visualization tools that require sub-second query response times across billions of records for e-commerce analytics, user behavior tracking, and operational monitoring.
- Data Warehousing and ETL Processing: Serving as the primary analytical database for data lakes, ETL pipelines, and data transformation workflows that process large volumes of structured and semi-structured data from multiple sources.
- Time-Series and Event Data Analysis: Managing high-velocity time-series data, application logs, IoT sensor data, and event streams that require efficient compression, fast ingestion, and complex temporal queries.
- Machine Learning and Data Science: Supporting feature engineering, model training data preparation, and real-time scoring pipelines that require fast aggregations and statistical computations across large datasets.
Dependencies for ClickHouse Cluster Hosting
- clickhouse-keeper - For cluster coordination and metadata management
- haproxy - For load balancing and connection management
Deployment Dependencies
- The official ClickHouse Server image - https://hub.docker.com/r/clickhouse/clickhouse-server
- The official ClickHouse Keeper image - https://hub.docker.com/r/clickhouse/clickhouse-keeper
- The official HAProxy image - https://hub.docker.com/_/haproxy
- Custom cluster configuration files
Implementation Details
This template deploys a ClickHouse cluster with 3 shards and 2 replicas per shard, totaling 6 ClickHouse server nodes plus ClickHouse Keeper ensemble for coordination and HAProxy for load balancing.
Cluster Architecture
The cluster is configured with the following topology:
- 3 Shards: Data is horizontally partitioned across three shard groups
- 2 Replicas per Shard: Each shard has two replica nodes for high availability
- ClickHouse Keeper Ensemble: 3-node ClickHouse Keeper cluster for metadata and coordination
- HAProxy Load Balancer: Routes client connections across healthy ClickHouse nodes
Cluster Layout:
Shard 1: ClickHouse S1R1, ClickHouse S1R2
Shard 2: ClickHouse S2R1, ClickHouse S2R2
Shard 3: ClickHouse S3R1, ClickHouse S3R2
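Once the template is running, you can verify this topology from any node. A quick sanity check, assuming the template defines the standard '{cluster}' macro (the same macro the DDL examples below rely on):
-- List every shard/replica pair in this node's cluster
SELECT shard_num, replica_num, host_name, port
FROM system.clusters
WHERE cluster = getMacro('cluster');
This should return six rows, one per shard/replica pair.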
Configuration Files
The deployment includes custom configuration files:
- config.xml - Main server configuration with the cluster definition, ClickHouse Keeper settings, and network configuration
- users.xml - User authentication and permission settings
- cluster.xml - Distributed table configuration and shard mappings
ClickHouse Keeper Integration
ClickHouse Keeper is used for:
- Replica synchronization and consistency
- Distributed DDL operations
- Leader election for ReplicatedMergeTree tables
- Cluster metadata management
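You rarely need to talk to Keeper directly, but ClickHouse exposes the Keeper tree through the system.zookeeper table, which is a handy way to confirm coordination is healthy. A minimal check from any ClickHouse node (the /clickhouse path appears once the replicated tables defined below exist):
-- Browse the top-level znodes ClickHouse stores in Keeper
SELECT name, numChildren
FROM system.zookeeper
WHERE path = '/clickhouse';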
Data Distribution
In the following example, we will create a database, a local table, and a distributed table on the cluster.
Tables are created with the Distributed engine, which routes each insert to a shard according to a sharding expression (this example uses rand(), which spreads rows evenly across shards). Within each shard, ReplicatedMergeTree tables keep the replicas in sync.
Notes:
- '{cluster}' is not a placeholder for you to fill in; ClickHouse automatically substitutes the configured cluster name when the query is executed.
- CODEC (ZSTD) compresses the column data further, which is a good idea for large tables with a lot of data.
- TTL toDateTime(TimeStamp) + INTERVAL 90 DAY deletes data older than 90 days.
- PARTITION BY toYYYYMM(TimeStamp) partitions the data by month.
- ORDER BY (TimeStamp, EventId) sorts the data by timestamp and event id; ClickHouse also builds its primary index on these columns.
- ReplicatedMergeTree replicates the data across the replicas within each shard.
Create a database on the cluster:
CREATE DATABASE IF NOT EXISTS events_database ON CLUSTER '{cluster}';
This database is where you will create your local and distributed tables.
Create a local table on the cluster:
CREATE TABLE IF NOT EXISTS events_database.events_local ON CLUSTER '{cluster}' (
-- timestamp
TimeStamp DateTime64(3) CODEC (Delta, ZSTD),
-- event data
EventId String CODEC (ZSTD),
EventType String CODEC (ZSTD),
EventData String CODEC (ZSTD),
EventSource String CODEC (ZSTD),
EventSeverity String CODEC (ZSTD),
EventStatus String CODEC (ZSTD),
EventTags String CODEC (ZSTD)
) ENGINE = ReplicatedMergeTree('/clickhouse/{installation}/{cluster}/tables/{shard}-{uuid}/{database}/{table}', '{replica}')
PARTITION BY toYYYYMM(TimeStamp)
ORDER BY (TimeStamp, EventId)
TTL toDateTime(TimeStamp) + INTERVAL 90 DAY;
This table is the backing table for the distributed table, and it has a TTL of 90 days, meaning data older than 90 days will be automatically deleted.
Create a distributed table on the cluster:
CREATE TABLE IF NOT EXISTS events_database.events ON CLUSTER '{cluster}'
AS events_database.events_local ENGINE = Distributed('{cluster}', events_database, events_local, rand());
This events_database.events table is what your code will read from and write to.
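A short sketch of day-to-day use (the column values here are purely illustrative):
-- Writes go through the distributed table; ClickHouse routes each row to a shard
INSERT INTO events_database.events
    (TimeStamp, EventId, EventType, EventData, EventSource, EventSeverity, EventStatus, EventTags)
VALUES
    (now64(3), 'evt-0001', 'page_view', '{"path":"/home"}', 'web', 'info', 'ok', 'frontend');

-- Reads also go through the distributed table; the query fans out to every shard
SELECT EventType, count() AS event_count
FROM events_database.events
WHERE TimeStamp >= now() - INTERVAL 1 DAY
GROUP BY EventType
ORDER BY event_count DESC;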
Modifying data in the distributed table
ALTER TABLE events_database.events_local ON CLUSTER '{cluster}' DELETE WHERE EventType = 'error';
This will delete all rows where the EventType is 'error'.
Note: SQL that modifies data must be run against the local table, not the distributed table.
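Mutations like this DELETE run asynchronously in the background. To see whether any are still in flight on a node, query system.mutations:
-- Mutations still pending on this node
SELECT database, table, mutation_id, command, is_done
FROM system.mutations
WHERE is_done = 0;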
High Availability Features
- Automatic Failover: If a replica fails, queries automatically route to healthy replicas
- Data Synchronization: ReplicatedMergeTree ensures eventual consistency across replicas
- Rolling Updates: Cluster can be updated one node at a time without downtime (This would also apply if you ever need to grow a volume)
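Replication health is also visible from SQL. This check, run on any node, lists replicated tables that currently see fewer active replicas than expected (an empty result means every replica is healthy):
SELECT database, table, is_leader, total_replicas, active_replicas
FROM system.replicas
WHERE active_replicas < total_replicas;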
Proxy Configuration
This template comes with two proxies: a direct proxy and a main proxy.
- Direct Proxy - Provides a direct connection to the ClickHouse S1R1 node; use this for running migrations.
- Main Proxy - Provides a round-robin connection to the ClickHouse S{1/2/3}R1 nodes; use this for running queries and inserting data.
Environment Variables
Key environment variables.
On the Direct Proxy:
- CLICKHOUSE_PUBLIC_DIRECT_HTTP_URL - The public HTTP URL of the direct proxy.
- CLICKHOUSE_PRIVATE_DIRECT_HTTP_URL - The private HTTP URL of the direct proxy (a reference variable to the ClickHouse S1R1 node's private domain and port).
- CLICKHOUSE_PUBLIC_DIRECT_TCP_URL - The public TCP URL of the direct proxy.
- CLICKHOUSE_PRIVATE_DIRECT_TCP_URL - The private TCP URL of the direct proxy (a reference variable to the ClickHouse S1R1 node's private domain and port).
On the Main Proxy:
- CLICKHOUSE_PUBLIC_HTTP_URL - The public HTTP URL of the main proxy.
- CLICKHOUSE_PRIVATE_HTTP_URL - The private HTTP URL of the main proxy.
- CLICKHOUSE_PUBLIC_TCP_URL - The public TCP URL of the main proxy.
- CLICKHOUSE_PRIVATE_TCP_URL - The private TCP URL of the main proxy.
On the ClickHouse S1R1 node:
- CH_USER - The user to connect to the database.
- CH_PASSWORD - The automatically generated password to connect to the database.
Note: The TCP and HTTP URLs already have the CH_USER and CH_PASSWORD variables injected into them.
Why Deploy ClickHouse Cluster on Railway?
Railway is a singular platform to deploy your infrastructure stack. Railway will host your infrastructure so you don't have to deal with configuration, while allowing you to vertically scale it.
By deploying a ClickHouse cluster on Railway, you are one step closer to supporting a complete analytical data stack with minimal burden. Host your servers, databases, AI agents, and more on Railway.
Template Content
All services are deployed from the railwayapp-templates/clickhouse-cluster repository:
- ClickHouse S1R1 and ClickHouse S1R2 (shard 1)
- ClickHouse S2R1 and ClickHouse S2R2 (shard 2)
- ClickHouse S3R1 and ClickHouse S3R2 (shard 3)
- ClickHouse Keeper K1, K2, and K3 (coordination)
- Direct Proxy and Main Proxy (load balancing)