What are the system requirements?

Railway handles all the infrastructure requirements. You only need a web browser to deploy and manage your application.

Is this template free to use?

Yes, this template is free to use within your existing Railway account.

Can I customize this template?

Yes, you can fully customize this template. After deployment, you will have full control of what you've deployed in your Railway project.

How do I get support for this template?

You can get support through our community forum: postgre-sql-with-pg-duckdb-and-pgvectorsc-6e3f6384, or join our Discord community for more assistance.

Deploy and Host PostgreSQL with pg_duckdb and pgvectorscale on Railway

PostgreSQL 17 combining pg_duckdb's analytical power with pgvectorscale's AI-optimized vector search. Run fast OLAP queries on data lakes while performing semantic search and embeddings operations—all in one database without separate infrastructure.

About Hosting PostgreSQL with pg_duckdb and pgvectorscale

This template combines two powerful PostgreSQL extensions into a unified database solution. pg_duckdb embeds DuckDB's analytical engine for lightning-fast queries on S3 data lakes, Parquet files, and MotherDuck datasets. pgvectorscale adds StreamingDiskANN indexing for high-performance vector similarity search, extending pgvector's capabilities. Together, they enable hybrid workloads: analytical queries on massive datasets alongside AI-powered semantic search and embeddings operations. The container is built from source using multi-stage builds, includes automatic extension initialization, and supports optional S3 configuration. Perfect for modern data applications requiring both analytical processing and AI capabilities without managing multiple database systems.

Common Use Cases

AI Analytics Platforms: Combine vector search for RAG/semantic queries with analytical queries on user behavior and metrics
Intelligent Data Lakes: Query S3/Parquet data with DuckDB while performing similarity search on document embeddings
Real-time AI Dashboards: Run analytical aggregations alongside nearest-neighbor searches for recommendations
Hybrid Search Systems: Blend traditional analytics (trends, aggregations) with semantic search (embeddings, similarity)
MLOps Pipelines: Store model embeddings and analyze training metrics/performance data in a single database

Dependencies for PostgreSQL with pg_duckdb and pgvectorscale Hosting

PostgreSQL 17: Latest major version with enhanced performance
Build Tools: Rust toolchain, PostgreSQL development headers, and build essentials for compiling extensions from source

Deployment Dependencies

pg_duckdb Documentation - DuckDB analytical engine integration
pgvector Documentation - Core vector extension
pgvectorscale Documentation - StreamingDiskANN indexing
DuckDB Extensions - Additional DuckDB capabilities

Implementation Details

Environment Variables:

# PostgreSQL Configuration
POSTGRES_PASSWORD=your_secure_password
POSTGRES_USER=postgres
POSTGRES_DB=your_database

# S3 Configuration (optional, for pg_duckdb)
S3_ACCESS_KEY_ID=your_access_key
S3_SECRET_ACCESS_KEY=your_secret_key
S3_REGION=us-east-1
S3_ENDPOINT=optional_custom_endpoint
S3_URL_STYLE=path

# DuckDB Extensions (optional, comma-separated)
DUCKDB_EXTENSIONS=httpfs,parquet,json

Example Usage - Analytical Queries with pg_duckdb:

-- Query Parquet files from S3
SELECT * FROM read_parquet('s3://bucket/analytics/*.parquet')
WHERE event_date &gt;= '2024-01-01';

-- Aggregate data from remote sources
SELECT
    date_trunc('month', event_time) as month,
    COUNT(*) as events
FROM read_csv('https://example.com/data.csv')
GROUP BY month;

Example Usage - Vector Search with pgvectorscale:

import psycopg2
from pgvector.psycopg2 import register_vector

conn = psycopg2.connect(DATABASE_PUBLIC_URL)
register_vector(conn)

cur = conn.cursor()

# Create table with vector embeddings
cur.execute("""
    CREATE TABLE documents (
        id SERIAL PRIMARY KEY,
        content TEXT,
        embedding VECTOR(1536)
    )
""")

# Create StreamingDiskANN index for scalable search
cur.execute("""
    CREATE INDEX ON documents
    USING diskann (embedding)
""")

# Perform similarity search
cur.execute("""
    SELECT id, content
    FROM documents
    ORDER BY embedding &lt;-&gt; %s
    LIMIT 10
""", (query_embedding,))

Hybrid Query Example:

-- Combine analytical aggregation with vector search
WITH similar_docs AS (
    SELECT id, content
    FROM documents
    ORDER BY embedding &lt;-&gt; '[0.1, 0.2, ...]'::vector
    LIMIT 100
)
SELECT
    date_trunc('day', created_at) as date,
    COUNT(*) as similar_doc_count
FROM similar_docs
JOIN events ON events.doc_id = similar_docs.id
GROUP BY date
ORDER BY date;

Volume Configuration:

Mount a volume at /var/lib/postgresql/data for persistent storage of both analytical data and vector embeddings.

Why Deploy PostgreSQL with pg_duckdb and pgvectorscale on Railway?

Railway is a singular platform to deploy your infrastructure stack. Railway will host your infrastructure so you don't have to deal with configuration, while allowing you to vertically and horizontally scale it.

By deploying PostgreSQL with pg_duckdb and pgvectorscale on Railway, you are one step closer to supporting a complete full-stack application with minimal burden. Host your servers, databases, AI agents, and more on Railway.