Deploy Qdrant | High-Performance Vector Database

Self-Host Qdrant - Semantic Search, Recommendation Systems, and RAG

Deploy and Host Qdrant

Deploy a self-hosted Qdrant vector database on Railway in one click. This template provisions the official qdrant/qdrant Docker image with persistent storage pre-configured at /qdrant/storage, so your vector data survives restarts without any manual setup.

For a quickstart, refer to the following guide: Guide

To access the built-in web dashboard, open your-railway-url.app/dashboard: Dashboard

About Hosting Qdrant

Qdrant is an open-source, high-performance vector database and similarity search engine written in Rust. It's purpose-built for storing, indexing, and querying high-dimensional vector embeddings — the kind generated by OpenAI, Cohere, or sentence-transformers.

Key features:

  • HNSW indexing for sub-millisecond ANN (approximate nearest neighbour) search
  • Rich payload filtering — combine vector search with structured metadata conditions
  • Scalar, Product, and Binary Quantization — reduce memory usage up to 40x
  • REST and gRPC APIs with OpenAPI v3 spec; official clients for Python, TypeScript, Rust, Go
  • Apache 2.0 licensed — no vendor lock-in

Why Deploy Qdrant on Railway

Railway handles the infrastructure so you can focus on building. Compared to managing Qdrant on a raw VPS, Railway gives you automatic TLS, environment variable management, volume persistence, and one-click redeployments — all from a clean UI. Private networking means your app services can reach Qdrant internally without exposing it publicly. The free tier is enough to prototype, and scaling up is a slider, not a support ticket.

Common Use Cases

  • Retrieval-Augmented Generation (RAG): Store document embeddings and retrieve context for LLM prompts with LangChain or LlamaIndex
  • Semantic search: Power search that understands meaning, not just keywords — across products, articles, or support docs
  • Recommendation systems: Use Qdrant's Recommendation API to surface personalised suggestions based on vector similarity
  • AI agents: Give agents long-term memory by persisting and querying embeddings across sessions
  • Anomaly detection: Compare incoming data vectors against known-good baselines in security or monitoring pipelines

Dependencies for Qdrant

Qdrant is self-contained — no external database or broker required.

  • Your application service (any language) connecting to Qdrant via REST or gRPC

Environment Variables Reference

| Variable | Description | Required |
| --- | --- | --- |
| PORT | HTTP REST API port Qdrant listens on. Defaults to 6333. | Yes |
| QDRANT__SERVICE__API_KEY | API key clients must send in the api-key header. Recommended before exposing the instance publicly. | No |

Deployment Dependencies

  • Docker image: qdrant/qdrant (GitHub: qdrant/qdrant)
  • Persistent volume mounted at /qdrant/storage — already configured in this template
  • No other services required

Getting Started with Qdrant After Deployment

Once your Railway deployment is live, the REST API is available at your Railway-assigned public URL on port 6333. Test it immediately:

curl https://your-app.railway.app/collections \
  -H "api-key: YOUR_API_KEY"
# Returns: {"result":{"collections":[]},"status":"ok"}

Create your first collection and insert a vector using the Python client:

from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct

client = QdrantClient(
    url="https://your-app.railway.app",
    api_key="YOUR_API_KEY"
)

client.create_collection(
    collection_name="docs",
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE)
)

client.upsert(
    collection_name="docs",
    points=[PointStruct(id=1, vector=[0.1] * 1536, payload={"text": "hello"})]
)

Install with pip install qdrant-client. From here, connect your embedding pipeline and start querying.


Hardware Requirements for Qdrant

Qdrant's memory usage scales with your vector dimensions and collection size. As a baseline:

  • Development / prototyping: 1 vCPU, 512MB RAM — fine for collections under 100k vectors
  • Small production workloads: 2 vCPU, 2–4GB RAM — handles millions of 768–1536-dimension vectors with quantization enabled
  • High-throughput production: 4+ vCPU, 8GB+ RAM — for unquantized large collections or heavy concurrent query loads

Enable Binary or Scalar Quantization to cut RAM requirements by 4–40x on large datasets.


Qdrant vs Competitors

| Feature | Qdrant | Pinecone | Weaviate | Chroma | pgvector |
| --- | --- | --- | --- | --- | --- |
| Open source | ✅ Apache 2.0 | ❌ Proprietary | ✅ BSD | ✅ Apache 2.0 | ✅ PostgreSQL ext. |
| Self-hostable | ✅ | ❌ | ✅ | ✅ | ✅ |
| Written in | Rust | Proprietary | Go | Python | C |
| Payload filtering | Native, fast | Basic | GraphQL | Python-side | SQL WHERE |
| Quantization | Scalar, Product, Binary | — | PQ | — | — |
| gRPC support | ✅ | ✅ | ✅ | ❌ | ❌ |

Qdrant vs Pinecone: Pinecone is fully managed and easier to start with, but costs scale fast and you have no control over your data. Self-hosted Qdrant on Railway gives equivalent search quality at a fraction of the cost.

Qdrant vs pgvector: If you're already on Postgres and have under 1M vectors with simple filtering needs, pgvector works. For dedicated vector workloads, larger collections, or quantization, Qdrant is significantly faster.


Self-Hosting Qdrant (Outside Railway)

To run Qdrant on your own machine or VPS using Docker:

docker pull qdrant/qdrant

docker run -d -p 6333:6333 \
  -e QDRANT__SERVICE__API_KEY=your-secret-key \
  -v $(pwd)/qdrant_storage:/qdrant/storage \
  qdrant/qdrant

Access the REST API at http://localhost:6333. Data persists in ./qdrant_storage. For production VPS deployments, wrap this in Docker Compose and add a reverse proxy with TLS.
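
For the Docker Compose route, a minimal docker-compose.yml sketch covering the same setup (the API key value is a placeholder, and a TLS reverse proxy would be added as a separate service):

```yaml
services:
  qdrant:
    image: qdrant/qdrant
    restart: unless-stopped
    ports:
      - "6333:6333"   # REST API
      - "6334:6334"   # gRPC
    environment:
      QDRANT__SERVICE__API_KEY: your-secret-key
    volumes:
      - qdrant_storage:/qdrant/storage

volumes:
  qdrant_storage:
```

Using a named volume instead of a bind mount keeps the data lifecycle managed by Docker, and `restart: unless-stopped` brings Qdrant back up after host reboots.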


Is Qdrant Free to Use?

Qdrant is fully open source under the Apache 2.0 license — free to self-host with no usage limits or feature gates. Qdrant Cloud (the managed offering) has a free tier with limited capacity; paid plans start around $25/month. On Railway, you pay only for the infrastructure you use — typically a few dollars per month for small workloads.


FAQ

Is the setup process complicated? No. This Railway template requires minimal configuration — deploy, optionally set an API key, and your Qdrant instance is live. The volume is pre-mounted so data persistence works out of the box.

How do I run Qdrant with Docker locally? Pull the image with docker pull qdrant/qdrant, then run it with port mapping and a volume for persistence: docker run -p 6333:6333 -v $(pwd)/qdrant_storage:/qdrant/storage qdrant/qdrant. The API is then available at localhost:6333.

What is Qdrant used for? Qdrant powers advanced semantic search, recommendation systems, retrieval-augmented generation (RAG), data analysis, anomaly detection, and AI agent memory — any use case that involves finding similar vectors in high-dimensional space.

Should I use Qdrant Cloud or self-host? Qdrant Cloud is convenient but expensive at scale. Self-hosting on Railway gives you full control over your data, no vendor lock-in, and predictable infrastructure costs — with far less operational overhead than managing your own VPS.

How do I secure my Qdrant instance? Set the QDRANT__SERVICE__API_KEY environment variable before your deployment goes public. All client requests must then include an api-key header. On Railway, you can also restrict access to internal private networking so only your app services can reach Qdrant directly.

