
Deploy LlamaIndex Apps
Sample LlamaIndex apps for chatting with PDFs and summarizing URLs.
Deploy and Host LlamaIndex Apps on Railway
LlamaIndex is an open-source framework for building LLM-powered applications over your own data. This template deploys two Streamlit apps — one for chatting with PDFs using LlamaParse, and one for summarizing URLs using Google Gemini — each demonstrating a core LlamaIndex pattern: document ingestion, indexing, and retrieval-augmented generation (RAG).
About Hosting LlamaIndex Apps
Both apps are single-service Streamlit deployments with no persistent storage or database required. The PDF chat app uses LlamaParse to parse uploaded PDFs and OpenAI to answer queries over the indexed content. The URL summarizer uses Google Gemini to generate summaries of web pages fetched at runtime. API keys are entered via the sidebar at runtime — nothing is stored server-side. Railway handles HTTPS and port binding automatically via the railway.toml start command in each app.
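For reference, a minimal railway.toml along these lines binds Streamlit to the PORT Railway injects; the entrypoint filename here is a placeholder, not necessarily the one used in the repository:

```toml
# Sketch of a Railway start command for a Streamlit service.
# "app.py" is illustrative; each app in the template has its own entrypoint.
[deploy]
startCommand = "streamlit run app.py --server.port $PORT --server.address 0.0.0.0"
```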
Common Use Cases
- PDF Q&A — upload any PDF and ask natural language questions against its content; LlamaParse handles complex layouts, tables, and multi-column text that standard PDF parsers struggle with, before indexing with OpenAI embeddings for retrieval
- URL summarization — paste any public URL and get a concise 200-250 word summary generated by Gemini 2.5 Flash, useful for quickly digesting articles, documentation, or research papers
- LlamaIndex RAG starter — use either app as a reference implementation for building your own LlamaIndex-powered Streamlit tools on Railway, with the VectorStoreIndex and SummaryIndex patterns already wired up
Dependencies for LlamaIndex Apps Hosting
- LlamaIndex (llama-index, llama-index-core) — core RAG framework
- LlamaCloud (llama-cloud-services) — LlamaParse API for PDF parsing; requires a LlamaCloud API key
- OpenAI (llama-index-llms-openai, llama-index-embeddings-openai) — LLM and embeddings for the PDF chat app; requires an OpenAI API key
- Google Gemini (llama-index-llms-google-genai) — LLM for the URL summarizer app; requires a Google AI Studio API key
- No database or persistent volume required
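Taken together, a requirements file for the two apps would look roughly like the following; versions are left unpinned here, and the web reader package is an assumption based on the SimpleWebPageReader usage described below (check the repository for the exact list):

```text
streamlit
llama-index
llama-index-core
llama-cloud-services
llama-index-llms-openai
llama-index-embeddings-openai
llama-index-llms-google-genai
# assumed: provides SimpleWebPageReader in recent llama-index versions
llama-index-readers-web
```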
Deployment Dependencies
- LlamaIndex documentation
- LlamaCloud API key
- OpenAI API key
- Google AI Studio API key
- alphasecio/llama-index GitHub repository
- alphasec guide: Chat with PDF using LlamaIndex and LlamaParse
- alphasec guide: Blinkist for URLs with LlamaIndex and Google Gemini
Implementation Details
Chat with PDF — uses LlamaParse from llama-cloud-services to parse the uploaded PDF into markdown, builds a VectorStoreIndex from the parsed content, and queries it with similarity_top_k=5 and tree_summarize response mode. The document is parsed and indexed once per session; subsequent queries reuse the cached index without re-uploading.
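A minimal sketch of that flow, assuming the current llama-cloud-services and llama-index package layouts; key values, file names, and the sample query are illustrative rather than taken from the repository:

```python
from llama_cloud_services import LlamaParse
from llama_index.core import VectorStoreIndex, Settings
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI

openai_api_key = "sk-..."       # entered via the Streamlit sidebar in the app
llamacloud_api_key = "llx-..."  # LlamaCloud key used by LlamaParse

# OpenAI handles both answer generation and embeddings for retrieval.
Settings.llm = OpenAI(api_key=openai_api_key)
Settings.embed_model = OpenAIEmbedding(api_key=openai_api_key)

# Parse the uploaded PDF into markdown with LlamaParse.
parser = LlamaParse(api_key=llamacloud_api_key, result_type="markdown")
documents = parser.load_data("uploaded.pdf")

# Index once per session, then reuse the index for every follow-up question.
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(
    similarity_top_k=5, response_mode="tree_summarize"
)
answer = query_engine.query("What are the key findings in this document?")
```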
Summarize URL — uses SimpleWebPageReader to fetch and extract plain text from the provided URL, builds a SummaryIndex, and queries it with a fixed prompt for a 200-250 word summary using Gemini 2.5 Flash.
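A comparable sketch for the summarizer, assuming SimpleWebPageReader comes from the llama-index-readers-web package and that the Gemini 2.5 Flash mention maps to the "gemini-2.5-flash" model string; the URL and prompt are placeholders:

```python
from llama_index.core import SummaryIndex, Settings
from llama_index.llms.google_genai import GoogleGenAI
from llama_index.readers.web import SimpleWebPageReader

google_api_key = "AIza..."  # Google AI Studio key, entered via the sidebar in the app

# Use Gemini 2.5 Flash for summarization.
Settings.llm = GoogleGenAI(model="gemini-2.5-flash", api_key=google_api_key)

# Fetch the page and reduce it to plain text.
documents = SimpleWebPageReader(html_to_text=True).load_data(
    ["https://example.com/article"]
)

# A SummaryIndex passes the whole document to the LLM instead of retrieving top-k chunks.
index = SummaryIndex.from_documents(documents)
query_engine = index.as_query_engine()
summary = query_engine.query("Summarize the content of this page in 200-250 words.")
```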
API keys are entered via the sidebar at runtime and stored in st.session_state for the duration of the session. They are never persisted to disk or logged.
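In Streamlit terms, that pattern looks roughly like the following; the widget labels and session-state key names are illustrative:

```python
import streamlit as st

with st.sidebar:
    # Masked inputs; values live only in st.session_state for this browser session.
    st.session_state["llamacloud_api_key"] = st.text_input("LlamaCloud API key", type="password")
    st.session_state["openai_api_key"] = st.text_input("OpenAI API key", type="password")

# Refuse to run until keys are provided; nothing is written to disk or logged.
if not (st.session_state["llamacloud_api_key"] and st.session_state["openai_api_key"]):
    st.warning("Enter your API keys in the sidebar to continue.")
    st.stop()
```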
Why Deploy LlamaIndex Apps on Railway?
Railway is a singular platform to deploy your infrastructure stack. Railway will host your infrastructure so you don't have to deal with configuration, while allowing you to vertically and horizontally scale it.
By deploying LlamaIndex Apps on Railway, you are one step closer to supporting a complete full-stack application with minimal burden. Host your servers, databases, AI agents, and more on Railway.
Template Content
chat-with-pdf
alphasecio/llama-index
summarize-url
alphasecio/llama-index