Knowledge · RAG glossary
RAG, simply explained
The key terms around Retrieval-Augmented Generation, short and without jargon. For everyone who wants to follow along without a computer-science degree.
- RAG (Retrieval-Augmented Generation)
- A method where an AI looks things up in your own documents before answering. The answer is grounded in your knowledge, not just the model's training, and can be backed by a source.
- Embedding
- A numeric representation of text that captures meaning. Similar content sits close together in the number space, so the system finds the right thing even when different words were used.
- Chunking
- Splitting documents into smaller passages before making them searchable. How large the passages are and where they are cut strongly affects answer quality. More in the article.
- Vector database
- A store for embeddings that quickly finds the passages most similar to a question. Examples are pgvector (an extension for PostgreSQL), Qdrant, and Vespa.ai.
- Retrieval
- The step where the system pulls the matching passages from your documents for a question. If this step misses the relevant passages, the AI has nothing correct to build its answer on.
- Reranking
- A second sorting step that re-scores the found passages against the question before the AI answers. It pushes the passages that really fit to the top.
- Hybrid search
- The combination of classic keyword search and semantic vector search. Together they find more than either alone: exact terms and related meanings.
- Eval / Ragas
- Systematically measuring answer quality against a test set instead of judging it by gut feeling. Ragas is a common open-source tool for this.
- Hallucination
- When an AI states something plausible but false. Good RAG lowers the risk because the answer is tied to cited sources.
- Context window
- The amount of text, counted in tokens (word pieces), that a language model can process at once. It limits how many found passages can feed into a single answer.
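
For readers who would like to see how some of these terms fit together, here is a minimal sketch in Python. It is a toy under stated assumptions, not how a production system works. The "embedding" is just a word count, whereas a real embedding comes from a trained model and captures meaning rather than shared words. The context window is counted in words here instead of tokens. All function names and the sample text are made up for illustration.

```python
import math
import re
from collections import Counter


def chunk(text, sentences_per_chunk=2):
    """Chunking: cut a document into passages of a few sentences each."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [
        " ".join(sentences[i:i + sentences_per_chunk])
        for i in range(0, len(sentences), sentences_per_chunk)
    ]


def embed(text):
    """Toy stand-in for an embedding: how often each word occurs.

    Assumption for illustration only. A real embedding model would also
    place passages with different words but similar meaning close together.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Similarity of two word-count vectors (1.0 = same direction, 0.0 = nothing shared)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, passages, top_k=2):
    """Retrieval: return the passages most similar to the question, best first."""
    q = embed(question)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:top_k]


def fit_context(passages, max_words=20):
    """Context window: keep passages in ranked order until the word budget is used up."""
    kept, used = [], 0
    for p in passages:
        n = len(p.split())
        if used + n > max_words:
            break
        kept.append(p)
        used += n
    return kept


# Made-up sample document and question.
document = (
    "Invoices are due within 30 days of receipt. "
    "Late payments incur a fee of 2 percent. "
    "Returns are accepted within 14 days with the original receipt. "
    "Refunds are paid to the original payment method."
)
question = "When are invoices due?"

passages = chunk(document)
context = fit_context(retrieve(question, passages))
print(context)
```

In a real RAG setup, the passages in `context` would be handed to the language model together with the question, so the answer can point back to them as its source. A vector database takes over the job of `retrieve` once there are too many passages to compare one by one.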