
RAG, simply explained

The key terms around Retrieval-Augmented Generation, explained briefly and without jargon, for anyone who wants to follow along without a computer-science degree.

RAG (Retrieval-Augmented Generation): A method where an AI looks things up in your own documents before answering. The answer is grounded in your knowledge, not just the model's training, and can be backed by a source.
Embedding: A numeric representation of text that captures meaning. Similar content sits close together in the number space, so the system finds the right thing even when different words were used.
Chunking: Splitting documents into smaller passages before making them searchable. How large the passages are and where they are cut strongly affect answer quality.
Vector database: A store for embeddings that quickly finds the passages most similar to a question. Examples include pgvector (an extension for PostgreSQL), Qdrant, and Vespa.ai.
Retrieval: The step where the system pulls the matching passages from your documents for a question. If the right passages are not found here, the answer cannot be right, so the quality of this step largely decides the quality of the answer.
Reranking: A second sorting step that reorders the found passages by true relevance before the AI answers. It pushes the passages that really fit to the top.
Hybrid search: The combination of classic keyword search and semantic vector search. Together they find more than either alone: exact terms and related meanings.
Eval / Ragas: Systematically measuring answer quality against a test set instead of guessing. Ragas is a common tool for this.
Hallucination: When an AI states something plausible but false. Good RAG lowers the risk because the answer is tied to cited sources.
Context window: The amount of text a language model can process at once. It limits how many found passages can feed into a single answer.
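
For readers who like to see the terms in action, here is a minimal sketch of how chunking, embedding, retrieval, and the context window fit together. It is a toy under stated assumptions, not how any particular product works: the function names (chunk, embed, cosine, retrieve, fit_context) are made up for this illustration, the "embedding" is a plain word count that only matches identical words (a real embedding model also matches related meanings), and the context budget is counted in words rather than in the tokens a real model uses.

```python
import math
import re
from collections import Counter


def chunk(text, max_words=12):
    """Chunking: split a text into passages of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy stand-in for an embedding (assumption): lowercase word counts.

    A real embedding model turns meaning into numbers; this one only counts words.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Similarity of two word-count vectors: 1.0 means same direction, 0.0 no overlap."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, passages, top_k=2):
    """Retrieval: return the top_k passages most similar to the question."""
    q = embed(question)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:top_k]


def fit_context(passages, max_words=20):
    """Context window: keep passages in ranked order while they fit the word budget."""
    kept, used = [], 0
    for passage in passages:
        n = len(passage.split())
        if used + n > max_words:
            break
        kept.append(passage)
        used += n
    return kept


if __name__ == "__main__":
    passages = [
        "Invoices are due within 30 days.",
        "Refunds are processed within 14 days.",
        "Our office closes on public holidays.",
    ]
    found = retrieve("When are refunds processed?", passages, top_k=2)
    for passage in fit_context(found, max_words=12):
        print(passage)
```

In a real system the passages kept by fit_context would be placed into the prompt together with the question, and a vector database would replace the sorted call so that the search stays fast across many documents.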