Knowledge · RAG glossary

RAG, simply explained

The key terms around Retrieval-Augmented Generation, explained briefly and without jargon, for anyone who wants to follow along without a computer-science degree.

RAG (Retrieval Augmented Generation)
A method where an AI looks things up in your own documents before answering. The answer is grounded in your knowledge, not just the model's training, and can be backed by a source.
Embedding
A numeric representation of text that captures meaning. Similar content sits close together in the number space, so the system finds the right thing even when different words were used.
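For readers who want to see "close together" in practice, here is a minimal Python sketch. The three-number vectors are made up for illustration; a real embedding model produces vectors with hundreds of numbers. Cosine similarity is one common way to measure closeness, used here as an assumption rather than as the only option.

```python
import math

def cosine_similarity(a, b):
    """Return how closely two vectors point in the same direction (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up toy "embeddings" (assumption: not the output of any real model).
toy_embeddings = {
    "How do I reset my password?": [0.9, 0.1, 0.0],
    "I forgot my login credentials": [0.8, 0.2, 0.1],
    "Opening hours of the cafeteria": [0.0, 0.1, 0.9],
}

question = toy_embeddings["How do I reset my password?"]
for text, vector in toy_embeddings.items():
    # A higher score means the text is closer in meaning to the question.
    print(f"{cosine_similarity(question, vector):.2f}  {text}")
```

The two login-related sentences share almost no words, yet their vectors score far higher against each other than against the cafeteria sentence.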
Chunking
Splitting documents into smaller passages before making them searchable. How large the passages are and where they are cut strongly affect answer quality. More in the article.
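One simple way to cut a document, sketched in Python: fixed-size passages counted in words, with a small overlap so that a sentence split at a boundary still appears whole in one of the passages. The function name `chunk_words` and the default sizes are illustrative assumptions; real systems often cut along headings, paragraphs, or sentences instead.

```python
def chunk_words(text, chunk_size=50, overlap=10):
    """Split text into passages of up to chunk_size words, each sharing `overlap` words with the previous one."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("chunk_size must be positive and overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window moves forward each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this passage already reaches the end of the text
    return chunks

sample = " ".join(f"w{i}" for i in range(12))
for passage in chunk_words(sample, chunk_size=5, overlap=2):
    print(passage)
```

Larger passages carry more context but are less precise to search; smaller ones are precise but can lose the surrounding meaning.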
Vector database
A store for embeddings that quickly finds the passages most similar to a question. Examples are pgvector (an extension for PostgreSQL), Qdrant, and Vespa.ai.
Retrieval
The step where the system pulls the matching passages from your documents for a question. The quality of this step largely decides whether the answer can be right: if the relevant passage is not found, the model has nothing correct to build on.
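As a rough illustration, retrieval can be pictured as "score every passage against the question and keep the best few". The sketch below does exactly that in plain Python. The passages, their vectors, and the function name `retrieve` are invented for this example; a production system would get the vectors from an embedding model and would not compare against every passage one by one.

```python
import heapq
import math

def similarity(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def retrieve(question_vector, index, k=2):
    """Return the k passages whose vectors are most similar to the question, best first."""
    best = heapq.nlargest(k, index, key=lambda item: similarity(question_vector, item[1]))
    return [passage for passage, _vector in best]

# Hypothetical index of (passage, made-up vector) pairs.
index = [
    ("Reset your password under Settings > Security.", [0.9, 0.1, 0.0]),
    ("The cafeteria opens at 8 am.", [0.0, 0.1, 0.9]),
    ("Contact IT if you are locked out of your account.", [0.7, 0.3, 0.1]),
]

question_vector = [0.8, 0.2, 0.0]  # stands in for "I can't log in anymore"
for passage in retrieve(question_vector, index, k=2):
    print(passage)
```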
Reranking
A second, more careful sorting step that rescores the found passages against the question and reorders them before the AI answers. It pushes the passages that really fit to the top.
Eval / Ragas
Systematically measuring answer quality against a test set instead of guessing. Ragas is a common open-source tool for this.
Hallucination
When an AI states something plausible but false. Good RAG lowers the risk because the answer is tied to cited sources.
Context window
The amount of text a language model can process at once, measured in tokens (word pieces). It limits how many retrieved passages can feed into a single answer.
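The limit can be pictured as a budget: passages are added in ranked order until the next one no longer fits. The Python sketch below assumes a crude word count as a stand-in for tokens (word pieces); real models count with their own tokenizer, and the question and the answer also take up room. The function name `fit_passages` and the budget numbers are illustrative.

```python
def fit_passages(passages, max_tokens, count_tokens=lambda text: len(text.split())):
    """Keep passages in ranked order until the next one would exceed the budget."""
    selected = []
    used = 0
    for passage in passages:
        cost = count_tokens(passage)  # rough estimate: one word counts as one token
        if used + cost > max_tokens:
            break  # stop here so the ranking order is preserved
        selected.append(passage)
        used += cost
    return selected

ranked = [
    "Reset your password under Settings.",
    "Contact IT if you are locked out.",
    "The cafeteria opens at eight.",
]
print(fit_passages(ranked, max_tokens=12))
```

Because only the top of the ranking fits, the order in which passages arrive matters as much as whether they were found at all.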