
What Is RAG (Retrieval-Augmented Generation)?

RAG (Retrieval-Augmented Generation) is a technique that combines a large language model with an external knowledge base. Instead of relying solely on what the model memorized during training, RAG retrieves relevant documents and feeds them into the prompt.

How RAG Works

  1. Query: The user asks a question
  2. Retrieve: The system searches a knowledge base (documents, databases, APIs) for relevant information
  3. Augment: The retrieved content is injected into the model’s prompt as context
  4. Generate: The model answers using both its training knowledge and the retrieved documents
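
The four steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: the in-memory `DOCS` list, the bag-of-words `cosine` scoring, and the `fake_llm` stand-in are all hypothetical names invented for this example. A real system would typically swap in an embedding model, a vector index, and an actual model call.

```python
import math
import re
from collections import Counter

# Hypothetical in-memory knowledge base (illustrative content only).
DOCS = [
    "The refund window is 30 days from the date of purchase.",
    "Support is available by email on weekdays.",
    "The desktop app requires macOS 14 or later.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, k=1):
    """Step 2: rank documents by similarity to the query, keep the top k."""
    q = Counter(tokenize(query))
    ranked = sorted(docs, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True)
    return ranked[:k]


def augment(query, passages):
    """Step 3: inject the retrieved passages into the prompt as context."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


def fake_llm(prompt):
    """Stand-in for a real model call (assumption: any prompt -> text function)."""
    return f"[model response to a {len(prompt)}-character prompt]"


def answer(query, docs=DOCS, llm=fake_llm):
    """Steps 1-4: take a query, retrieve, augment, and generate."""
    passages = retrieve(query, docs)
    prompt = augment(query, passages)
    return llm(prompt), passages


if __name__ == "__main__":
    reply, sources = answer("How long is the refund window?")
    print(reply)
    print("Sources:", sources)
```

Returning the retrieved passages alongside the reply is a deliberate choice here: it lets the caller show which documents the answer was grounded in.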

Why RAG Matters

  • Reduces hallucinations: The model can ground its answers in real, retrieved documents, which lowers (but does not eliminate) the risk of made-up claims
  • Stays current: The knowledge base can be updated without retraining the model
  • Domain-specific: Works with your own private data — internal docs, codebases, research papers
  • Cost-effective: Usually cheaper than fine-tuning a model on your data, since no training run is required
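
The "stays current" point can be made concrete with a toy index. The sketch below assumes a hypothetical `KnowledgeBase` class that ranks documents by the number of words they share with the query; the class, its methods, and the sample release notes are all invented for illustration. The only thing that changes between the two searches is the data, with no model or training step involved.

```python
import re


def words(text):
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class KnowledgeBase:
    """Toy keyword index (hypothetical; real systems often use vector search)."""

    def __init__(self):
        self._docs = []  # list of (original text, word set) pairs

    def add(self, text):
        """Index a new document; it is searchable immediately."""
        self._docs.append((text, words(text)))

    def search(self, query):
        """Return the document sharing the most words with the query, or None."""
        q = words(query)
        best = max(self._docs, key=lambda d: len(q & d[1]), default=None)
        if best is None or not (q & best[1]):
            return None
        return best[0]


kb = KnowledgeBase()
kb.add("Version 1.0 shipped with offline mode.")
question = "What changed in version 2.0?"

# Before the update, the closest match is the older release note.
before = kb.search(question)

# Updating the knowledge base is just adding a document.
kb.add("Version 2.0 added shared workspaces.")
after = kb.search(question)
```

After the second `add`, the same question retrieves the newer note, so whatever model consumes the retrieved text sees up-to-date context without being retrained.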

RAG vs. Fine-Tuning

Aspect         | RAG                  | Fine-Tuning
Setup cost     | Low                  | High
Data freshness | Real-time            | Static (training snapshot)
Accuracy       | High (cites sources) | Variable
Best for       | Facts, docs, Q&A     | Style, tone, specialized tasks

RAG in Practice

Many AI applications use RAG behind the scenes — from customer support bots that search help docs to coding assistants that reference your codebase. Elvean’s MCP server support enables RAG-like workflows by connecting models to external data sources.

Elvean brings all these concepts together in one native Mac app — local models, cloud APIs, agentic tools, and more.
