What Is RAG (Retrieval-Augmented Generation)?
RAG — Retrieval-Augmented Generation — is an AI architecture that combines information retrieval with language model generation. Rather than asking an LLM to answer a question from its training data alone, a RAG system first searches a connected knowledge base, retrieves the most relevant information, and provides that information to the model as context. The model then generates a response grounded in what was retrieved, not in memory alone.
The key word is 'grounded.' RAG changes the LLM's job from 'recall the answer from training' to 'synthesize an answer from the provided facts.' This seemingly simple architectural shift targets the most critical problem with deploying LLMs in enterprise settings: hallucination. Grounding does not eliminate hallucination, but it reduces it and makes each answer checkable against the retrieved sources.
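The retrieve-then-generate flow described above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the knowledge base, the function names (retrieve, build_prompt, answer), and the word-overlap scoring are all assumptions made for the example. A real system would typically use a search index or vector store for retrieval and call an actual LLM where this sketch takes a generate callable.

```python
import re
from collections import Counter

# Hypothetical in-memory knowledge base (illustrative data only).
# A real deployment would query a search index or vector store instead.
KNOWLEDGE_BASE = [
    "The refund window for annual plans is 30 days from purchase.",
    "Support is available Monday through Friday, 9am to 5pm UTC.",
    "Enterprise customers can enable single sign-on in the admin console.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query, documents, top_k=2):
    """Rank documents by word overlap with the query.

    Word overlap is a toy stand-in for a real retriever
    (keyword search, embeddings, or a hybrid of both).
    """
    query_counts = Counter(tokenize(query))
    scored = []
    for doc in documents:
        doc_counts = Counter(tokenize(doc))
        overlap = sum((query_counts & doc_counts).values())
        scored.append((overlap, doc))
    # Stable sort: ties keep their original knowledge-base order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]


def build_prompt(query, passages):
    """Place the retrieved passages in the prompt as numbered context."""
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )


def answer(query, generate):
    """Retrieve first, then generate from the retrieved context.

    `generate` is any callable that maps a prompt string to a response
    string; it stands in for a call to an LLM.
    """
    passages = retrieve(query, KNOWLEDGE_BASE)
    return generate(build_prompt(query, passages))
```

Calling answer("What is the refund window for annual plans?", generate=my_llm), where my_llm is whatever function wraps the model, sends the model a prompt that already contains the refund-policy passage. Numbering the passages in the prompt is one simple way to let the model cite which source it used, which is what makes a grounded answer auditable.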
RAG has become a widely adopted architecture for enterprise AI applications that require accuracy, auditability, and access to current or proprietary information.