
How to Use Knowledge Base (RAG)
What is Knowledge Base?
Knowledge Base is a library of files and URLs from which your AI bot can retrieve information. Drawing on this library helps the bot give more accurate and helpful answers. Builders can upload and manage files here.
It uses a framework called Retrieval Augmented Generation (RAG).

What is Retrieval Augmented Generation (RAG)?
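Retrieval Augmented Generation (RAG) lets large language models (LLMs), which are trained on general knowledge from the Internet, draw on your organization’s data and knowledge. Relevant passages are retrieved from the Knowledge Base and passed to the LLM together with the user prompt.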
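One common way to give an LLM access to retrieved data is to place the retrieved passages in the prompt ahead of the user's question. The sketch below illustrates that idea only. The function name `build_prompt` and the prompt wording are assumptions made for this example, not the product's actual implementation.

```python
def build_prompt(question, passages):
    """Place retrieved passages ahead of the user's question (hypothetical helper)."""
    # Number each passage so the model can tell them apart.
    context = "\n\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


prompt = build_prompt(
    "How many days of annual leave do I get?",
    ["Annual leave is 21 days per year."],
)
print(prompt)
```

The resulting string is what would be sent to the LLM, so the model answers from the supplied context instead of relying only on its training data.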


Knowledge Base Retrieval Types
Top K: The number of highest-ranked chunks to retrieve. A higher K value retrieves more chunks, which can improve accuracy but may slow down response time and takes up more of the context window.
Retrieval type: Determines how information is retrieved. Select one of the following options from the dropdown:
- 'Chunk' retrieves specific sections that are the most similar to the user prompt.
- 'Neighbor' retrieves the chunks that are most similar to the user prompt, plus 1 chunk before and 1 chunk after each of them.
- 'Document' retrieves the entire document for context.
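The three retrieval types and the Top K setting can be sketched on a toy corpus as follows. This is an illustration under stated assumptions, not the product's implementation: the `Chunk` class, the `retrieve` function, and the sample data are hypothetical, and a simple word-overlap cosine score stands in for the embedding similarity a real system would likely use. The sketch also assumes that 'Document' returns every chunk of each document that contains a top-ranked chunk.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    doc_id: str
    index: int  # position of the chunk within its document
    text: str


def similarity(a, b):
    """Cosine similarity over word counts (stand-in for embedding similarity)."""
    va = Counter(re.findall(r"[a-z0-9]+", a.lower()))
    vb = Counter(re.findall(r"[a-z0-9]+", b.lower()))
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(
        sum(v * v for v in vb.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, chunks, top_k=1, mode="chunk"):
    # Rank all chunks by similarity to the query and keep the top K hits.
    ranked = sorted(chunks, key=lambda c: similarity(query, c.text), reverse=True)
    hits = ranked[:top_k]
    if mode == "chunk":
        return hits
    if mode == "neighbor":
        # Add 1 chunk before and 1 chunk after each hit, within the same document.
        by_pos = {(c.doc_id, c.index): c for c in chunks}
        keys = {(h.doc_id, h.index + off) for h in hits for off in (-1, 0, 1)}
        selected = [by_pos[k] for k in keys if k in by_pos]
    elif mode == "document":
        # Assumption: return every chunk of each document that contains a hit.
        doc_ids = {h.doc_id for h in hits}
        selected = [c for c in chunks if c.doc_id in doc_ids]
    else:
        raise ValueError(f"unknown retrieval mode: {mode}")
    return sorted(selected, key=lambda c: (c.doc_id, c.index))


CORPUS = [
    Chunk("handbook", 0, "Welcome to the employee handbook."),
    Chunk("handbook", 1, "Annual leave is 21 days per year."),
    Chunk("handbook", 2, "Unused leave expires in March."),
    Chunk("handbook", 3, "Office hours are nine to five."),
    Chunk("faq", 0, "Reset your password from the login page."),
]

QUERY = "How many days of annual leave do I get?"
for mode in ("chunk", "neighbor", "document"):
    result = retrieve(QUERY, CORPUS, top_k=1, mode=mode)
    print(mode, [(c.doc_id, c.index) for c in result])
```

With Top K set to 1, 'chunk' returns only the best-matching chunk, 'neighbor' adds the chunks on either side of it, and 'document' returns all four handbook chunks, which shows how the amount of text passed to the LLM grows with each option.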
When to use which retrieval method?
- Chunk: This is best used when the answer to a user's query might be found in a small section of text within a larger document.
- Neighbor: This is useful when the LLM needs more context surrounding the direct answer to the user query.
- Document: This works best when the user query requires the LLM to understand an entire document.
- When experimenting with different retrieval methods, pay attention to the quality of the output, latency, and the context window limit.
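To make the context window consideration concrete, the sketch below estimates how many retrieved passages fit within a token budget. It is a rough illustration: the function names are hypothetical, and the figure of about 4 characters per token is only a coarse rule of thumb for English text. Actual token counts depend on the model's tokenizer.

```python
def estimate_tokens(text, chars_per_token=4):
    """Rough token estimate (assumption: about 4 characters per token)."""
    return -(-len(text) // chars_per_token)  # ceiling division


def fit_to_budget(passages, context_limit, reserved_tokens=0):
    """Keep passages in ranked order until the token budget is used up.

    reserved_tokens leaves room for the user prompt and the model's answer.
    """
    budget = context_limit - reserved_tokens
    kept, used = [], 0
    for passage in passages:
        cost = estimate_tokens(passage)
        if used + cost > budget:
            break  # the next passage would overflow the context window
        kept.append(passage)
        used += cost
    return kept, used


passages = ["a" * 40, "b" * 40, "c" * 40]  # about 10 tokens each
kept, used = fit_to_budget(passages, context_limit=30, reserved_tokens=5)
print(len(kept), used)
```

In this example the budget is 25 tokens after the reservation, so only the first two passages are kept. A larger Top K or the 'Document' retrieval type reaches the limit sooner than 'Chunk' with a small Top K.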
Keep Reading
Breakdown of RAG Model Parameters, Settings and Their Impact
Retrieval-Augmented Generation (RAG) is an advanced approach in natural language processing that integrates information retrieval and generative language modeling. Unlike traditional language models that generate responses solely based on their pre-trained knowledge, RAG combines retrieval mechanisms with generative models to enhance the relevance and accuracy of its responses. This hybrid framework works by first retrieving relevant documents or information from a predefined knowledge base (e.g., databases, documents, or PDFs) and then using a generative model (such as a transformer-based model) to synthesize a response that incorporates the retrieved context.