Knowledge sources overview
How Gabbex turns your content into accurate answers — and the four source types you can connect.
A Gabbex assistant is only as good as the knowledge you give it. This page explains how knowledge sources work conceptually so the individual source guides make sense.
How knowledge becomes answers
When you connect a source, Gabbex does three things:
- Reads your content. It crawls the URL, reads the PDF, fetches the Notion page, or stores the Q&A entry.
- Splits and indexes it. The text is broken into small chunks and stored in a semantic index. Each chunk gets a vector embedding so the assistant can find it by meaning, not just by keyword.
- Retrieves on demand. When a visitor asks a question, the assistant searches the index for the chunks most relevant to that question and uses them to compose an answer.
This pattern is called retrieval-augmented generation (RAG). The important consequence is that the assistant answers from your content, not from the model’s general knowledge. If the content is missing, the assistant will say it does not know rather than invent an answer.
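To make the three steps concrete, here is a minimal Python sketch of the split, index, and retrieve pattern. It is an illustration only, not Gabbex's implementation: every function name, the chunk size, and the `min_score` threshold are hypothetical, and the bag-of-words `embed` function is a toy stand-in for a real vector embedding (it matches shared words, whereas a real embedding matches by meaning).

```python
import math
import re
from collections import Counter


def chunk(text, max_words=40):
    """Split text into chunks of at most max_words words (size is an assumption)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy stand-in for a vector embedding: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(documents):
    """Steps 1 and 2: read each document, split it, and store (chunk, vector) pairs."""
    return [(c, embed(c)) for doc in documents for c in chunk(doc)]


def retrieve(index, question, top_k=2, min_score=0.2):
    """Step 3: return the chunks most similar to the question, best first."""
    q = embed(question)
    scored = sorted(((cosine(q, vec), text) for text, vec in index), reverse=True)
    return [text for score, text in scored[:top_k] if score >= min_score]


def answer(index, question):
    """Answer only from retrieved chunks; admit ignorance when nothing matches."""
    hits = retrieve(index, question)
    if not hits:
        return "I don't know."
    # A real assistant would pass the hits to a language model to compose a reply.
    return "Based on your content: " + " ".join(hits)


docs = [
    "Standard shipping to the EU is €4.95 and arrives in 3–5 business days.",
    "Returns are accepted within 30 days of delivery.",
]
index = build_index(docs)
print(answer(index, "How much is shipping to the EU?"))
print(answer(index, "Do you offer gift wrapping?"))
```

The first question shares enough words with the shipping chunk to retrieve it, so the answer is built from that chunk. The second question matches nothing in the index, so the sketch falls back to "I don't know." instead of guessing, which mirrors the behavior described above.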
The four source types
| Source | Best for | Updates |
|---|---|---|
| Website | Public content already on your site | Re-crawl on demand |
| Files | Manuals, policies, brochures, internal PDFs | Re-upload to update |
| Notion | Teams who already document in Notion | Re-sync on demand |
| Q&A | The exact questions you want exact answers to | Edit anytime |
You can mix as many sources as your plan allows. A typical setup is your main website plus a small list of high-priority Q&A entries.
What makes a good source
- Specific is better than generic. “Standard shipping to the EU is €4.95 and arrives in 3–5 business days” is far more useful than “We ship internationally.”
- One topic per page. Long pages with many topics still work, but focused pages retrieve more cleanly.
- Plain language. Write the way your customers ask questions, not the way your legal team writes contracts.
- Keep it current. A stale page produces a stale answer. Re-crawl or re-sync when you change something important.
What does not belong in a source
- Anything you would not want a customer to see. Sources are not access-controlled per visitor: if you index it, the assistant can quote from it to anyone who asks.
- Sensitive personal data or credentials.
- Pricing pages that change daily, unless you re-sync them just as often. Use a Q&A entry instead, which you can update directly.
Source limits
Each plan has a limit on the total amount of indexed content per assistant. You can see the current usage on the Sources page and on the workspace Usage page. If you hit the limit, remove old sources or upgrade your plan.