Knowledge Base
Train your AI agent on your content by uploading files, crawling websites, and adding custom knowledge to make your agent smarter and more helpful.
What is a Knowledge Base?
A Knowledge Base is where you store all the information your AI agent uses to answer questions. Upload your documentation, FAQs, product guides, support articles, or any content you want your agent to learn from.
The more relevant content you add, the more accurately your agent can answer your visitors' questions.
Creating a Knowledge Base
Navigate to Knowledge Base in the left sidebar and click Create Knowledge Base. You'll need to provide a Name like Product Documentation or Help Center Content, and a Description explaining what content this knowledge base contains, such as "Product features and technical specifications". These help you organize and identify multiple knowledge bases later. Click Create and you're ready to add content.
Adding Content
There are two ways to add content to your knowledge base:
Crawl Website
Let OpenSpeechAI automatically crawl your website and extract content from multiple pages. Click Add URLs to open the ingestion panel where you can add content in two ways:
Sitemap Import - Enter your sitemap URL (e.g., https://yoursite.com/sitemap.xml) and OpenSpeechAI will automatically fetch all publicly available pages. Many platforms, including Shopify, WordPress, and most documentation platforms, generate a sitemap automatically, usually at /sitemap.xml. Once the pages are fetched, you can review them and select which ones to ingest.
Individual URLs - Paste a single webpage URL to add just that page to your knowledge base.
What to Ingest - Focus on documentation pages, knowledge bases, help centers, API references, and technical guides for best results.
What to Avoid - Skip blogs, news articles, marketing pages, and time-sensitive content to keep your knowledge base clean and factual.
Dynamically Rendered Content - Some websites render content with JavaScript, which may not be captured during ingestion. For such sites, save the content as a PDF or TXT file and use the file upload option instead.
Bot Protection & Access Issues - Sites behind Cloudflare or similar security services may block automated ingestion. To allow OpenSpeechAI access, add our service to your site's allowlist or use the file upload option instead.
Recrawling - If you update your website content, you can refresh the knowledge base with the latest information. In the files table, click the three-dot menu next to the URL and select Recrawl.
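Before a Sitemap Import, it can help to preview which pages a sitemap lists and which of them fall under the "What to Avoid" guidance above. The following is a minimal sketch, not part of OpenSpeechAI: it runs locally on sitemap XML you have already downloaded. The function names, the sample sitemap, and the SKIP_PARTS path segments are assumptions for illustration; adjust them to your own site's layout.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace used by standard sitemap.xml files.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

# Hypothetical path segments that usually mark blog, news, or
# marketing pages. These are assumptions, not an OpenSpeechAI rule.
SKIP_PARTS = ("blog", "news", "press")


def sitemap_urls(xml_text):
    """Return every <loc> URL found in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


def worth_ingesting(url):
    """Return False when any path segment matches SKIP_PARTS."""
    segments = [s.lower() for s in urlparse(url).path.split("/") if s]
    return not any(seg in SKIP_PARTS for seg in segments)


# Made-up sample sitemap for demonstration only.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://yoursite.com/docs/getting-started</loc></url>
  <url><loc>https://yoursite.com/blog/launch-announcement</loc></url>
  <url><loc>https://yoursite.com/help/billing</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in sitemap_urls(SAMPLE):
        label = "ingest" if worth_ingesting(url) else "skip"
        print(label, url)
```

The result is only a suggestion list. You still make the final selection in the ingestion panel when you review the fetched pages.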
Upload Files
Upload documents directly from your computer to train your agent. Click Upload Files and select your documents - they'll be processed automatically. Large files may take a few moments to process.
Supported Formats - Currently, OpenSpeechAI supports PDF and TXT files. PDFs work best for documentation, guides, manuals, and ebooks, while TXT files are perfect for plain text content, notes, and simple documentation.
OCR Limitation - Scanned or image-based PDFs are not supported at this time. Make sure your PDFs contain selectable text, not scanned images. If you have scanned documents, run them through an OCR tool first and upload the resulting text.
Need More Format Support? - If you need support for Word documents (.doc, .docx), Markdown (.md), or other file formats, reach out to our customer support team and let us know.
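If you are preparing a folder of documents, a quick local check can separate the files that match the supported formats from those that would need converting first. This is a minimal sketch, assuming support is decided by file extension alone; the function name and the example file names are hypothetical, and the script does not talk to OpenSpeechAI.

```python
from pathlib import Path

# Formats named in this guide as currently supported.
SUPPORTED = {".pdf", ".txt"}


def split_by_support(paths):
    """Sort file names into (supported, unsupported) by extension."""
    supported, unsupported = [], []
    for path in map(Path, paths):
        bucket = supported if path.suffix.lower() in SUPPORTED else unsupported
        bucket.append(path.name)
    return supported, unsupported


if __name__ == "__main__":
    # Hypothetical file names for demonstration.
    ok, needs_conversion = split_by_support(
        ["guide.pdf", "notes.TXT", "spec.docx", "README.md"]
    )
    print("Ready to upload:", ok)
    print("Convert first:", needs_conversion)
```

Note that the extension check cannot tell a text-based PDF from a scanned one, so the OCR limitation above still needs a manual look.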
Best Practices
Keep content focused - Only add information relevant to what visitors might ask. Avoid unrelated content that could confuse your agent.
Use clear language - Write in plain, simple language. The clearer your source content, the better your agent will explain it.
Update regularly - When your products, services, or documentation change, update your knowledge base so your agent stays accurate.
Organize with multiple knowledge bases - If you have distinct topics (e.g., product docs vs. billing FAQs), create a separate knowledge base for each and connect a different agent to each one.
Test after adding content - After uploading new content, test your agent in the Preview tab to make sure it understands and uses the new information correctly.
Managing Content
View Files
Your knowledge base has two tabs - URLs for crawled website content and Files for uploaded documents. Each tab displays a table showing all content added to your knowledge base. Click any item to preview its content.
Delete Content
Remove outdated or incorrect content by clicking the delete icon next to any file or page. Your agent will stop using this information immediately.
Recrawl Websites
If you've crawled a website and its content has since changed, open the three-dot menu next to the URL in the files table and select Recrawl to fetch the latest version. The new version replaces the old content.
Connecting to Agents
After creating and filling your knowledge base, connect it to an agent:
- Go to the Agents page
- Click on the agent you want to update
- In the agent detail page, click the Edit icon
- In the Knowledge Base dropdown, select the knowledge base you want to use
- Click Save to apply the changes
Your agent will now use this knowledge base to answer questions.
Tips for Better Results
Include examples - If your documentation includes examples, your agent will provide better, more specific answers.
Add FAQs - Upload a FAQ document. Agents excel at matching visitor questions to pre-written answers.
Keep files under 10 MB - Smaller files process faster. If you have large PDFs, consider splitting them into sections.
Use descriptive titles - When adding text manually or uploading files, use clear titles that describe the content. This helps you manage your knowledge base later.
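For large TXT files, the splitting suggested in the size tip above can be done with a short local script. The sketch below cuts text on line boundaries so that each part stays under a byte limit. The function name is hypothetical, and MAX_BYTES assumes a conservative reading of the 10 MB guideline (10,000,000 bytes). Splitting PDFs needs a PDF tool and is not covered here.

```python
# Conservative reading of the 10 MB guideline; an assumption,
# not a documented OpenSpeechAI limit.
MAX_BYTES = 10 * 1000 * 1000


def split_text(text, max_bytes=MAX_BYTES):
    """Split text on line boundaries into parts of at most max_bytes.

    A single line longer than max_bytes is kept whole, so that one
    part can still exceed the limit.
    """
    parts, current, size = [], [], 0
    for line in text.splitlines(keepends=True):
        line_bytes = len(line.encode("utf-8"))
        # Start a new part when adding this line would exceed the limit.
        if current and size + line_bytes > max_bytes:
            parts.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += line_bytes
    if current:
        parts.append("".join(current))
    return parts


if __name__ == "__main__":
    # Tiny limit so the effect is visible on a short sample.
    for number, part in enumerate(split_text("a\nb\nc\n", max_bytes=4), start=1):
        print("part", number, repr(part))
```

Joining the parts back together reproduces the original text exactly. Save each part under a descriptive name so it stays easy to find in the Files tab.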
Last updated: November 18, 2024