Knowledge Bases

Give your agents access to your documentation, websites, and custom content.

Overview

Knowledge Bases store information that your HUMA agents can retrieve on demand. Instead of hardcoding information into agent instructions, the agent queries a knowledge base for context relevant to the user's question.

Store Content

Import documentation, help articles, product info, or any text content

Semantic Search

Find relevant content based on meaning, not just keywords

Agent Integration

Connect to HUMA-0.1 agents for context-aware responses

Use Case: A support agent that answers questions using your help documentation, or a sales agent that knows your product catalog.

Create a Knowledge Base

Create a knowledge base to store your content. You can create multiple knowledge bases for different purposes.

API Endpoint

POST /api/knowledge-bases
Content-Type: application/json
X-API-Key: your_api_key

Request Body

{
  "name": "Product Documentation",
  "description": "Help docs and product guides",
  "tags": ["docs", "support"]
}

Response

{
  "id": "cmk123abc456",
  "name": "Product Documentation",
  "description": "Help docs and product guides",
  "tags": ["docs", "support"],
  "status": "active",
  "documentCount": 0,
  "createdAt": "2025-01-13T10:30:00.000Z"
}
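As a rough sketch, the request above could be assembled in Python with only the standard library. The base URL is an assumption (this page does not state one), and the helper names build_create_kb_request and parse_kb_id are hypothetical; only the path, headers, and body fields come from the example above.

```python
import json
import urllib.request

# Assumption: the docs do not state a base URL; replace with your deployment's host.
BASE_URL = "https://api.example.com"


def build_create_kb_request(api_key, name, description, tags):
    """Build (but do not send) the POST /api/knowledge-bases request."""
    body = {"name": name, "description": description, "tags": tags}
    return urllib.request.Request(
        url=f"{BASE_URL}/api/knowledge-bases",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


def parse_kb_id(response_text):
    """Extract the knowledge base id from a JSON response shaped like the one above."""
    return json.loads(response_text)["id"]
```

To send it, pass the request to urllib.request.urlopen and feed the decoded body to parse_kb_id. Keep the returned id, since the intake, query, and agent examples all refer to it.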

Import Content with Web Crawler

The Web Crawler automatically discovers and imports content from websites. It is well suited to documentation sites, help centers, and other public web content.

How It Works

Discover

Crawler finds all pages on your website

Deduplicate

AI removes duplicate pages (locale variations, query params)

Scrape

Content is extracted from each page

Import

Pages are added to your knowledge base

Start a Web Crawler

POST /api/intakes
Content-Type: application/json
X-API-Key: your_api_key

{
  "knowledgeBaseId": "cmk123abc456",
  "type": "WEB_CRAWLER",
  "name": "Docs Crawler",
  "configuration": {
    "url": "https://docs.example.com"
  }
}

Response

{
  "id": "cmi789xyz123",
  "type": "WEB_CRAWLER",
  "name": "Docs Crawler",
  "status": "pending",
  "message": "Intake created, pipeline started."
}
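The intake body is small enough to build by hand, but a helper can catch a malformed start URL before the request is sent. The sketch below is one way to do that; the function name build_crawler_intake and the local URL check are illustrative additions, not part of the API.

```python
from urllib.parse import urlparse


def build_crawler_intake(knowledge_base_id, name, start_url):
    """Return the JSON body for POST /api/intakes with type WEB_CRAWLER."""
    parsed = urlparse(start_url)
    # Local sanity check (not an API rule): require an absolute http(s) URL.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"expected an absolute http(s) URL, got {start_url!r}")
    return {
        "knowledgeBaseId": knowledge_base_id,
        "type": "WEB_CRAWLER",
        "name": name,
        "configuration": {"url": start_url},
    }
```

Serialize the returned dict with json.dumps and send it with the same headers as the other endpoints.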

Check Progress

Poll the intake endpoint to track progress:

GET /api/intakes/cmi789xyz123
X-API-Key: your_api_key

// Response while running:
{
  "id": "cmi789xyz123",
  "status": "scraping",
  "progress": {
    "pagesFound": 150,
    "pagesAfterLlmFilter": 142,
    "pagesAfterFilter": 4,
    "pagesScraped": 2,
    "pagesIngested": 0
  }
}

// Response when complete:
{
  "id": "cmi789xyz123",
  "status": "completed",
  "progress": {
    "pagesFound": 150,
    "pagesAfterLlmFilter": 142,
    "pagesAfterFilter": 4,
    "pagesScraped": 4,
    "pagesIngested": 4
  },
  "documentIds": ["doc1", "doc2", "doc3", "doc4"]
}
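A polling loop for this endpoint might look like the sketch below. It takes the HTTP call as a callable so the loop itself stays independent of any client library. The name wait_for_intake, the interval, and the poll limit are assumptions. Because this page documents no error status, the sketch simply gives up after a bounded number of polls.

```python
import time


def wait_for_intake(fetch_status, interval_seconds=5.0, max_polls=120, sleep=time.sleep):
    """Poll until the intake reports "completed" and return that response dict.

    fetch_status: zero-argument callable returning the parsed JSON from
    GET /api/intakes/{id} (a hypothetical hook; plug in your own HTTP call).
    """
    for _ in range(max_polls):
        status = fetch_status()
        if status["status"] == "completed":
            return status
        # Any other status ("pending", "scraping", ...) means keep waiting.
        sleep(interval_seconds)
    raise TimeoutError(f"intake not completed after {max_polls} polls")
```

Once it returns, the documentIds field of the result lists the imported documents, as in the completed response above.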

Query Your Knowledge Base

Search your knowledge base using natural language. The API returns the most relevant content chunks based on semantic similarity.

Query Endpoint

POST /api/knowledge-bases/{id}/query
Content-Type: application/json
X-API-Key: your_api_key

{
  "query": "How do I reset my password?",
  "topK": 3
}

Response

{
  "kbId": "cmk123abc456",
  "kbName": "Product Documentation",
  "query": "How do I reset my password?",
  "chunks": [
    {
      "content": "To reset your password, click the 'Forgot Password' link on the login page. Enter your email address and we'll send you a reset link...",
      "source": "https://docs.example.com/account/password-reset",
      "score": 0.92
    },
    {
      "content": "Account security settings allow you to change your password, enable two-factor authentication...",
      "source": "https://docs.example.com/account/security",
      "score": 0.78
    }
  ]
}
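If you call the query endpoint directly rather than through an agent, you will probably want to post-process the chunks. The sketch below drops low-scoring chunks and renders the rest with their source URLs. The 0.8 threshold and both function names are illustrative choices, not API behavior.

```python
def select_chunks(query_response, min_score=0.8):
    """Keep chunks whose similarity score is at least min_score, best first."""
    chunks = [c for c in query_response["chunks"] if c["score"] >= min_score]
    return sorted(chunks, key=lambda c: c["score"], reverse=True)


def format_context(chunks):
    """Render chunks as numbered passages, each followed by its source URL."""
    return "\n\n".join(
        f"[{i}] {c['content']}\n(source: {c['source']})"
        for i, c in enumerate(chunks, start=1)
    )
```

With the sample response above and the default threshold, only the 0.92 chunk would be kept.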

Connect to HUMA-0.1 Agent

Connect knowledge bases to your agent using the addons configuration. The server automatically generates search tools and handles all queries, so no client-side implementation is needed.

Configure Agent with Knowledge Bases

When creating your agent, add the addons field to enable knowledge base search:

POST /api/agents
Content-Type: application/json
X-API-Key: your_api_key

{
  "name": "Support Agent",
  "agentType": "HUMA-0.1",
  "metadata": {
    "className": "Support Agent",
    "personality": "You are a helpful support agent...",
    "instructions": "Help users with their questions. Use the knowledge base search tools to find relevant information before answering.",
    "modelPackage": "gemini-2.5-flash",
    "routerType": "llm-judge",
    "tools": [
      {
        "name": "send_message",
        "description": "Send a message to the user",
        "parameters": [
          { "name": "message", "type": "string", "required": true }
        ]
      }
    ],
    "addons": {
      "enabled": ["knowledge-base"],
      "knowledgeBases": [
        { "id": "cmk123abc456", "name": "Help Docs" },
        { "id": "cmk789xyz012", "name": "Product FAQ" }
      ]
    }
  }
}
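When several agents share the same knowledge bases, it can help to generate the addons object rather than hand-write it. A minimal sketch, assuming the metadata shape shown above (the helper name with_knowledge_bases is hypothetical):

```python
def with_knowledge_bases(metadata, knowledge_bases):
    """Return a copy of agent metadata with the knowledge-base addon configured.

    knowledge_bases: list of {"id": ..., "name": ...} dicts, as in the example above.
    """
    updated = dict(metadata)  # shallow copy so the caller's dict is left untouched
    updated["addons"] = {
        "enabled": ["knowledge-base"],
        "knowledgeBases": [
            {"id": kb["id"], "name": kb["name"]} for kb in knowledge_bases
        ],
    }
    return updated
```

The result goes in the metadata field of the POST /api/agents body.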

What Happens Automatically

Tools Generated

Server creates kb_search_cmk123abc456 and kb_search_cmk789xyz012 tools

Agent Sees Tools

The AI sees tools like "Search the Help Docs knowledge base for relevant information"

Server Handles Calls

When the agent calls a KB tool, the server queries LlamaCloud and returns results - your client does nothing
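The two tool names above suggest a kb_search_ prefix followed by the knowledge base id. Treat that pattern as an inference from these examples rather than a documented guarantee. Under that assumption, a small lookup table can tell you which knowledge base a tool call in your logs refers to. All names in this sketch are hypothetical.

```python
# Assumption: generated tool names follow "kb_search_<knowledge base id>",
# inferred from the two examples on this page.
KB_TOOL_PREFIX = "kb_search_"


def kb_tool_name(kb_id):
    """Return the tool name expected for a knowledge base id."""
    return KB_TOOL_PREFIX + kb_id


def tool_index(knowledge_bases):
    """Map each expected tool name to the display name of its knowledge base."""
    return {kb_tool_name(kb["id"]): kb["name"] for kb in knowledge_bases}


def describe_tool_call(tool_name, index):
    """Label a tool call as a server-side KB search or a client-side tool."""
    if tool_name in index:
        return f"server-side search of {index[tool_name]!r}"
    return "client-side tool"
```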

Example Conversation

When a user asks a question, the agent automatically searches the knowledge base:

User: How do I reset my password?

Agent: (calls kb_search_cmk123abc456 with query "reset password"; the server handles the call automatically)

Agent: To reset your password, click the "Forgot Password" link on the login page. You'll receive an email with a reset link. The link expires after 24 hours for security.

Zero Client Code: Unlike custom tools that require client-side handling, knowledge base tools are fully server-side. Your client only handles its own tools (like send_message).

Best Practices

Content Organization

Create separate KBs for different topics (docs, FAQs, products)
Use descriptive names and tags
Keep content up to date by re-crawling periodically

Web Crawler Tips

Start with documentation subdomains (docs.example.com)
Delete old intakes before re-crawling to avoid duplicates
Check progress to ensure pages are being found
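The "delete old intakes before re-crawling" tip can be sketched as a plan of API calls: delete every earlier crawler intake with the same name, then create a fresh one. This assumes GET /api/intakes returns items with the id, type, and name fields seen in the create response. The list format is not shown on this page, so verify it before relying on this. The function names are hypothetical.

```python
def intakes_to_delete(intakes, crawler_name):
    """Return ids of existing WEB_CRAWLER intakes that share crawler_name."""
    return [
        item["id"]
        for item in intakes
        if item.get("type") == "WEB_CRAWLER" and item.get("name") == crawler_name
    ]


def recrawl_plan(intakes, knowledge_base_id, crawler_name, url):
    """Return ordered (method, path, body) steps: deletes first, then one create."""
    steps = [
        ("DELETE", f"/api/intakes/{intake_id}", None)
        for intake_id in intakes_to_delete(intakes, crawler_name)
    ]
    steps.append((
        "POST",
        "/api/intakes",
        {
            "knowledgeBaseId": knowledge_base_id,
            "type": "WEB_CRAWLER",
            "name": crawler_name,
            "configuration": {"url": url},
        },
    ))
    return steps
```

Returning a plan instead of issuing the calls lets you review the steps before anything is deleted. This matters because deleting an intake also deletes its documents.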

Search Quality

Server returns top 5 most relevant chunks automatically
Results include source URLs for the agent to cite
Chunks are scored by semantic similarity (0-1)

Agent Configuration

Use the addons field - no client code needed
Give each KB a descriptive name for better tool descriptions
In the instructions, tell the agent to search the KB before answering

API Reference

Endpoint	Method	Description
/api/knowledge-bases	GET	List all knowledge bases
/api/knowledge-bases	POST	Create a knowledge base
/api/knowledge-bases/:id	GET	Get knowledge base details
/api/knowledge-bases/:id	DELETE	Delete a knowledge base
/api/knowledge-bases/:id/query	POST	Query knowledge base
/api/intakes	GET	List intakes (web crawlers)
/api/intakes	POST	Create an intake
/api/intakes/:id	GET	Get intake status
/api/intakes/:id	DELETE	Delete intake and its documents