Knowledge Bases

Give your agents access to your documentation, websites, and custom content.

Overview

Knowledge Bases let you store and retrieve information that your HUMA agents can access. Instead of hardcoding information into agent instructions, you can query a knowledge base for relevant context based on the user's question.

  • Store Content: Import documentation, help articles, product info, or any text content
  • Semantic Search: Find relevant content based on meaning, not just keywords
  • Agent Integration: Connect to HUMA-0.1 agents for context-aware responses

Use Case: A support agent that answers questions using your help documentation, or a sales agent that knows your product catalog.

Create a Knowledge Base

Create a knowledge base to store your content. You can create multiple knowledge bases for different purposes.

API Endpoint

POST /api/knowledge-bases
Content-Type: application/json
X-API-Key: your_api_key

Request Body

{
  "name": "Product Documentation",
  "description": "Help docs and product guides",
  "tags": ["docs", "support"]
}

Response

{
  "id": "cmk123abc456",
  "name": "Product Documentation",
  "description": "Help docs and product guides",
  "tags": ["docs", "support"],
  "status": "active",
  "documentCount": 0,
  "createdAt": "2025-01-13T10:30:00.000Z"
}
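As a rough illustration of the request above, the following Python sketch builds the same POST call with the standard library. The host in `BASE_URL` and the helper names are assumptions made for this example; only the path, headers, and body fields come from the endpoint description.

```python
import json
import urllib.request

# Assumption: placeholder host; replace with the host your API key belongs to.
BASE_URL = "https://api.example.com"


def build_create_kb_request(api_key, name, description, tags):
    """Build (but do not send) the POST /api/knowledge-bases request."""
    body = json.dumps({"name": name, "description": description, "tags": tags})
    return urllib.request.Request(
        f"{BASE_URL}/api/knowledge-bases",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


def create_knowledge_base(api_key, name, description="", tags=()):
    """Send the request and return the parsed JSON response."""
    request = build_create_kb_request(api_key, name, description, list(tags))
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `create_knowledge_base("your_api_key", "Product Documentation", "Help docs and product guides", ["docs", "support"])` should return an object shaped like the response above; keep its `id` for the later steps.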

Import Content with Web Crawler

The Web Crawler automatically discovers and imports content from websites. It is well suited to documentation sites, help centers, and other public web content.

How It Works

1. Discover: Crawler finds all pages on your website
2. Deduplicate: AI removes duplicate pages (locale variations, query params)
3. Scrape: Content is extracted from each page
4. Import: Pages are added to your knowledge base

Start a Web Crawler

POST /api/intakes
Content-Type: application/json
X-API-Key: your_api_key

{
  "knowledgeBaseId": "cmk123abc456",
  "type": "WEB_CRAWLER",
  "name": "Docs Crawler",
  "configuration": {
    "url": "https://docs.example.com"
  }
}

Response

{
  "id": "cmi789xyz123",
  "type": "WEB_CRAWLER",
  "name": "Docs Crawler",
  "status": "pending",
  "message": "Intake created, pipeline started."
}
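A small sketch of how the intake body above might be assembled in Python. The function name and the URL check are assumptions added for illustration; the field names and the `WEB_CRAWLER` type are taken from the request shown.

```python
from urllib.parse import urlparse


def build_crawler_intake(knowledge_base_id, name, url):
    """Return the JSON body for POST /api/intakes with type WEB_CRAWLER."""
    parsed = urlparse(url)
    # Illustrative client-side check (not part of the API): require an
    # absolute http(s) URL so obvious typos fail before the request is sent.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not an absolute http(s) URL: {url!r}")
    return {
        "knowledgeBaseId": knowledge_base_id,
        "type": "WEB_CRAWLER",
        "name": name,
        "configuration": {"url": url},
    }
```

The returned dictionary can be serialized with `json.dumps` and sent with the same headers as any other call.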

Check Progress

Poll the intake endpoint to track progress:

GET /api/intakes/cmi789xyz123
X-API-Key: your_api_key

// Response while running:
{
  "id": "cmi789xyz123",
  "status": "scraping",
  "progress": {
    "pagesFound": 150,
    "pagesAfterLlmFilter": 142,
    "pagesAfterFilter": 4,
    "pagesScraped": 2,
    "pagesIngested": 0
  }
}

// Response when complete:
{
  "id": "cmi789xyz123",
  "status": "completed",
  "progress": {
    "pagesFound": 150,
    "pagesAfterLlmFilter": 142,
    "pagesAfterFilter": 4,
    "pagesScraped": 4,
    "pagesIngested": 4
  },
  "documentIds": ["doc1", "doc2", "doc3", "doc4"]
}
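The polling described above could be wrapped in a loop like the one below. This is a sketch under assumptions: `fetch_status` is a hypothetical callable you supply that performs `GET /api/intakes/{id}` and returns the parsed JSON, and the `"failed"` status is a guess at an error state; only `pending`, `scraping`, and `completed` appear in the examples here.

```python
import time

# "completed" appears in the examples above; "failed" is an assumed error state.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_intake(fetch_status, intake_id, interval_seconds=5.0,
                    max_polls=120, sleep=time.sleep):
    """Poll until the intake reaches a terminal status, then return it.

    fetch_status: hypothetical callable taking an intake id and returning
    the parsed JSON body of GET /api/intakes/{id}.
    """
    for _ in range(max_polls):
        intake = fetch_status(intake_id)
        if intake["status"] in TERMINAL_STATUSES:
            return intake
        sleep(interval_seconds)  # wait before asking again
    raise TimeoutError(f"intake {intake_id} did not finish after {max_polls} polls")
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays.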

Query Your Knowledge Base

Search your knowledge base using natural language. The API returns the most relevant content chunks based on semantic similarity.

Query Endpoint

POST /api/knowledge-bases/{id}/query
Content-Type: application/json
X-API-Key: your_api_key

{
  "query": "How do I reset my password?",
  "topK": 3
}

Response

{
  "kbId": "cmk123abc456",
  "kbName": "Product Documentation",
  "query": "How do I reset my password?",
  "chunks": [
    {
      "content": "To reset your password, click the 'Forgot Password' link on the login page. Enter your email address and we'll send you a reset link...",
      "source": "https://docs.example.com/account/password-reset",
      "score": 0.92
    },
    {
      "content": "Account security settings allow you to change your password, enable two-factor authentication...",
      "source": "https://docs.example.com/account/security",
      "score": 0.78
    }
  ]
}
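For illustration, the query call and a simple way to pick the strongest result might look like this in Python. `BASE_URL` and both helper names are assumptions for the sketch; the path, body fields, and the `chunks`/`score` keys follow the example above.

```python
import json
import urllib.request
from urllib.parse import quote

# Assumption: placeholder host; replace with your actual API host.
BASE_URL = "https://api.example.com"


def build_query_request(api_key, kb_id, query, top_k=3):
    """Build (but do not send) POST /api/knowledge-bases/{id}/query."""
    body = json.dumps({"query": query, "topK": top_k}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/knowledge-bases/{quote(kb_id, safe='')}/query",
        data=body,
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


def best_chunk(response):
    """Return the highest-scoring chunk, or None if no chunks came back."""
    return max(response.get("chunks", []), key=lambda c: c["score"], default=None)
```

With the response shown above, `best_chunk` would pick the password-reset chunk, since its score is the highest in the list.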

Connect to HUMA-0.1 Agent

Connect knowledge bases to your agent using the addons configuration. The server automatically generates search tools and handles all queries, so no client-side implementation is needed.

Configure Agent with Knowledge Bases

When creating your agent, add the addons field to enable knowledge base search:

POST /api/agents
Content-Type: application/json
X-API-Key: your_api_key

{
  "name": "Support Agent",
  "agentType": "HUMA-0.1",
  "metadata": {
    "className": "Support Agent",
    "personality": "You are a helpful support agent...",
    "instructions": "Help users with their questions. Use the knowledge base search tools to find relevant information before answering.",
    "routerType": "llm-judge",
    "tools": [
      {
        "name": "send_message",
        "description": "Send a message to the user",
        "parameters": [
          { "name": "message", "type": "string", "required": true }
        ]
      }
    ],
    "addons": {
      "enabled": ["knowledge-base"],
      "knowledgeBases": [
        { "id": "cmk123abc456", "name": "Help Docs" },
        { "id": "cmk789xyz012", "name": "Product FAQ" }
      ]
    }
  }
}
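If agent payloads are built in code, the addons block can be attached with a small helper. The sketch below assumes a hypothetical function name and input shape (a list of `(id, name)` pairs); the resulting `addons` structure mirrors the request above.

```python
import copy


def with_knowledge_bases(agent_payload, knowledge_bases):
    """Return a copy of agent_payload with the knowledge-base addon enabled.

    knowledge_bases: iterable of (id, name) pairs.
    """
    payload = copy.deepcopy(agent_payload)  # leave the caller's dict untouched
    metadata = payload.setdefault("metadata", {})
    metadata["addons"] = {
        "enabled": ["knowledge-base"],
        "knowledgeBases": [
            {"id": kb_id, "name": name} for kb_id, name in knowledge_bases
        ],
    }
    return payload
```

For example, `with_knowledge_bases(base_agent, [("cmk123abc456", "Help Docs")])` yields a payload ready to POST to `/api/agents`.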

What Happens Automatically

1. Tools Generated: Server creates kb_search_cmk123abc456 and kb_search_cmk789xyz012 tools
2. Agent Sees Tools: The AI sees tools like "Search the Help Docs knowledge base for relevant information"
3. Server Handles Calls: When the agent calls a KB tool, the server queries LlamaCloud and returns results - your client does nothing
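The naming pattern in the steps above can be mirrored on the client, for example to recognize KB tools in logs. This is only a sketch of the pattern as it appears in this guide (`kb_search_` plus the knowledge base ID), not the server's actual implementation, and the function names are made up for the example.

```python
def kb_tool_name(kb_id):
    """Tool name following the kb_search_<id> pattern shown above."""
    return f"kb_search_{kb_id}"


def expected_kb_tools(knowledge_bases):
    """List the tool names and descriptions the agent would likely see."""
    return [
        {
            "name": kb_tool_name(kb["id"]),
            "description": f"Search the {kb['name']} knowledge base for relevant information",
        }
        for kb in knowledge_bases
    ]
```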

Example Conversation

When a user asks a question, the agent automatically searches the knowledge base:

User: How do I reset my password?

Agent: (Agent calls kb_search_cmk123abc456 with query "reset password" → server handles it automatically)

Agent: To reset your password, click the "Forgot Password" link on the login page. You'll receive an email with a reset link. The link expires after 24 hours for security.

Zero Client Code: Unlike custom tools that require client-side handling, knowledge base tools are fully server-side. Your client only handles its own tools (like send_message).
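To make the split concrete, here is a hypothetical client-side dispatcher. How tool calls reach your client is not covered in this section, so the `(name, arguments)` shape and the `handlers` registry are assumptions; the point is only that the client registers handlers for its own tools and never implements `kb_search_*` tools.

```python
KB_TOOL_PREFIX = "kb_search_"  # naming pattern used for server-side KB tools


def handle_tool_call(name, arguments, handlers):
    """Run a client-side tool; skip KB tools, which the server resolves.

    handlers: dict mapping tool name -> callable (hypothetical registry).
    """
    if name.startswith(KB_TOOL_PREFIX):
        return None  # server-side tool: nothing for the client to do
    if name not in handlers:
        raise KeyError(f"no client handler registered for tool {name!r}")
    return handlers[name](**arguments)
```

A registry such as `{"send_message": show_in_chat}` (where `show_in_chat` is your own function) would then be the only tool code the client needs.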

Best Practices

Content Organization

  • Create separate KBs for different topics (docs, FAQs, products)
  • Use descriptive names and tags
  • Keep content up to date by re-crawling periodically

Web Crawler Tips

  • Start with documentation subdomains (docs.example.com)
  • Delete old intakes before re-crawling to avoid duplicates
  • Check progress to ensure pages are being found
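The "delete before re-crawling" tip can be expressed as an ordered pair of calls. The sketch below only plans the calls as `(method, path, body)` tuples; the function name is an assumption, while the paths and body fields come from the endpoints in this guide.

```python
def plan_recrawl(old_intake_id, knowledge_base_id, name, url):
    """Return the ordered API calls for a clean re-crawl.

    Deleting the old intake first also removes its documents, so the new
    crawl does not add duplicates.
    """
    return [
        ("DELETE", f"/api/intakes/{old_intake_id}", None),
        ("POST", "/api/intakes", {
            "knowledgeBaseId": knowledge_base_id,
            "type": "WEB_CRAWLER",
            "name": name,
            "configuration": {"url": url},
        }),
    ]
```

Each tuple can be sent with whatever HTTP client you already use, adding the `X-API-Key` header to both requests.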

Search Quality

  • Server returns top 5 most relevant chunks automatically
  • Results include source URLs for the agent to cite
  • Chunks are scored by semantic similarity (0-1)
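When calling the query endpoint directly, the score and source fields can be used to drop weak matches and collect citations. The threshold of 0.8 and the function names below are arbitrary choices for illustration, not recommended values.

```python
def select_chunks(chunks, min_score=0.8):
    """Keep chunks at or above min_score, best match first."""
    kept = [chunk for chunk in chunks if chunk["score"] >= min_score]
    return sorted(kept, key=lambda chunk: chunk["score"], reverse=True)


def cited_sources(chunks):
    """Unique source URLs in order of first appearance."""
    return list(dict.fromkeys(chunk["source"] for chunk in chunks))
```

Applied to the example query response earlier in this guide, a 0.8 threshold keeps only the password-reset chunk (score 0.92) and drops the security-settings chunk (score 0.78).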

Agent Configuration

  • Use the addons field - no client code needed
  • Give each KB a descriptive name for better tool descriptions
  • In the agent's instructions, tell it to search the KB before answering

API Reference

Endpoint                          Method   Description
/api/knowledge-bases              GET      List all knowledge bases
/api/knowledge-bases              POST     Create a knowledge base
/api/knowledge-bases/:id          GET      Get knowledge base details
/api/knowledge-bases/:id          DELETE   Delete a knowledge base
/api/knowledge-bases/:id/query    POST     Query knowledge base
/api/intakes                      GET      List intakes (web crawlers)
/api/intakes                      POST     Create an intake
/api/intakes/:id                  GET      Get intake status
/api/intakes/:id                  DELETE   Delete intake and its documents