Overview

Knowledge Bases help you make unstructured information more usable inside AI workflows. A Knowledge Base lets users upload documents, split them into smaller parts, convert them into embeddings, and store them in a vector database. Once processed, this content can be retrieved whenever an AI workflow needs it. A Knowledge Base:
  • Stores uploaded files
  • Manages how these files are processed
  • Manages storage of processed data
  • Supports contextual retrieval
This approach lets you use external storage when required, and otherwise removes the need to manage additional databases yourself.

What a Knowledge Base Includes

Each Knowledge Base combines configuration, storage, and content into a single unit:
  • File ingestion - Upload documents directly into the platform
  • Chunking logic - Break documents into smaller, overlapping segments that preserve context
  • Embedding configuration - Define how text is converted into vectors
  • Vector storage - Store embeddings in a hosted or customer-managed database
  • Retrieval interface - Make the content available to workflows via semantic search or queries

Supported Knowledge Base Types

AI Squared supports two Knowledge Base types:
  1. Vector Store
    Designed for semantic search and retrieval. Uploaded documents are embedded and stored as vectors, enabling similarity-based queries.
  2. Semantic Data Model
    Designed for non-vector or structured retrieval scenarios, where queries are executed directly against a connected database.
The selected type determines how data is processed and how retrieval queries are executed during workflow runs.

Creating a Knowledge Base

When creating a Knowledge Base, users define how their documents should be processed and stored. Some of the key configuration options are:
  • Embedding provider - Service used to generate embeddings (default: OpenAI)
  • Embedding model - Specific model used for vector generation
  • Chunk size - Maximum size of each document chunk
  • Chunk overlap - Overlap between chunks to preserve context
  • Vector storage - Choose between a hosted vector store or an external database
  • Storage schema - Define which database columns store vectors, text, and metadata
Once saved, the Knowledge Base is ready to accept file uploads.
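
The options above can be pictured as a single configuration object. The following is a minimal sketch, not the AI Squared API: the class name, field names, default values, and the choice of characters as the chunk-size unit are all assumptions made for illustration. Only the default provider (OpenAI) comes from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnowledgeBaseConfig:
    """Hypothetical container for the options listed above (not a real API)."""

    embedding_provider: str = "OpenAI"  # default provider named in the docs
    embedding_model: str = "example-embedding-model"  # placeholder model name
    chunk_size: int = 1000  # assumed unit: characters
    chunk_overlap: int = 200  # must be smaller than chunk_size
    vector_storage: str = "hosted"  # "hosted" or "external"
    vector_column: str = "embedding"  # storage schema: column for vectors
    text_column: str = "content"  # storage schema: column for chunk text
    metadata_column: str = "metadata"  # storage schema: column for metadata

    def __post_init__(self) -> None:
        # Reject combinations that could never produce valid chunks.
        if self.chunk_size <= 0:
            raise ValueError("chunk_size must be positive")
        if not 0 <= self.chunk_overlap < self.chunk_size:
            raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
        if self.vector_storage not in ("hosted", "external"):
            raise ValueError("vector_storage must be 'hosted' or 'external'")


config = KnowledgeBaseConfig(chunk_size=500, chunk_overlap=50)
```

The dataclass is frozen so that a saved configuration cannot be modified in place; changing a setting means building a new object.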

File Upload and Processing

Uploading a file to a Knowledge Base triggers an automated processing pipeline:
  1. The file is uploaded and verified
  2. Content is extracted and split into smaller chunks
  3. Each chunk is converted into an embedding
  4. Embeddings, text, and metadata are written to the selected storage
  5. File status is updated once processing completes
This entire process runs asynchronously, allowing large files to be handled without blocking the user experience.
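
Steps 2 to 4 of the pipeline can be sketched as follows. This is an illustration under stated assumptions rather than the platform's implementation: chunks are measured in characters, `toy_embed` is a hash-based stand-in for a real embedding model (deterministic but not semantic), and the function names and row layout are hypothetical.

```python
import hashlib
from typing import Dict, List


def split_into_chunks(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Split text into character windows that overlap by chunk_overlap."""
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


def toy_embed(chunk: str, dims: int = 4) -> List[float]:
    """Stand-in for a real embedding model: deterministic, but not semantic."""
    digest = hashlib.sha256(chunk.encode("utf-8")).digest()
    return [byte / 255.0 for byte in digest[:dims]]


def process_file(file_id: str, text: str, chunk_size: int, chunk_overlap: int) -> List[Dict]:
    """Chunk the text, embed each chunk, and build the rows to write to storage."""
    rows = []
    for index, chunk in enumerate(split_into_chunks(text, chunk_size, chunk_overlap)):
        rows.append({
            "embedding": toy_embed(chunk),
            "content": chunk,
            "metadata": {"file_id": file_id, "chunk_index": index},
        })
    return rows


rows = process_file("file-1", "abcdefghij", chunk_size=4, chunk_overlap=1)
print([row["content"] for row in rows])  # ['abcd', 'defg', 'ghij']
```

Each chunk repeats the last character of the previous one, which is the overlap at work: text that straddles a chunk boundary still appears intact in at least one chunk.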

Updating a Knowledge Base

Knowledge Bases cannot be changed once they are created. To change the configuration, you must create a new Knowledge Base and re-upload your files with the updated settings.

Deleting Knowledge Bases and Files

Knowledge Bases and their files can be safely deleted when they are no longer in use. When individual files are deleted, the embeddings associated with that file are also removed.
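
One way to picture file deletion is a store in which every embedding row records the file it came from, so that removing a file removes its rows as well. The class below is a hypothetical in-memory stand-in, not the platform's storage layer.

```python
from typing import Dict, List


class InMemoryVectorStore:
    """Hypothetical stand-in for a Knowledge Base's vector storage."""

    def __init__(self) -> None:
        self.rows: List[Dict] = []

    def add(self, file_id: str, content: str, embedding: List[float]) -> None:
        # Every row keeps the id of its source file.
        self.rows.append({"file_id": file_id, "content": content, "embedding": embedding})

    def delete_file(self, file_id: str) -> int:
        """Remove every row that came from file_id; return how many were removed."""
        before = len(self.rows)
        self.rows = [row for row in self.rows if row["file_id"] != file_id]
        return before - len(self.rows)


store = InMemoryVectorStore()
store.add("handbook.pdf", "Vacation policy", [0.1, 0.9])
store.add("handbook.pdf", "Expense policy", [0.2, 0.8])
store.add("faq.txt", "How to reset a password", [0.7, 0.3])

removed = store.delete_file("handbook.pdf")
print(removed, len(store.rows))  # 2 1
```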

Retrieval and Usage in Workflows

Knowledge Bases are designed to be used directly by AI workflows. Retrieval grounds AI responses in authoritative, relevant, and permitted data, rather than relying on general model knowledge. When a query is received:
  • Content is retrieved based on meaning, allowing workflows to work effectively with unstructured information
  • Semantic retrieval can be combined with simple keyword matching and metadata filters to improve accuracy and recall
  • Access controls and permissions are enforced, ensuring users only retrieve data they are allowed to see
  • Retrieved results can include references back to the original documents or records, maintaining traceability
Together, these mechanisms help workflows retrieve information that is both relevant and permitted.
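
The retrieval behavior described above can be sketched as a single function that applies permissions and metadata filters first, then ranks what remains by a blend of semantic and keyword scores. Everything here is an assumption for illustration: the row layout, the `allowed_groups` field, the `keyword_weight` blend, and the function names are not taken from the AI Squared API.

```python
import math
from typing import Dict, List, Optional, Set


def cosine_similarity(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(
    query_vector: List[float],
    query_text: str,
    rows: List[Dict],
    user_groups: Set[str],
    metadata_filter: Optional[Dict[str, str]] = None,
    keyword_weight: float = 0.2,
    top_k: int = 3,
) -> List[Dict]:
    query_terms = set(query_text.lower().split())
    results = []
    for row in rows:
        # Access control: skip rows the user is not allowed to see.
        if not user_groups & set(row["allowed_groups"]):
            continue
        # Metadata filter: every requested key must match exactly.
        if metadata_filter and any(
            row["metadata"].get(key) != value for key, value in metadata_filter.items()
        ):
            continue
        semantic = cosine_similarity(query_vector, row["embedding"])
        row_terms = set(row["content"].lower().split())
        keyword = len(query_terms & row_terms) / len(query_terms) if query_terms else 0.0
        results.append({
            "score": semantic + keyword_weight * keyword,
            "content": row["content"],
            "source": row["metadata"]["source"],  # reference back to the original document
        })
    results.sort(key=lambda result: result["score"], reverse=True)
    return results[:top_k]


rows = [
    {"content": "refund policy for enterprise plans", "embedding": [1.0, 0.0],
     "allowed_groups": ["support"], "metadata": {"source": "policies.pdf", "team": "support"}},
    {"content": "quarterly salary bands", "embedding": [0.9, 0.1],
     "allowed_groups": ["hr"], "metadata": {"source": "hr.xlsx", "team": "hr"}},
    {"content": "office seating chart", "embedding": [0.0, 1.0],
     "allowed_groups": ["support", "hr"], "metadata": {"source": "office.docx", "team": "facilities"}},
]

hits = retrieve([1.0, 0.0], "refund policy", rows, user_groups={"support"})
print([hit["source"] for hit in hits])  # ['policies.pdf', 'office.docx']
```

The HR row never appears for a user in the support group, even though its embedding is close to the query, because the permission check runs before any scoring.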

Knowledge Bases with Agents

When used within agent-driven workflows, Knowledge Bases can be invoked dynamically as part of the agent’s reasoning process. This allows agents to decide when retrieval is needed, query the Knowledge Base with the right context, and use the returned information to guide multi-step reasoning. Even in these cases, retrieval remains controlled and predictable - agents can only access Knowledge Bases explicitly connected to the workflow, and all retrieved content is returned in a structured, auditable form.
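
The constraint that agents can only reach explicitly connected Knowledge Bases can be sketched as a small gatekeeper. The `Workflow` class, its method name, and the shape of the returned record are hypothetical; each connected Knowledge Base is represented here by a plain search function.

```python
from typing import Callable, Dict, List

SearchFn = Callable[[str], List[Dict]]


class Workflow:
    """Hypothetical workflow that exposes only its connected Knowledge Bases."""

    def __init__(self, connected: Dict[str, SearchFn]) -> None:
        self._connected = dict(connected)

    def query_knowledge_base(self, name: str, query: str) -> Dict:
        # An agent may only reach Knowledge Bases connected to this workflow.
        if name not in self._connected:
            raise PermissionError(f"Knowledge Base '{name}' is not connected to this workflow")
        results = self._connected[name](query)
        # Return a structured record so every retrieval can be audited later.
        return {"knowledge_base": name, "query": query, "results": results}


def search_policies(query: str) -> List[Dict]:
    # Stand-in for a real retrieval call.
    return [{"content": "Refunds are issued within 30 days.", "source": "policies.pdf"}]


workflow = Workflow({"policies": search_policies})
answer = workflow.query_knowledge_base("policies", "refund window")
print(answer["results"][0]["source"])  # policies.pdf
```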

Why Knowledge Bases Are Important

Knowledge Bases turn raw data into a form that AI workflows can use. They help teams:
  • Bring unstructured knowledge into AI systems
  • Maintain clean separation between data, configuration, and workflows
  • Ensure consistent, compliant access to knowledge within the company
Knowledge Bases turn static documents into usable knowledge inside AI Squared.