Kendr.org

Kendr hosted vector DB is the managed storage layer for Cloud KB retrieval.

Kendr hosted vector DB stores the embedded chunks created by Cloud Knowledge Bases and exposes them through the KB test, retrieval, artifact, run, and sharing APIs. Product teams can use it without operating a separate vector database, while advanced users still see the model map, embedding dimensions, retrieval scores, and rebuild requirements.

Provider: kendr-cloud · Per-KB dimensions · Hybrid retrieval · Team access policy

What it is

The hosted vector DB is not a separate low-level database product that callers insert arbitrary vectors into directly. It is the managed storage target behind Cloud KBs. Kendr owns extraction, cleaning, chunking, embedding, storage, retrieval, reranking, evaluation, and access enforcement through the /api/kb/cloud contract.

For each capability, here is what Kendr exposes and why users care:

  • Managed collections: Each Cloud KB stores its own sources, chunks, vector dimensions, active model map, and run state. Teams do not need to provision, tune, or migrate a separate vector database before using RAG.
  • Inspectable artifacts: Extraction, cleaning, chunking, embedding, and storage artifacts can be retained as preview or full JSON. Users can debug bad answers by inspecting the actual material that entered retrieval.
  • Explainable retrieval: Test queries return vector, lexical, rerank, and final scores for selected chunks. Advanced users can see whether retrieval failed because of source quality, chunking, embedding, or reranking.
  • Access enforcement: KB visibility, user grants, team grants, source access metadata, and owner checks are stored with the KB. Shared team KBs can be reused without making every KB automatically visible to every user.

What gets stored

A Cloud KB keeps enough metadata to make retrieval auditable and rebuildable. Clients should depend on the public fields and not on the physical storage engine behind kendr-cloud.

1. KB metadata: Name, status, owner, pipeline version, embedding dimensions, active run id, active model map, and access policy.
2. Sources and chunks: Source records, chunk text, source references, page references, token or word counts, section path, and diagnostics.
3. Embeddings: Chunk embeddings generated by the configured provider and model, stored with a known dimension count.
4. Operations history: Pipeline runs, retained artifacts, evaluations, warnings, timings, usage events, and rebuild status.
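Taken together, these pieces can be pictured as one record per KB. The TypeScript sketch below is illustrative only: the property names are inferred from the list above, not taken from the published OpenAPI schema.

```typescript
// Illustrative shape only: property names are inferred from the stored
// metadata described above, not from the published schema.
interface CloudKbChunk {
  text: string;
  source_ref: string;      // which source record the chunk came from
  page_ref?: number;       // page reference, when the source has pages
  word_count: number;
  section_path: string[];  // e.g. ["Runbook", "Alerts", "Paging"]
}

interface CloudKbRecord {
  name: string;
  status: string;
  owner: string;
  pipeline_version: string;
  embedding_dimensions: number; // all stored vectors share this length
  active_run_id: string | null;
  chunks: CloudKbChunk[];
  embeddings: number[][];       // one vector per chunk
}

// A KB is internally consistent when every stored embedding matches the
// declared dimension count; a mismatch signals that a rebuild is needed.
function isDimensionConsistent(kb: CloudKbRecord): boolean {
  return kb.embeddings.every((v) => v.length === kb.embedding_dimensions);
}
```

The consistency check mirrors the rebuild rule later in this page: if the declared dimension count and the stored vectors disagree, retrieval cannot compare query and chunk embeddings.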

Retrieval contract

Retrieval happens through the Cloud KB test and answer flow. The API can run vector, lexical, or hybrid matching, then apply a visible reranker mode before returning selected chunks and an optional answer.

  • Vector match (vector_score): How close the query embedding is to the chunk embedding after normalization.
  • Lexical match (lexical_score): How much exact or token-level query language overlaps with the chunk text.
  • Reranker (rerank_score): Additional ordering signal from the configured reranker or visible heuristic fallback.
  • Final order (final_score): The combined score used to order returned chunks for answer generation.
  • Diagnostics (diagnostics): Warnings or notes about chunk quality, metadata, source spread, and retrieval behavior.
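The relationship between these signals can be sketched as a weighted combination. The 0.6/0.3/0.1 weights below are illustrative assumptions, not the weights Kendr actually uses; the point is only that final_score is derived from the other three and drives the returned order.

```typescript
// Sketch of hybrid ranking: combine the returned signals into a final
// ordering score. The weights are illustrative assumptions, not Kendr's.
interface RetrievedChunk {
  text: string;
  vector_score: number;   // embedding closeness after normalization
  lexical_score: number;  // exact or token-level overlap with the query
  rerank_score: number;   // reranker or visible heuristic fallback signal
  final_score?: number;
}

function rankChunks(chunks: RetrievedChunk[]): RetrievedChunk[] {
  return chunks
    .map((c) => ({
      ...c,
      final_score:
        0.6 * c.vector_score + 0.3 * c.lexical_score + 0.1 * c.rerank_score,
    }))
    .sort((a, b) => b.final_score! - a.final_score!);
}
```

Because all four scores come back on every selected chunk, a caller can re-derive the ordering and see which signal dominated a given result.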

Dimensions and rebuilds

Vector dimensions are part of the KB metadata because retrieval only works when query embeddings and stored chunk embeddings use the same dimensionality. If the embedding model or dimension count changes, clients should trigger a rebuild from the embedding stage or earlier.

Rule of thumb

  • Cleaning or chunking changes require rechunking and reembedding.
  • Embedding provider, model, or dimension changes require reembedding and vector-store replacement.
  • Retrieval weight changes can be tested without rebuilding stored chunks.
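The same rule of thumb can be expressed as a lookup from what changed to the earliest pipeline stage a rebuild must restart from. Only 'embeddings' appears in the documented rebuild call; the other stage names here are assumptions for illustration.

```typescript
// Map a kind of configuration change to the earliest rebuild stage.
// Stage names other than 'embeddings' are assumptions, not documented values.
type ChangeKind =
  | 'cleaning'
  | 'chunking'
  | 'embedding_model'
  | 'dimensions'
  | 'retrieval_weights';

function rebuildStageFor(change: ChangeKind): string | null {
  switch (change) {
    case 'cleaning':
      return 'cleaning';   // rechunk and reembed everything downstream
    case 'chunking':
      return 'chunking';   // rechunk and reembed everything downstream
    case 'embedding_model':
    case 'dimensions':
      return 'embeddings'; // reembed and replace the stored vectors
    case 'retrieval_weights':
      return null;         // no rebuild needed: retest retrieval instead
  }
}
```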

await client.rebuildCloudKnowledgeBase(kbId, {
  from_stage: 'embeddings',
  pipeline_config: {
    embeddings: {
      provider: 'kendr-managed',
      model: 'kendr-managed-hash-embedding-v1',
      dimensions: 128
    },
    vector_store: { provider: 'kendr-cloud' }
  }
});

Sharing and security

Hosted vectors inherit the Cloud KB access policy. Sharing happens at the KB and source-access level, not by handing users a raw vector index. This keeps the product simple for teams and leaves room for stricter policy enforcement as organization features mature.

  • visibility: private keeps the KB owner-scoped.
  • visibility: team allows team access with explicit role metadata.
  • User and team grants can be replaced through POST /api/kb/cloud/{kb_id}/access.
  • Source access metadata records which sources are inherited, restricted, or blocked from shared use.
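Replacing grants through the documented endpoint can be sketched with a plain HTTP call. The path comes from the list above; the request body shape, role names, and auth header are assumptions for illustration, not the published contract.

```typescript
// Sketch: replace the user and team grants on a Cloud KB via the
// documented POST /api/kb/cloud/{kb_id}/access endpoint. The body shape
// (visibility, user_grants, team_grants, role names) is an assumption.
async function replaceKbAccess(
  baseUrl: string,
  kbId: string,
  token: string
): Promise<void> {
  const res = await fetch(`${baseUrl}/api/kb/cloud/${kbId}/access`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${token}`, // auth scheme assumed
    },
    body: JSON.stringify({
      visibility: 'team',
      user_grants: [{ user_id: 'u_123', role: 'reader' }],
      team_grants: [{ team_id: 't_456', role: 'editor' }],
    }),
  });
  if (!res.ok) {
    throw new Error(`access update failed: ${res.status}`);
  }
}
```

Because grants are replaced rather than patched, a caller should send the complete intended grant lists on every call.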

Billing signals

Hosted vector DB usage is billed through Cloud KB events rather than a separate vector API meter. Kendr records credits for indexing, storage estimates, query retrieval, reranking, answer generation, and rebuilds. Use the estimate endpoint before creating or rebuilding large KBs.

const estimate = await client.estimateCloudKnowledgeBase({
  sources: [
    { kind: 'file', label: 'Runbook', value: 'runbook.md', content: largeMarkdown }
  ],
  pipeline_config: {
    chunking: { strategy: 'semantic', target_words: 180 },
    vector_store: { provider: 'kendr-cloud' }
  }
});

console.log(estimate.estimated_chunks, estimate.estimated_credits);

Current boundaries

The public contract is intentionally higher level than a raw vector database. Today, callers create and query vectors through Cloud KBs. Direct arbitrary vector upsert, namespace-level vector search outside a KB, bring-your-own-index replication, and raw index export are not public API surfaces yet.

Implementation detail

Clients should treat kendr-cloud as the durable provider name and rely on OpenAPI, SDK methods, model maps, and returned score fields. Kendr can evolve the physical vector engine behind that provider without changing the customer-facing KB contract.

Where to go next