Knowledge Binding

Knowledge binding connects your app's workflows and agents to structured knowledge spaces. This enables semantic search, RAG (Retrieval-Augmented Generation), and context-aware decision-making.

How it works

Knowledge binding follows a three-layer model:

  1. KnowledgeSpaceSpec (global) - Defines a logical knowledge domain
  2. KnowledgeSourceConfig (per-tenant) - Declares the tenant's data sources that feed each space
  3. AppKnowledgeBinding (per-app) - Maps spaces to the workflows and agents allowed to use them
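
As a rough illustration of how the three layers reference each other, the sketch below models them as TypeScript types. The field names are inferred from the examples on this page, and the helper resolveSources is hypothetical; the real type definitions may differ.

```typescript
// Hypothetical type shapes inferred from the examples on this page;
// the real definitions may carry more fields or different names.
type KnowledgeCategory = "canonical" | "operational" | "external" | "ephemeral";

// Layer 1 (global): a logical knowledge domain.
interface KnowledgeSpaceSpec {
  spaceId: string;
  category: KnowledgeCategory;
  purpose: string;
}

// Layer 2 (per-tenant): a data source that feeds one space.
interface KnowledgeSourceConfig {
  id: string;
  tenantId: string;
  spaceId: string;
  kind: string;
  location: string;
}

// Layer 3 (per-app): which workflows and agents may read a space.
interface AppKnowledgeBinding {
  spaceId: string;
  enabled: boolean;
  allowedConsumers: { workflowIds?: string[]; agentIds?: string[] };
  sources: string[];
}

// Hypothetical helper: follow the ids from a binding to the tenant sources
// that feed the same space and are listed on the binding.
function resolveSources(
  binding: AppKnowledgeBinding,
  sources: KnowledgeSourceConfig[],
): KnowledgeSourceConfig[] {
  return sources.filter(
    (s) => s.spaceId === binding.spaceId && binding.sources.indexOf(s.id) !== -1,
  );
}
```

The binding never embeds source details; it only points at source ids, so a tenant can swap a source without touching the app-level binding.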

Example: Support agent with RAG

Let's build a support agent that uses canonical product documentation and operational support history.

Step 1: Blueprint declares knowledge needs

// AppBlueprintSpec
{
  id: "support-app",
  version: "1.0.0",
  knowledgeSpaces: [
    {
      spaceId: "product-canon",
      category: "canonical",
      required: true,
      purpose: "Official product documentation and specs"
    },
    {
      spaceId: "support-history",
      category: "operational",
      required: true,
      purpose: "Past support tickets and resolutions"
    },
    {
      spaceId: "external-docs",
      category: "external",
      required: false,
      purpose: "Third-party integration documentation"
    }
  ]
}

Step 2: Tenant configures sources

// KnowledgeSourceConfig (per-tenant)
[
  {
    id: "src_notion_product_docs",
    tenantId: "acme-corp",
    spaceId: "product-canon",
    kind: "notion",
    location: "https://notion.so/acme/product-docs",
    syncPolicy: { interval: "1h" },
    lastSyncedAt: "2025-01-15T10:00:00Z"
  },
  {
    id: "src_gmail_support_threads",
    tenantId: "acme-corp",
    spaceId: "support-history",
    kind: "gmail",
    location: "support@acme.com",
    syncPolicy: { webhook: true },
    lastSyncedAt: "2025-01-15T10:30:00Z"
  },
  {
    id: "src_stripe_docs",
    tenantId: "acme-corp",
    spaceId: "external-docs",
    kind: "url",
    location: "https://stripe.com/docs",
    syncPolicy: { interval: "24h" },
    lastSyncedAt: "2025-01-15T00:00:00Z"
  }
]
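
Because the blueprint marks some spaces as required, it can be useful to check that every required space has at least one tenant source before enabling the app. The following is a minimal sketch of such a check; missingRequiredSpaces and the trimmed-down interfaces are illustrative assumptions, not part of the documented API.

```typescript
// Hypothetical shapes: only the fields this check needs.
interface SpaceRequirement {
  spaceId: string;
  required: boolean;
}
interface SourceRef {
  id: string;
  spaceId: string;
}

// Return the ids of required spaces that no tenant source feeds.
function missingRequiredSpaces(
  requirements: SpaceRequirement[],
  sources: SourceRef[],
): string[] {
  return requirements
    .filter((r) => r.required)
    .filter((r) => !sources.some((s) => s.spaceId === r.spaceId))
    .map((r) => r.spaceId);
}

const blueprintSpaces: SpaceRequirement[] = [
  { spaceId: "product-canon", required: true },
  { spaceId: "support-history", required: true },
  { spaceId: "external-docs", required: false },
];

// Only the Notion source is configured here, so support-history is reported.
// external-docs is optional and therefore never reported.
const missingSpaces = missingRequiredSpaces(blueprintSpaces, [
  { id: "src_notion_product_docs", spaceId: "product-canon" },
]);
console.log(missingSpaces);
```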

Step 3: TenantAppConfig binds spaces

// TenantAppConfig
{
  tenantId: "acme-corp",
  blueprintId: "support-app",
  blueprintVersion: "1.0.0",
  knowledgeBindings: [
    {
      spaceId: "product-canon",
      enabled: true,
      allowedConsumers: {
        workflowIds: ["answer-question", "generate-docs"],
        agentIds: ["support-agent", "docs-agent"]
      },
      allowedCategories: ["canonical"],
      sources: ["src_notion_product_docs"]
    },
    {
      spaceId: "support-history",
      enabled: true,
      allowedConsumers: {
        workflowIds: ["answer-question", "escalate-ticket"],
        agentIds: ["support-agent"]
      },
      allowedCategories: ["operational"],
      sources: ["src_gmail_support_threads"]
    },
    {
      spaceId: "external-docs",
      enabled: true,
      allowedConsumers: {
        agentIds: ["support-agent"]
      },
      allowedCategories: ["external"],
      sources: ["src_stripe_docs"]
    }
  ]
}
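
To make the effect of allowedConsumers concrete, here is a small default-deny check written against the binding shape above. It is a sketch for illustration only: canAccess and the Consumer type are hypothetical, and actual enforcement happens in the platform's policy layer rather than in application code like this.

```typescript
// Hypothetical shapes mirroring the knowledgeBindings entries above.
interface Binding {
  spaceId: string;
  enabled: boolean;
  allowedConsumers: { workflowIds?: string[]; agentIds?: string[] };
}
interface Consumer {
  kind: "workflow" | "agent";
  id: string;
}

// Default deny: access is granted only when an enabled binding for the space
// lists the consumer explicitly.
function canAccess(bindings: Binding[], spaceId: string, consumer: Consumer): boolean {
  return bindings.some((b) => {
    if (b.spaceId !== spaceId || !b.enabled) return false;
    const ids =
      consumer.kind === "workflow"
        ? b.allowedConsumers.workflowIds
        : b.allowedConsumers.agentIds;
    return (ids || []).indexOf(consumer.id) !== -1;
  });
}

const supportBindings: Binding[] = [
  {
    spaceId: "support-history",
    enabled: true,
    allowedConsumers: {
      workflowIds: ["answer-question", "escalate-ticket"],
      agentIds: ["support-agent"],
    },
  },
  {
    spaceId: "external-docs",
    enabled: true,
    allowedConsumers: { agentIds: ["support-agent"] },
  },
];

// The agent may read external-docs, but the answer-question workflow may not,
// because the binding lists no workflowIds for that space.
const agentAllowed = canAccess(supportBindings, "external-docs", { kind: "agent", id: "support-agent" });
const workflowAllowed = canAccess(supportBindings, "external-docs", { kind: "workflow", id: "answer-question" });
console.log(agentAllowed, workflowAllowed);
```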

Step 4: Workflow uses knowledge

# WorkflowSpec
workflowId: answer-question
version: '1.0.0'

steps:
  - id: generate-embedding
    capability: openai-embeddings
    inputs:
      text: ${input.question}
  
  - id: search-canonical
    capability: vector.search
    inputs:
      collection: "product-canon"
      vector: ${steps.generate-embedding.output.embedding}
      limit: 5
  
  - id: search-support-history
    capability: vector.search
    inputs:
      collection: "support-history"
      vector: ${steps.generate-embedding.output.embedding}
      limit: 3
  
  - id: generate-answer
    capability: openai-chat
    inputs:
      messages:
        - role: "system"
          content: |
            You are a support agent. Answer based on:
            1. Canonical docs (authoritative)
            2. Support history (helpful context)
            Only use external docs for integration questions.
        - role: "user"
          content: |
            Question: ${input.question}
            
            Canonical docs:
            ${steps.search-canonical.output.results}
            
            Support history:
            ${steps.search-support-history.output.results}
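
The final step of the workflow interpolates both result sets into one prompt. The same assembly can be sketched as a pure TypeScript function, which is convenient for unit-testing the prompt layout without calling any model. buildAnswerMessages and ChatMessage are hypothetical names, and the sketch assumes search results have already been reduced to plain strings.

```typescript
// Hypothetical message shape; it only models the two roles used above.
interface ChatMessage {
  role: "system" | "user";
  content: string;
}

// Assemble the same system and user messages as the generate-answer step.
function buildAnswerMessages(
  question: string,
  canonical: string[],
  supportHistory: string[],
): ChatMessage[] {
  const system = [
    "You are a support agent. Answer based on:",
    "1. Canonical docs (authoritative)",
    "2. Support history (helpful context)",
    "Only use external docs for integration questions.",
  ].join("\n");

  const user = [
    `Question: ${question}`,
    "",
    "Canonical docs:",
    canonical.join("\n"),
    "",
    "Support history:",
    supportHistory.join("\n"),
  ].join("\n");

  return [
    { role: "system", content: system },
    { role: "user", content: user },
  ];
}
```

Keeping canonical and support-history results in separate labeled sections lets the model weigh them differently, as the system message instructs.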

Category-based access control

Different knowledge categories have different trust levels and access patterns:

Category     Trust Level  Use Cases                                       Policy Impact
canonical    Highest      Product specs, schemas, official policies       Can drive policy decisions
operational  High         Support tickets, sales docs, internal runbooks  Can inform decisions
external     Medium       Third-party docs, regulations, PSP guides       Reference only, not authoritative
ephemeral    Low          Agent scratchpads, session context, drafts      Never used for decisions
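
One way to apply these trust levels in code is to encode the table as lookup maps and order retrieved results by trust. The sketch below is illustrative: the numeric ranks and the names TRUST_RANK, POLICY_IMPACT, mayInformDecision, and sortByTrust are assumptions chosen for this example, not documented values.

```typescript
type KnowledgeCategory = "canonical" | "operational" | "external" | "ephemeral";
type PolicyImpact = "drive" | "inform" | "reference" | "never";

// Assumed numeric ranks; only their relative order comes from the table.
const TRUST_RANK: Record<KnowledgeCategory, number> = {
  canonical: 3,
  operational: 2,
  external: 1,
  ephemeral: 0,
};

// The "Policy Impact" column, encoded as a lookup.
const POLICY_IMPACT: Record<KnowledgeCategory, PolicyImpact> = {
  canonical: "drive",
  operational: "inform",
  external: "reference",
  ephemeral: "never",
};

// Only canonical and operational knowledge may feed a decision.
function mayInformDecision(category: KnowledgeCategory): boolean {
  const impact = POLICY_IMPACT[category];
  return impact === "drive" || impact === "inform";
}

// Return a copy of the results ordered from most to least trusted.
function sortByTrust<T extends { category: KnowledgeCategory }>(results: T[]): T[] {
  return results
    .slice()
    .sort((a, b) => TRUST_RANK[b.category] - TRUST_RANK[a.category]);
}
```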

Multi-space workflows

Workflows can query multiple knowledge spaces and combine results:

knowledgeBindings: [
  {
    spaceId: "product-canon",
    enabled: true,
    allowedConsumers: {
      workflowIds: ["invoice-generation", "quote-creation"]
    },
    allowedCategories: ["canonical"],
    sources: ["src_database_schema", "src_product_catalog"]
  },
  {
    spaceId: "pricing-rules",
    enabled: true,
    allowedConsumers: {
      workflowIds: ["invoice-generation", "quote-creation"]
    },
    allowedCategories: ["canonical", "operational"],
    sources: ["src_pricing_database", "src_discount_policies"]
  },
  {
    spaceId: "customer-history",
    enabled: true,
    allowedConsumers: {
      workflowIds: ["quote-creation"]
    },
    allowedCategories: ["operational"],
    sources: ["src_crm_data", "src_past_invoices"]
  }
]
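
Given bindings like these, the set of spaces a workflow can combine follows directly from allowedConsumers. The helper below is a hypothetical sketch that lists those spaces; it assumes the binding shape shown above and is not a documented function.

```typescript
// Hypothetical shape: only the fields needed to list a workflow's spaces.
interface WorkflowBinding {
  spaceId: string;
  enabled: boolean;
  allowedConsumers: { workflowIds?: string[] };
}

// List the enabled spaces whose binding names the given workflow.
function spacesForWorkflow(bindings: WorkflowBinding[], workflowId: string): string[] {
  return bindings
    .filter((b) => b.enabled)
    .filter((b) => (b.allowedConsumers.workflowIds || []).indexOf(workflowId) !== -1)
    .map((b) => b.spaceId);
}

const billingBindings: WorkflowBinding[] = [
  { spaceId: "product-canon", enabled: true, allowedConsumers: { workflowIds: ["invoice-generation", "quote-creation"] } },
  { spaceId: "pricing-rules", enabled: true, allowedConsumers: { workflowIds: ["invoice-generation", "quote-creation"] } },
  { spaceId: "customer-history", enabled: true, allowedConsumers: { workflowIds: ["quote-creation"] } },
];

// invoice-generation sees two spaces; quote-creation also sees customer-history.
const invoiceSpaces = spacesForWorkflow(billingBindings, "invoice-generation");
const quoteSpaces = spacesForWorkflow(billingBindings, "quote-creation");
console.log(invoiceSpaces, quoteSpaces);
```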

Security & validation

  • Knowledge sources are validated before each sync - credentials and permissions are checked
  • The policy decision point (PDP) enforces which workflows and agents can access which spaces
  • All knowledge queries are audited, including search terms and results
  • Canonical knowledge is immutable once indexed - changes require a re-sync from the source
  • Ephemeral knowledge is automatically purged based on retention policies
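
To illustrate what an audited query might record, the sketch below wraps a search function so that every call appends an entry with the search terms and the ids of the returned results. AuditEntry, SearchHit, and auditedSearch are hypothetical names invented for this example; the platform's actual audit format is not specified here.

```typescript
// Hypothetical audit record; the real audit schema may differ.
interface AuditEntry {
  consumerId: string;
  spaceId: string;
  searchTerms: string;
  resultIds: string[];
  at: string; // ISO 8601 timestamp
}
interface SearchHit {
  id: string;
  text: string;
}
type SearchFn = (spaceId: string, searchTerms: string) => SearchHit[];

// Run the search, then record who asked, what they asked, and what came back.
function auditedSearch(
  auditLog: AuditEntry[],
  search: SearchFn,
  consumerId: string,
  spaceId: string,
  searchTerms: string,
): SearchHit[] {
  const hits = search(spaceId, searchTerms);
  auditLog.push({
    consumerId: consumerId,
    spaceId: spaceId,
    searchTerms: searchTerms,
    resultIds: hits.map((h) => h.id),
    at: new Date().toISOString(),
  });
  return hits;
}
```

Recording result ids rather than full result text keeps the audit trail small while still allowing a reviewer to reconstruct what a consumer saw.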

Best practices

  • Use canonical spaces for policy-critical decisions, operational for suggestions
  • Never allow workflows to write to canonical spaces - maintain read-only access
  • Set up monitoring for sync failures and stale knowledge sources
  • Document the purpose and trust level of each knowledge space
  • Test knowledge queries in sandbox before promoting to production
  • Use explicit allowedConsumers - avoid wildcard access
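
Monitoring for stale sources can start from the syncPolicy and lastSyncedAt fields shown in Step 2. The following sketch flags interval-based sources whose last sync is older than a multiple of their interval. The tolerance factor, the supported units (m, h, d), and the function names are assumptions for illustration; only values such as "1h" and "24h" appear in the examples on this page.

```typescript
// Hypothetical shape: the sync-related fields of a KnowledgeSourceConfig.
interface SourceSync {
  id: string;
  syncPolicy: { interval?: string; webhook?: boolean };
  lastSyncedAt: string; // ISO 8601 timestamp
}

// Parse intervals such as "1h" or "24h"; returns null for unknown formats.
function intervalToMs(interval: string): number | null {
  const match = /^(\d+)([mhd])$/.exec(interval);
  if (!match) return null;
  const unitMs: { [unit: string]: number } = { m: 60000, h: 3600000, d: 86400000 };
  return parseInt(match[1], 10) * unitMs[match[2]];
}

// A source is considered stale when its last sync is older than
// tolerance * interval. Webhook-driven sources have no interval and are skipped.
function staleSources(sources: SourceSync[], now: Date, tolerance: number = 2): string[] {
  return sources
    .filter((s) => {
      if (!s.syncPolicy.interval) return false;
      const ms = intervalToMs(s.syncPolicy.interval);
      if (ms === null) return false;
      return now.getTime() - Date.parse(s.lastSyncedAt) > ms * tolerance;
    })
    .map((s) => s.id);
}

const tenantSources: SourceSync[] = [
  { id: "src_notion_product_docs", syncPolicy: { interval: "1h" }, lastSyncedAt: "2025-01-15T10:00:00Z" },
  { id: "src_gmail_support_threads", syncPolicy: { webhook: true }, lastSyncedAt: "2025-01-15T10:30:00Z" },
  { id: "src_stripe_docs", syncPolicy: { interval: "24h" }, lastSyncedAt: "2025-01-15T00:00:00Z" },
];

// At 13:00 UTC the hourly Notion source is three hours behind, which exceeds
// the two-hour tolerance; the daily Stripe source is still within its window.
const staleIds = staleSources(tenantSources, new Date("2025-01-15T13:00:00Z"));
console.log(staleIds);
```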