OSS-first docs

These docs teach the open system first: contracts, generated surfaces, runtimes, governance, and incremental adoption. Studio shows up as the operating layer on top, not as the source of truth.

provider-ranking.benchmark.run-custom

Launch a custom benchmark evaluation against a specific model.

  • Type: operation (command)
  • Version: 1.0.0
  • Tags: custom, eval
  • File: packages/libs/contracts-spec/src/provider-ranking/commands/benchmarkRunCustom.command.ts

    Goal

    Evaluate model performance using internal eval suites.

    Context

    Used by operators to run proprietary benchmarks and compare models.

    Source Definition

    export const BenchmarkRunCustomCommand = defineCommand({
      meta: {
        key: 'provider-ranking.benchmark.run-custom',
        title: 'Run Custom Benchmark',
        version: '1.0.0',
        description:
          'Launch a custom benchmark evaluation against a specific model.',
        goal: 'Evaluate model performance using internal eval suites.',
        context:
          'Used by operators to run proprietary benchmarks and compare models.',
        domain: PROVIDER_RANKING_DOMAIN,
        owners: PROVIDER_RANKING_OWNERS,
        tags: [...PROVIDER_RANKING_TAGS, 'custom', 'eval'],
        stability: PROVIDER_RANKING_STABILITY,
        docId: [docId('docs.tech.provider-ranking.benchmark.run-custom')],
      },
      capability: {
        key: 'provider-ranking.system',
        version: '1.0.0',
      },
      io: {
        input: BenchmarkRunCustomInput,
        output: BenchmarkRunCustomOutput,
      },
      policy: {
        auth: 'user',
        pii: [],
      },
      sideEffects: {
        emits: [
          {
            ref: BenchmarkCustomCompletedEvent.meta,
            when: 'Custom benchmark evaluation finishes execution.',
          },
        ],
      },
    });
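
The definition above declares a contract but not the code behind it, so a minimal sketch of what a handler for this command might look like can help when reading the `policy` and `sideEffects` blocks. Everything below is an assumption made for illustration: the `RunCustomInput`, `RunCustomOutput`, and `CompletedEvent` shapes, the `Caller` type, the event key string, and the `handleRunCustom` function are hypothetical and do not come from the contracts-spec package. The real `BenchmarkRunCustomInput`, `BenchmarkRunCustomOutput`, and `BenchmarkCustomCompletedEvent` schemas are not shown on this page.

```typescript
// Hypothetical input/output shapes; the real schemas are not shown in this page.
interface RunCustomInput {
  modelId: string;
  suiteId: string;
}

interface RunCustomOutput {
  runId: string;
  status: 'completed';
}

// Hypothetical event shape standing in for BenchmarkCustomCompletedEvent.
interface CompletedEvent {
  key: string;
  payload: { runId: string; modelId: string; score: number };
}

// Hypothetical caller identity, used to mirror `policy.auth: 'user'`.
interface Caller {
  userId?: string;
}

type Scorer = (input: RunCustomInput) => number;
type Emit = (event: CompletedEvent) => void;

// Assumed event key, chosen for this sketch only.
const COMPLETED_EVENT_KEY = 'provider-ranking.benchmark.custom.completed';

function handleRunCustom(
  caller: Caller,
  input: RunCustomInput,
  score: Scorer,
  emit: Emit,
): RunCustomOutput {
  // Mirrors `policy.auth: 'user'`: reject callers without a user identity.
  if (!caller.userId) {
    throw new Error('provider-ranking.benchmark.run-custom requires a user');
  }

  // Derive a deterministic run id so the example is easy to follow.
  const runId = `run-${input.modelId}-${input.suiteId}`;

  // Run the evaluation through the injected scorer.
  const value = score(input);

  // Mirrors `sideEffects.emits`: publish once the evaluation has finished.
  emit({
    key: COMPLETED_EVENT_KEY,
    payload: { runId, modelId: input.modelId, score: value },
  });

  return { runId, status: 'completed' };
}

// Usage: collect emitted events in memory and run one evaluation.
const emitted: CompletedEvent[] = [];
const result = handleRunCustom(
  { userId: 'user-1' },
  { modelId: 'model-a', suiteId: 'suite-1' },
  () => 0.5,
  (event) => {
    emitted.push(event);
  },
);
```

The scorer and the emitter are passed in as parameters so that the sketch stays independent of any particular runtime or event bus. A real handler would receive these from whatever runtime executes the contract.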