OSS-first docs
These docs teach the open system first: contracts, generated surfaces, runtimes, governance, and incremental adoption. Studio shows up as the operating layer on top, not as the source of truth.
Observability and Cost Evidence
ContractSpec routes traces, metrics, structured logs, and economic (cost and usage) evidence through a single blessed spine. One facade emits everything; signals are declared once as contracts; durable economic evidence flows through an append-log so it can be replayed and billed without recomputation.
The single spine
All observability flows through the @lssm-tech/lib.observability package. There is exactly one way to emit each kind of signal — historical rival span helpers (per-app withSpan / withSpanAsync wrappers and ad-hoc metric counters) were removed in favour of the facade, and a lint rule blocks their reintroduction.
Traces & metrics
Spans and measurements flow to OpenTelemetry.
Cost & usage
Economic records flow to a durable evidence port (append-log / outbox) that feeds both observability and billing.
Audit lines
Decisions and state changes flow to the structured logger.
Instrumentation facade
Construct one facade and emit through its methods. It is no-op safe: until an OpenTelemetry SDK is started and an evidence port is bound, it degrades quietly (and warns once) rather than throwing.
import { createInstrumentation } from '@lssm-tech/lib.observability';

const instrumentation = createInstrumentation({
  meterName: '@my-org/service',
  serviceName: 'my-service',
  // evidencePort bound at the app composition root (see "Evidence bus")
});

// trace: wrap an async operation in a span named after the signal
await instrumentation.trace('agent.run', async (span) => {
  span.setAttribute('tenant_ref', tenantRef);
  return runAgent();
});

// meter: duration/size units route to a histogram, everything else to a counter
instrumentation.meter(SOME_SIGNAL, durationMs, { surface_id: 'cockpit' });

// cost / usage: durable economic evidence through the bound port
await instrumentation.cost(evidenceRecord);
await instrumentation.usage(evidenceRecord);

// audit: structured log line for decisions and state changes
instrumentation.audit('budget.policy.evaluated', { verdict: 'allow' });

In client- or SSR-reachable code, import the facade from its subpath @lssm-tech/lib.observability/facade/instrumentation rather than the package root. The root barrel re-exports the Node-only OpenTelemetry SDK bootstrap, which must not be pulled into a browser bundle.
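To make the no-op-safe behaviour and the unit routing described above concrete, here is a minimal sketch. It does not use the real package: createSketchFacade, MetricBackend, and SketchSignal are hypothetical stand-ins, and the real facade may be shaped differently. The sketch assumes only what this page states: without a bound backend, calls degrade quietly and warn once, and duration/size units go to a histogram while everything else goes to a counter.

```typescript
// Hypothetical stand-ins for illustration; not the @lssm-tech/lib.observability API.
type SketchUnit = 'ms' | 's' | 'bytes' | 'count';

interface SketchSignal {
  id: string;
  unit: SketchUnit;
}

type Attributes = Record<string, string>;

interface MetricBackend {
  recordHistogram(id: string, value: number, attrs: Attributes): void;
  addToCounter(id: string, value: number, attrs: Attributes): void;
}

// Units treated as distributions (durations and sizes).
const HISTOGRAM_UNITS: SketchUnit[] = ['ms', 's', 'bytes'];

function createSketchFacade(
  backend?: MetricBackend,
  warn: (message: string) => void = (message) => console.warn(message),
) {
  let warned = false;

  return {
    meter(signal: SketchSignal, value: number, attrs: Attributes = {}): void {
      if (!backend) {
        // Degrade quietly: warn on the first call only, never throw.
        if (!warned) {
          warned = true;
          warn('No metric backend bound; measurements are dropped.');
        }
        return;
      }
      // Duration and size units become histograms; the rest are summed.
      if (HISTOGRAM_UNITS.indexOf(signal.unit) !== -1) {
        backend.recordHistogram(signal.id, value, attrs);
      } else {
        backend.addToCounter(signal.id, value, attrs);
      }
    },
  };
}
```

Passing the warn function in as a parameter keeps the warn-once behaviour easy to assert in tests without capturing console output.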
Signals are contracts
Every metric, trace, cost, or usage signal is declared once as a canonical, locale-free SignalSpec via defineSignal, then referenced everywhere by its id. Ids are machine contracts: lowercase, dotted, and never carrying a locale suffix (only labels and aliases vary per locale).
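The id rules above (lowercase, dotted, no locale suffix) are mechanical enough to check in a lint step or unit test. The following is a sketch only: checkSignalId, the regular expression, and the locale list are assumptions made for illustration and are not part of @lssm-tech/lib.contracts-spec.

```typescript
// Assumption: an illustrative locale list; a real check would use the project's own.
const KNOWN_LOCALE_SUFFIXES = ['en', 'fr', 'es', 'de'];

// Lowercase segments separated by dots, with at least two segments.
const SIGNAL_ID_PATTERN = /^[a-z][a-z0-9_]*(\.[a-z][a-z0-9_]*)+$/;

// Returns a list of problems; an empty list means the id looks canonical.
function checkSignalId(id: string): string[] {
  const problems: string[] = [];
  if (!SIGNAL_ID_PATTERN.test(id)) {
    problems.push('id must be lowercase, dot-separated segments');
  }
  const segments = id.split('.');
  const lastSegment = segments[segments.length - 1].toLowerCase();
  if (KNOWN_LOCALE_SUFFIXES.indexOf(lastSegment) !== -1) {
    problems.push(`id must not end with a locale suffix ("${lastSegment}")`);
  }
  return problems;
}
```

A suffix list like this can misfire on an id whose last segment happens to match a locale code, so treat it as a prompt for review rather than a hard rule.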
import { defineSignal } from '@lssm-tech/lib.contracts-spec';

export const LLM_TOKENS = defineSignal({
  id: 'signal.llm.tokens', // canonical, locale-free
  kind: 'usage',
  unit: 'count',
  dimensions: ['provider.model'],
});

// The facade routes by unit: ms/s/bytes -> histogram, otherwise -> counter.
instrumentation.meter(LLM_TOKENS, tokenCount, { 'provider.model': 'opus' });

Tracing (OpenTelemetry)
Spans use OpenTelemetry under the hood. The SDK bootstrap is explicit and idempotent; when no exporter endpoint is configured it falls back to a console / no-op tracer (never silently dropped). Configure exporters via standard environment variables — see the Distributed tracing guide at /docs/ops/distributed-tracing for collector and exporter setup.
import {
  startObservabilitySdk,
  shutdownObservabilitySdk,
} from '@lssm-tech/lib.observability/sdk/bootstrap';

await startObservabilitySdk(); // NodeSDK + OTLP; console fallback if unset
// ... run the service ...
await shutdownObservabilitySdk();

Structured logs
Logs are structured and correlated, never freeform console output. Carry a correlation id and, where permitted, tenant/actor context — but never secrets or PII, and never a raw tenant_id (use a hashed tenant_ref). Choose levels intentionally: info for state changes, warn for recoverable conditions, error when action is needed.
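As an illustration of these rules, the sketch below builds a correlated JSON log line and redacts fields that should never be logged. It is not the project's logger: buildLogLine, LogContext, and the forbidden-key list are hypothetical, and the tenant_ref is assumed to have been hashed upstream.

```typescript
// Hypothetical helper; field names other than tenant_ref are assumptions.
type LogLevel = 'info' | 'warn' | 'error';

interface LogContext {
  correlationId: string;
  tenantRef?: string; // hashed reference, never the raw tenant_id
}

// Keys that must never reach a log line (illustrative, not exhaustive).
const FORBIDDEN_KEYS = ['tenant_id', 'password', 'token', 'secret', 'email'];

function buildLogLine(
  level: LogLevel,
  event: string,
  context: LogContext,
  fields: Record<string, unknown> = {},
): string {
  const safeFields: Record<string, unknown> = {};
  for (const key of Object.keys(fields)) {
    // Redact rather than throw, so a logging mistake cannot break a request.
    const forbidden = FORBIDDEN_KEYS.indexOf(key.toLowerCase()) !== -1;
    safeFields[key] = forbidden ? '[redacted]' : fields[key];
  }
  return JSON.stringify({
    level,
    event,
    correlation_id: context.correlationId,
    tenant_ref: context.tenantRef,
    ...safeFields,
  });
}
```

For example, buildLogLine('info', 'budget.policy.evaluated', { correlationId: 'req-1', tenantRef: 'a1b2' }, { verdict: 'allow' }) yields one JSON object carrying the level, event, correlation id, hashed reference, and verdict.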
The evidence bus (append-log)
Cost and usage evidence is durable. Instead of a synchronous dual-write, the producing request appends an EvidenceRecord to an outbox inside the same tenant-scoped transaction as the business mutation (append-log as source of truth). A worker relay drains the outbox, projects each record into an idempotent ledger, and emits OTLP — at-least-once, with a unique idempotency key preventing double-counting.
// Producer (api): evidence commits atomically with the mutation
await withTenantTx(pool, tenantId, async (txQuery) => {
  await writeDomainMutation(txQuery);
  await bindEvidencePort(txQuery).appendEvidence(record); // same transaction
});

// Relay (worker): drain -> idempotent project -> mark relayed -> OTLP
// A crash after projecting re-drains the row; projection is a no-op the
// second time, so totals never double-count.

Tenant isolation
The outbox and ledger enforce row-level security on the raw tenant_id, which is stored unhashed because it is the partition key. This applies to storage only; log lines still carry the hashed tenant_ref.
Atomic
If the producing mutation rolls back, no orphan evidence is left behind.
Live read seam
Replay, audit-trail views, and approval/anomaly triggers read live projected evidence through an EvidenceLedgerReadPort instead of recomputing it.
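The relay's crash-safety argument can be checked with a small in-memory model. This is a sketch under stated assumptions, not the real worker: OutboxRow, Ledger, project, and drain are hypothetical names, arrays and maps stand in for the database tables, and the OTLP emission step is left out.

```typescript
// In-memory stand-ins; the real outbox and ledger are database tables.
interface OutboxRow {
  idempotencyKey: string;
  tenantId: string;
  amount: number;
  relayed: boolean;
}

interface Ledger {
  seenKeys: Set<string>;
  totals: Map<string, number>;
}

// Idempotent projection: a key that was already projected changes nothing.
function project(ledger: Ledger, row: OutboxRow): boolean {
  if (ledger.seenKeys.has(row.idempotencyKey)) return false;
  ledger.seenKeys.add(row.idempotencyKey);
  const previous = ledger.totals.get(row.tenantId) ?? 0;
  ledger.totals.set(row.tenantId, previous + row.amount);
  return true;
}

// Drain pending rows: project first, then mark relayed.
// crashAfterProject simulates a crash between those two steps.
function drain(outbox: OutboxRow[], ledger: Ledger, crashAfterProject = false): void {
  for (const row of outbox) {
    if (row.relayed) continue;
    project(ledger, row);
    if (crashAfterProject) return; // row stays unrelayed and is re-drained later
    row.relayed = true;
  }
}
```

Running drain once with crashAfterProject set and then again without it re-processes the first row, yet the tenant total counts that row only once. That is the at-least-once delivery plus idempotency key combination described above.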
Agents self-meter by default
Agent runs are instrumented out of the box. Locale resolution, system prompt size, retrieval stage hits, and run economics flow through the facade automatically — supply your own hooks to override, or pass disableDefaultTelemetry to opt out (for example in tests that assert silence).
import { createInstrumentation } from '@lssm-tech/lib.observability';

// Default-on: telemetry flows automatically.
const agent = await ContractSpecAgent.create({ spec });

// Or bind your own instrumentation / opt out:
const agent2 = await ContractSpecAgent.create({
  spec,
  instrumentation: createInstrumentation({ evidencePort }),
});

const agent3 = await ContractSpecAgent.create({
  spec,
  disableDefaultTelemetry: true,
});

Product analytics (PostHog)
PostHog event names, custom property names, and per-surface $lib values live in a single source of truth and are referenced by key — never inlined as string literals. Renaming a wire value breaks dashboards, so values are added, never repurposed, and a lock test pins them to their historical strings.
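One way to picture the registry and its lock test is the sketch below. The keys and wire strings are invented for illustration, and ANALYTICS_EVENTS, LOCKED_WIRE_VALUES, and findLockViolations are hypothetical names, not the project's actual module.

```typescript
// Hypothetical registry; these keys and wire values are invented examples.
const ANALYTICS_EVENTS = {
  agentRunStarted: 'agent_run_started',
  agentRunCompleted: 'agent_run_completed',
} as const;

type AnalyticsEventKey = keyof typeof ANALYTICS_EVENTS;

// Call sites reference events by key, never by string literal.
function eventName(key: AnalyticsEventKey): string {
  return ANALYTICS_EVENTS[key];
}

// The lock: a separate, hand-maintained copy of the historical wire strings.
const LOCKED_WIRE_VALUES: Record<string, string> = {
  agentRunStarted: 'agent_run_started',
  agentRunCompleted: 'agent_run_completed',
};

// Returns every locked key whose wire value was changed or removed.
// Keys that exist only in `current` are new additions and are allowed.
function findLockViolations(
  current: Record<string, string>,
  locked: Record<string, string>,
): string[] {
  return Object.keys(locked).filter((key) => current[key] !== locked[key]);
}
```

A lock test would assert that findLockViolations(ANALYTICS_EVENTS, LOCKED_WIRE_VALUES) is empty, so renaming a wire value fails the build while adding a new key does not.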
Operating checklist
Emit through the facade — never reintroduce ad-hoc span or metric helpers.
Declare every signal with defineSignal; reference it by canonical id.
Bind the evidence port at the app composition root, and append evidence inside the same tenant transaction as the mutation.
Set OTEL_EXPORTER_OTLP_ENDPOINT per environment; keep keys in env, not code.
Never log secrets, PII, or a raw tenant_id; correlate with ids and hashed refs.
In client/SSR code, import the facade from /facade/instrumentation, not the root barrel.
Workflow monitoring
Observe multi-step execution with enough context to understand failures and regressions.
Distributed tracing
Trace contract execution across integrations, workflows, and generated surfaces.