System of record + effectiveness monitor

Mission control for agentic assets across AI coding harnesses.

See what exists, what is governed, what runs, what earns its place, and what needs action, across every harness. The unit of analysis is the reusable agentic asset itself, not the run or the span.

Try the demo · How telemetry works · Public repo coming soon

Claude Code, Codex, Copilot, Grok · Demo at zero Fabric cost · Live on Microsoft Fabric

Monitor / asset signal

skill: seed-data-generator full

subagent: model-builder partial

mcp: docs-server derived

prompt: release-notes none

Illustrative panel - no real metrics

The problem / why now

Nothing measures the asset, only the application around it.

Tracing and observability tools each watch a single application. The reusable asset that moves between harnesses falls through the gap. That gap is the wedge.

Single-app blind spot

No cross-harness view

No trace tool answers how one skill performs in Claude Code vs Codex vs Copilot vs Grok, because each one watches a single application.

Wrong unit of analysis

Runs and spans, not assets

LLM observability tools like LangSmith and Langfuse instrument the runs and spans of an app you built. The Vault's unit of analysis is the reusable agentic asset itself.

Distribution-only rivals

Catalogs without effectiveness

The competitive set consists of nascent skill catalogs that handle distribution only and offer nothing on effectiveness. That whitespace is the wedge.

Trace observability vs The Living Vault
Dimension	Trace observability	The Living Vault
Unit of analysis	Run, span, trace of one application	Reusable skill, subagent, MCP server, prompt
Scope	Single app you built and instrumented	Cross-harness: Claude Code, Codex, Copilot, Grok
Registry	No editable system of record for assets	Governed catalog you can add, edit, and retire
Instrumentation honesty	Assumes full trace coverage	Per-metric fidelity grades; gaps shown, never faked

Operational promise

Seven questions, answered at a glance.

The Vault is an operations monitor, not a passive catalogue. Every surface leads with status and action.

What exists?

Catalogue, Overview

What is governed?

Coverage, Overview

What is used?

Effectiveness, Monitor

What is effective?

Effectiveness, Cost

What is stale or duplicated?

Coverage, Monitor queue

What depends on what?

Dependencies, Model

What needs action?

Monitor exception queue: triage stale, ungoverned, and orphaned usage before it becomes drift.

Positioning

Three axes: govern, measure, improve.

One discipline expressed on three axes. Each axis is a working surface, not a slogan.

Govern

The editable system of record. Every asset has a definition, an owner, and a state you can change in place.

Add, edit, retire in the Catalogue

Measure

The cross-harness Monitor. One asset, watched across every harness it runs in, on a single pane.

HarnessSummaryStrip + per-harness usage panels

Improve

The effectiveness signal that says which assets earn their place, and which should be retired.

Missing telemetry is shown as a gap, not as a measured zero
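
A minimal sketch of the gap-versus-zero rule above. The `UsageCounts` shape and both function names are hypothetical, chosen for illustration rather than taken from the Vault's schema. An asset with no observed invocations yields `null`, which a UI can label as a gap, while an asset that was observed and always failed yields a real 0.

```typescript
// Hypothetical shape: counts observed for one asset in one harness.
interface UsageCounts {
  invocations: number;
  successes: number;
}

// Returns the success rate in [0, 1], or null when nothing was observed.
// null means "no telemetry" (a gap); 0 means "observed, and every call failed".
function successRate(counts: UsageCounts | undefined): number | null {
  if (!counts || counts.invocations === 0) return null;
  return counts.successes / counts.invocations;
}

// Render helper: a gap is labeled as a gap, never shown as 0%.
function formatRate(rate: number | null): string {
  return rate === null ? "no telemetry" : `${Math.round(rate * 100)}%`;
}
```

Keeping `null` distinct from `0` in the type means a caller cannot average a gap into a fleet-wide rate without handling it explicitly.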

North star surfaces

The Monitor and the Model canvas are the product.

Status before structure. Risk before decoration. Action before raw data. These two surfaces carry that priority order.

Monitor

Cross-harness operational monitor

The Monitor is mission control: a data-health strip, a triage-able exception queue, per-asset topology with named lenses, and usage panels for all four harnesses on one pane.

HarnessSummaryStrip compares Claude, Codex, Copilot, Grok
Exception queue surfaces stale, ungoverned, and orphaned usage
Each metric carries an honesty grade for its harness

Screenshot: Monitor on live Fabric (2026-06-19), showing the cross-harness health strip, exception queue, and asset topology graph.
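
The exception queue described above could be assembled roughly as follows. This is a sketch under stated assumptions: the record shapes, the 90-day staleness window, and the definitions used here (stale means the last observed use is older than the window, ungoverned means no owner, orphaned usage means usage pointing at an id the catalog does not know) are illustrative and not confirmed by the product.

```typescript
// Hypothetical inputs; field names are assumptions for illustration.
interface CatalogAsset {
  id: string;
  owner: string | null;      // null: nobody governs this asset
  lastUsedAt: string | null; // ISO timestamp of the last observed use, if any
}
interface UsageRow {
  assetId: string;
}

type ExceptionKind = "stale" | "ungoverned" | "orphaned-usage";
interface QueueItem {
  kind: ExceptionKind;
  assetId: string;
}

// Builds a triage queue. The 90-day staleness window is an assumed default.
function buildExceptionQueue(
  assets: CatalogAsset[],
  usage: UsageRow[],
  now: Date,
  staleAfterDays = 90,
): QueueItem[] {
  const queue: QueueItem[] = [];
  const known = new Set(assets.map((a) => a.id));
  const cutoff = now.getTime() - staleAfterDays * 24 * 60 * 60 * 1000;

  for (const asset of assets) {
    if (asset.owner === null) queue.push({ kind: "ungoverned", assetId: asset.id });
    // Never-observed assets are a telemetry gap, so they are not flagged as stale here.
    if (asset.lastUsedAt !== null && Date.parse(asset.lastUsedAt) < cutoff) {
      queue.push({ kind: "stale", assetId: asset.id });
    }
  }

  // Usage that points at an unknown asset id is reported once per id.
  const seen = new Set<string>();
  for (const row of usage) {
    if (!known.has(row.assetId) && !seen.has(row.assetId)) {
      seen.add(row.assetId);
      queue.push({ kind: "orphaned-usage", assetId: row.assetId });
    }
  }
  return queue;
}
```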

Model canvas

Schema view with operational signal

The /model canvas is not a static ER diagram. Table cards carry governed and effective rates, error rates, staleness badges, and live usage sparklines. Edges are weighted by link counts; a time scrubber moves signal through the utilization window.

Search flies the canvas to a matching table
Hot dependency paths read louder than cold ones
Read-only on canvas; writes live in the Catalogue

Screenshot: /model canvas on live Fabric (2026-06-19), showing Asset hub relationships, Utilization facts, and the health overlay.
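
One way to read "edges are weighted by link counts" is sketched below. It assumes a hypothetical `Link` record with one row per observed relationship between two tables; the canvas may derive its weights differently.

```typescript
// Hypothetical link record: one row per observed relationship between two tables.
interface Link {
  from: string;
  to: string;
}
interface WeightedEdge extends Link {
  weight: number; // number of links observed between the two tables
}

// Collapses individual links into one edge per (from, to) pair, weighted by count,
// and sorts so the hottest paths come first.
function weightEdges(links: Link[]): WeightedEdge[] {
  const edges = new Map<string, WeightedEdge>();
  for (const { from, to } of links) {
    const key = JSON.stringify([from, to]); // unambiguous composite key
    const edge = edges.get(key);
    if (edge) edge.weight += 1;
    else edges.set(key, { from, to, weight: 1 });
  }
  return Array.from(edges.values()).sort((a, b) => b.weight - a.weight);
}
```

A renderer could then map `weight` to stroke width or opacity so that hot dependency paths stand out from cold ones.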

Live Fabric proof

From harness logs to Fabric SQL to the ops UI.

The public demo runs on synthetic data at zero Fabric cost. A live Rayfin deploy on Microsoft Fabric proves the same semantic model, utilization facts, and Monitor surfaces on real infrastructure.

Fabric SQL

Telemetry lands in the Utilizations table behind effectiveness metrics.

Monitor

North-star ops surface: harness strip, exception queue, topology.

Overview

Executive quadrant and what-to-fix-next triage on the same model.

Power BI

Optional downstream analytics on the Rayfin semantic model, not a substitute for the Monitor.

How Library, Fabric, governance, and Power BI connect

What it is

A Fabric-native record and monitor for agentic assets.

Editable system of record and effectiveness monitor for agentic assets, with each asset carrying a live operational state.
Built on the Microsoft Rayfin SDK for Microsoft Fabric, with a React plus Vite dashboard.
Two runtime paths: a zero-cost demo on synthetic seed data, and a live Fabric deploy with Rayfin read and write on the same semantic model.

Skills, Subagents, Commands, MCP servers, Workflows, Prompts, Hooks
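
As a rough illustration of an asset record with a definition, an owner, and a changeable state, here is a plain TypeScript sketch. It is deliberately not the Rayfin SDK's decorator API, which is not shown on this page; every type name, field name, and state value below is an assumption.

```typescript
// The seven asset kinds listed above; the string literals are illustrative.
type AssetKind =
  | "skill" | "subagent" | "command" | "mcp-server"
  | "workflow" | "prompt" | "hook";

// Hypothetical lifecycle states; the page only says assets can be added, edited, and retired.
type AssetState = "active" | "retired";

interface Asset {
  id: string;
  kind: AssetKind;
  definition: string;
  owner: string;
  state: AssetState;
}

// Retiring returns an updated copy rather than deleting the record.
function retire(asset: Asset): Asset {
  return { ...asset, state: "retired" };
}
```

Modeling retirement as a state change rather than a delete is one possible design; it keeps the record available for anything that still refers to it.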

Platform

Microsoft Rayfin SDK for Fabric

TypeScript entity decorators define the data model; React plus Vite renders it.

Demo vs live

Zero-cost demo

The public static web app (SWA) runs on synthetic data with demo auth. Live Fabric hosting with SSO and MSSQL is deployed and exercised on the same nine-entity model.

Unit of analysis

The reusable asset

Not the run, not the span, not the application. The asset, wherever it runs.

North star principle

An operations monitor, not a passive catalogue.

The Vault is neither a passive catalogue nor a static schema viewer. The same priority order applies on every surface.

PRIORITY 01

Status before structure

PRIORITY 02

Risk before decoration

PRIORITY 03

Action before raw data

Every asset has a visible operational state. Live data drives the state; static metadata only explains what that state means.

Lifecycle loop

The structural advantage is owning the whole loop.

Distribution and effectiveness are usually separate products. The Vault closes the loop, and the closed loop is the moat.

The loop runs in four steps: distribute the same assets across harnesses, project the catalog from the-fabric-library, measure which assets earn their place, then feed that signal back into distribution. Owning the whole lifecycle, from install across harnesses to effectiveness in each, is the moat.

The Library

Distribution catalog. Installs the same agentic assets across harnesses and devices.

the-fabric-library

Upstream catalog projection source. Feeds fabric-library.json into the Vault catalog.

The Living Vault

System of record and effectiveness monitor. Measures which assets earn their place after install.

See the full ecosystem map: Library → Fabric → Power BI → governance

Honesty grade

Instrumentation is asymmetric, and shown as such.

Each harness carries an honesty grade so a reader always knows the fidelity behind a signal. The grades are full, partial, occupancy, derived, and none.

Honesty grade legend, from full instrumentation down to none.
Grade	What it means
full	Directly instrumented end to end. This is Claude Code today.
partial	Measured where the logs allow, with the gaps marked.
occupancy	Presence and activity are observable; full outcomes are not.
derived	Inferred from adjacent signals, and labeled as inference.
none	Not measurable today. Shown as a gap, never invented.

Claude Code is fully instrumented. The other harnesses are summarized at the fidelity their logs allow, each tagged with the grade that reflects its real coverage.

The product never fakes a number it cannot measure.

Grades are illustrative labels for fidelity, not scores or rankings.

Illustrative honesty grades by harness (not live data)
Harness	Typical grade	What you get
Claude Code	full	Per-asset invocations, successes, errors, latency from transcript metadata
Codex	partial	Session-level signals from rollout logs; gaps marked
Copilot	occupancy	Presence and activity observable; full outcomes not always available
Grok	partial	Session summaries at the fidelity local logs allow
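
The grade legend and the rule that the product never fakes a number can be expressed as types. This is a sketch, not the Vault's implementation: `GradedMetric` and `gradedMetric` are hypothetical names, and the per-harness map simply copies the illustrative table above.

```typescript
// The five grades from the legend, in legend order (full instrumentation down to none).
const GRADES = ["full", "partial", "occupancy", "derived", "none"] as const;
type HonestyGrade = (typeof GRADES)[number];

// Illustrative per-harness grades, copied from the table above (not live data).
const TYPICAL_GRADE: Record<string, HonestyGrade> = {
  "Claude Code": "full",
  Codex: "partial",
  Copilot: "occupancy",
  Grok: "partial",
};

// A metric always travels with its grade; value is null when it cannot be measured.
interface GradedMetric {
  name: string;
  value: number | null;
  grade: HonestyGrade;
}

// Guards the invariant "grade none carries no number":
// throws instead of letting an invented value reach the UI.
function gradedMetric(name: string, value: number | null, grade: HonestyGrade): GradedMetric {
  if (grade === "none" && value !== null) {
    throw new Error(`${name}: grade "none" cannot carry a value`);
  }
  return { name, value, grade };
}
```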

Surfaces

Eight primary surfaces, plus three more.

Each surface leads with status and action. Together they cover the asset from inventory to dependency graph.

Primary nav (8)

Overview

Fleet status at a glance

Effective rate, quadrant counts, stale assets, telemetry freshness

Monitor

Cross-harness live state

Data-health strip, exception queue, asset topology, harness usage panels

Catalogue

The editable record

Add, edit, retire, and filter assets

Effectiveness

Which assets earn it

Usage and success per asset; unobserved assets show no telemetry

Cost

Spend per asset

Estimated from token usage at list prices, not metered billing

Coverage

Where assets reach

Pillar, domain, capability, maturity, duplicate analysis

Dependencies

What relies on what

Requires and required-by graph, bundle composition

Model

The /model canvas

Table-grain schema with operational signal and weighted edges
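
For the Cost surface above, an estimate from token usage at list prices might be computed as sketched here. The `ListPrice` shape is an assumption, and the prices in the usage note are placeholders, not any vendor's real rates.

```typescript
// Hypothetical per-million-token list prices in USD; substitute the vendor's published rates.
interface ListPrice {
  inputPerMTok: number;
  outputPerMTok: number;
}
interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
}

// Estimated spend = tokens / 1e6 * list price, summed over input and output.
// This is an estimate from token counts, not a metered bill.
function estimateCostUsd(usage: TokenUsage, price: ListPrice): number {
  return (
    (usage.inputTokens / 1e6) * price.inputPerMTok +
    (usage.outputTokens / 1e6) * price.outputPerMTok
  );
}
```

With placeholder prices of 3 USD per million input tokens and 15 USD per million output tokens, 2,000,000 input tokens and 500,000 output tokens come to 6 + 7.5 = 13.5 USD.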

Deep routes and team workflow

Asset detail

A deep route

Full per-asset inspection outside primary nav

Settings

Configuration

Fabric connection, telemetry, and glossary

Submissions

Review queue

Contributors submit projections; maintainers accept or reject at /submissions

Privacy and telemetry

Metadata only. Your content stays local.

Collectors read structural metadata only: event types, tool and skill names, counts, durations, token totals, ids, and timestamps.
Never read: prompt text, model responses, command text, tool arguments, file contents, or auth tokens. Enforced in parser unit tests.
Real usage stays local: gitignored *.local.json snapshots read only by the dev server. Demo builds ship synthetic data only.
Four harness sources: Claude Code transcripts, Codex rollouts, Grok session logs, Copilot event streams.
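
A metadata-only parser can be built as an allowlist projection, sketched below. The field names are hypothetical stand-ins for the categories listed above; the real collectors' field lists live in the telemetry and privacy guide.

```typescript
// Structural fields a collector may keep. Names are hypothetical stand-ins for the
// categories above: event types, names, counts, durations, token totals, ids, timestamps.
const ALLOWED_FIELDS = [
  "eventType", "toolName", "skillName", "count",
  "durationMs", "totalTokens", "sessionId", "timestamp",
] as const;
type AllowedField = (typeof ALLOWED_FIELDS)[number];
type MetadataEvent = Partial<Record<AllowedField, string | number>>;

// Allowlist projection: copy only known structural fields with primitive values.
// Anything not on the list (prompt text, responses, tool arguments, file contents)
// is never copied into the output.
function toMetadata(raw: Record<string, unknown>): MetadataEvent {
  const out: MetadataEvent = {};
  for (const field of ALLOWED_FIELDS) {
    const value = raw[field];
    if (typeof value === "string" || typeof value === "number") out[field] = value;
  }
  return out;
}
```

An allowlist fails closed: a new field added to a harness log is excluded until someone deliberately adds it, which is easier to assert in a unit test than a blocklist.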

Local-first means harness logs are read on your machine, not inside Fabric. You choose when to push metadata-only snapshots into a live deploy. The product names that tension rather than hiding it.

Full field exclusion list: Telemetry and privacy guide

How telemetry works

Harness logs in. Metadata out. Your content never leaves.

The public demo runs on synthetic effectiveness data. Live Fabric can ingest real Claude Code usage today: curl the standalone collector from the deployed app, run it on your Mac, and upload the metadata-only JSON in Settings. No private repo checkout required.

Harness logs

Claude, Codex, Grok, Copilot write local session files

Collectors

Metadata-only parsers; no prompts or responses

Local export

./utilization.upload.json on your disk until you import

Settings upload

Reconciles rows to catalog asset ids in Fabric

Monitor

Effectiveness, Cost, harness panels with honesty grades

What the public demo shows

Synthetic utilization and harness-usage JSON baked into the build
Demo auth; zero Fabric cost; no access to your machine
Full UI loop: Monitor, Effectiveness, Cost, Model canvas

What live Fabric adds today

Standalone collector at /demo/living-vault-collect.mjs (curl, no repo checkout)
Running "node living-vault-collect.mjs telemetry" writes ./utilization.upload.json
Settings telemetry wizard uploads into the Utilization table
Unmatched assets are reported, not silently dropped or faked to zero
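
The reconcile step, where uploaded rows are matched to catalog asset ids and unmatched rows are reported, might look like this sketch. The row shapes and exact-name matching are assumptions; the actual upload wizard may match on other keys.

```typescript
// Hypothetical upload row and catalog entry; field names are assumptions.
interface UploadRow {
  assetName: string;
  harness: string;
  invocations: number;
}
interface CatalogEntry {
  id: string;
  name: string;
}
interface Reconciled {
  matched: Array<UploadRow & { assetId: string }>;
  unmatched: UploadRow[]; // reported to the user, never dropped or zero-filled
}

// Matches rows to catalog ids by exact name; everything else is returned as unmatched.
function reconcile(rows: UploadRow[], catalog: CatalogEntry[]): Reconciled {
  const idByName = new Map<string, string>();
  for (const entry of catalog) idByName.set(entry.name, entry.id);

  const result: Reconciled = { matched: [], unmatched: [] };
  for (const row of rows) {
    const assetId = idByName.get(row.assetName);
    if (assetId !== undefined) result.matched.push({ ...row, assetId });
    else result.unmatched.push(row);
  }
  return result;
}
```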

Try the demo now; read the full telemetry guide for per-harness sources and excluded fields. No repository access required.

Team governance

Submit for review before assets enter the catalog.

Contributors upload a scanned projection JSON. Maintainers triage in /submissions. Accepted items flow through the same importProjection path as admin sync. Pending submissions stay outside the live catalog until accepted.

STEP 01

Scan locally

A local CLI step produces a projection JSON from the contributor's repo. The browser app never scans disk.

STEP 02

Submit for review

Upload lands in a submission store separate from VaultDataset. Dedup verdicts: new, duplicate, or conflict.

STEP 03

Maintainer triage

Role-gated /submissions page: accept, request changes, or reject per item or in bulk.

STEP 04

Accept into catalog

Accepted rows run through importProjection and runDataSourceContract, the same gate as admin sync.
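
The three dedup verdicts from Step 02 could be assigned as sketched below. The comparison rule is an assumption made for illustration (same id and same content is a duplicate, same id with different content is a conflict), and `ProjectionItem` and `dedupVerdict` are hypothetical names.

```typescript
type Verdict = "new" | "duplicate" | "conflict";

// Hypothetical projection item: an id plus a fingerprint of its definition.
interface ProjectionItem {
  id: string;
  contentHash: string;
}

// new: id not in the catalog; duplicate: same id and same content;
// conflict: same id but different content, which needs a maintainer decision.
function dedupVerdict(item: ProjectionItem, catalog: ProjectionItem[]): Verdict {
  const existing = catalog.find((c) => c.id === item.id);
  if (!existing) return "new";
  return existing.contentHash === item.contentHash ? "duplicate" : "conflict";
}
```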

Review queue

Maintainers gate what enters the catalog

Contributors submit a scanned projection JSON. Pending items stay outside VaultDataset until a maintainer accepts. Accepted rows run through the same importProjection and contract gate as admin sync.

Screenshot: /submissions on live Fabric (2026-06-19), showing the review queue with a pending item and the accept flow.

Honest limitation

Local-first collection; push to Fabric when you choose.

The demo runs synthetic data at zero Fabric cost. Live Fabric already accepts real Claude Code usage: download the collector from the deployed app, run it locally, upload the metadata-only JSON in Settings. Fabric never reads your home directory. Org-wide automatic ingestion (every contributor, attributed rollups, RBAC) is still on the roadmap.

Privacy by design

Collectors read local harness logs. Real usage never ships inside a static demo build until you explicitly upload it.

Demo vs live Fabric

The SWA demo uses seed data at zero Fabric cost. Live Fabric read and write is deployed; per-contributor RBAC and org rollups remain gated work.

Telemetry ingestion

Harness logs stay on local disk. The deployed app serves a standalone collector and accepts Settings uploads into Fabric. Team-wide scheduled sync and per-contributor attribution remain future work.

Try the demo. No repo access required.

The interactive demo runs on synthetic data at zero Fabric cost. Live Fabric deploys accept real telemetry through the in-app collector and upload wizard.
