System of record + effectiveness monitor

Mission control for agentic assets across AI coding harnesses.

See what exists, what is governed, what runs, what earns its place, and what needs action, across every harness. The unit of analysis is the reusable agentic asset itself, not the run or the span.

Try the demo | How telemetry works | Public repo coming soon
Claude Code, Codex, Copilot, Grok | Demo at zero Fabric cost | Live on Microsoft Fabric
The problem / why now

Nothing measures the asset, only the application around it.

Tracing and observability tools each watch a single application. The reusable asset that moves between harnesses falls through the gap. That gap is the wedge.

Single-app blind spot

No cross-harness view

No trace tool answers how one skill performs in Claude Code vs Codex vs Copilot vs Grok, because each one watches a single application.

Wrong unit of analysis

Runs and spans, not assets

LLM observability tools like LangSmith and Langfuse instrument the runs and spans of an app you built. The Vault's unit of analysis is the reusable agentic asset itself.

Distribution-only rivals

Catalogs without effectiveness

The competitive set is nascent skill catalogs that do distribution only, with nothing on effectiveness. That whitespace is the wedge.

Trace observability vs The Living Vault
Dimension | Trace observability | The Living Vault
Unit of analysis | Run, span, trace of one application | Reusable skill, subagent, MCP server, prompt
Scope | Single app you built and instrumented | Cross-harness: Claude Code, Codex, Copilot, Grok
Registry | No editable system of record for assets | Governed catalog you can add, edit, and retire
Instrumentation honesty | Assumes full trace coverage | Per-metric fidelity grades; gaps shown, never faked
Operational promise

Seven questions, answered at a glance.

The Vault is an operations monitor, not a passive catalogue. Every surface leads with status and action.

01

What exists?

Catalogue, Overview

02

What is governed?

Coverage, Overview

03

What is used?

Effectiveness, Monitor

04

What is effective?

Effectiveness, Cost

05

What is stale or duplicated?

Coverage, Monitor queue

06

What depends on what?

Dependencies, Model

07

What needs action?

Monitor exception queue: triage stale, ungoverned, and orphaned usage before it becomes drift.
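
The triage rule behind the exception queue can be sketched in a few lines. This is a minimal illustration, not the Vault's implementation: the field names, the 30-day staleness threshold, and the rule that a missing owner means "ungoverned" are all assumptions made for the example.

```typescript
// Hypothetical shapes; every field name here is an assumption for illustration.
type ExceptionKind = "stale" | "ungoverned" | "orphaned";

interface CatalogAsset {
  id: string;
  owner?: string;      // assumed: an asset without an owner counts as ungoverned
  lastUsedAt?: string; // ISO timestamp of the latest observed usage, if any
}

interface UsageRow {
  assetId: string;     // asset id reported by a harness log
}

interface QueueItem {
  kind: ExceptionKind;
  assetId: string;
}

const STALE_AFTER_DAYS = 30; // assumed threshold, not taken from the product

function buildExceptionQueue(
  assets: CatalogAsset[],
  usage: UsageRow[],
  now: Date,
): QueueItem[] {
  const queue: QueueItem[] = [];
  const known = new Set(assets.map((a) => a.id));
  const cutoff = now.getTime() - STALE_AFTER_DAYS * 24 * 60 * 60 * 1000;

  for (const asset of assets) {
    // Stale: usage was observed, but not since the cutoff.
    // An asset with no telemetry at all is treated as a gap, not as stale.
    if (asset.lastUsedAt !== undefined && Date.parse(asset.lastUsedAt) < cutoff) {
      queue.push({ kind: "stale", assetId: asset.id });
    }
    // Ungoverned: present in the catalog without an owner.
    if (!asset.owner) {
      queue.push({ kind: "ungoverned", assetId: asset.id });
    }
  }

  // Orphaned usage: harness activity that matches no catalog asset (deduplicated).
  const orphanIds = usage.map((u) => u.assetId).filter((id) => !known.has(id));
  Array.from(new Set(orphanIds)).forEach((assetId) => {
    queue.push({ kind: "orphaned", assetId });
  });

  return queue;
}
```

Passing `now` explicitly keeps the function deterministic, so the same snapshot always produces the same queue.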

Positioning

Three axes: govern, measure, improve.

One discipline expressed on three axes. Each axis is a working surface, not a slogan.

01

Govern

The editable system of record. Every asset has a definition, an owner, and a state you can change in place.

Add, edit, retire in the Catalogue

02

Measure

The cross-harness Monitor. One asset, watched across every harness it runs in, on a single pane.

HarnessSummaryStrip + per-harness usage panels

03

Improve

The effectiveness signal that says which assets earn their place, and which should be retired.

No telemetry is shown as a gap, not as a measured zero

North star surfaces

The Monitor and the Model canvas are the product.

Status before structure. Risk before decoration. Action before raw data. These two surfaces carry that priority order.

Monitor

Cross-harness operational monitor

The Monitor is mission control: a data-health strip, a triage-able exception queue, per-asset topology with named lenses, and usage panels for all four harnesses on one pane.

  • HarnessSummaryStrip compares Claude, Codex, Copilot, Grok
  • Exception queue surfaces stale, ungoverned, and orphaned usage
  • Each metric carries an honesty grade for its harness
[Figure: Monitor on live Fabric (2026-06-19), showing the cross-harness health strip, exception queue, and asset topology graph.]
Model canvas

Schema view with operational signal

The /model canvas is not a static ER diagram. Table cards carry governed and effective rates, error rates, staleness badges, and live usage sparklines. Edges are weighted by link counts; a time scrubber moves signal through the utilization window.

  • Search flies the canvas to a matching table
  • Hot dependency paths read louder than cold ones
  • Read-only on canvas; writes live in the Catalogue
[Figure: /model canvas on live Fabric (2026-06-19), showing Asset hub relationships, Utilization facts, and the health overlay.]
Live Fabric proof

From harness logs to Fabric SQL to the ops UI.

The public demo runs on synthetic data at zero Fabric cost. A live Rayfin deploy on Microsoft Fabric proves the same semantic model, utilization facts, and Monitor surfaces on real infrastructure.

Fabric SQL
[Figure: Fabric SQL preview of the Utilizations table with invocation and token metadata.]

Telemetry lands in the Utilizations table that backs the effectiveness metrics.

Monitor
[Figure: Cross-harness Monitor with health strip and topology graph.]

North-star ops surface: harness strip, exception queue, topology.

Overview
[Figure: Overview page with Governed versus Effective quadrant and triage list.]

Executive quadrant and what-to-fix-next triage on the same model.

Power BI
[Figure: Power BI overview report bound to the Fabric semantic model.]

Optional downstream analytics on the Rayfin semantic model, not a substitute for the Monitor.

How Library, Fabric, governance, and Power BI connect

What it is

A Fabric-native record and monitor for agentic assets.

  • Editable system of record and effectiveness monitor for agentic assets, with each asset carrying a live operational state.
  • Built on the Microsoft Rayfin SDK for Microsoft Fabric, with a React plus Vite dashboard.
  • Two runtime paths: a zero-cost demo on synthetic seed data, and a live Fabric deploy with Rayfin read and write on the same semantic model.
Skills, Subagents, Commands, MCP servers, Workflows, Prompts, Hooks
Platform
Microsoft Rayfin SDK for Fabric
TypeScript entity decorators define the data model; React plus Vite renders it.
Demo vs live
Zero-cost demo
The public SWA runs on synthetic data with demo auth. Live Fabric hosting with SSO and MSSQL is deployed and exercised on the same nine-entity model.
Unit of analysis
The reusable asset
Not the run, not the span, not the application. The asset, wherever it runs.
North star principle

An operations monitor, not a passive catalogue.

The Vault is neither a passive catalogue nor a static schema viewer. The priority order is the same on every surface.

PRIORITY 01
Status before structure
PRIORITY 02
Risk before decoration
PRIORITY 03
Action before raw data

Every asset has a visible operational state. Live data drives the state; static metadata only explains what that state means.

Lifecycle loop

The structural advantage is owning the whole loop.

Distribution and effectiveness are usually separate products. The Vault closes the loop, and the closed loop is the moat.

[Figure: lifecycle loop. Distribute the same assets across harnesses, project the catalog from the-fabric-library, measure which assets earn their place, then loop back to distribute.]
Owning the whole lifecycle, from install across harnesses to effectiveness in each, is the moat.
The Library

Distribution catalog. Installs the same agentic assets across harnesses and devices.

the-fabric-library

Upstream catalog projection source. Feeds fabric-library.json into the Vault catalog.

The Living Vault

System of record and effectiveness monitor. Measures which assets earn their place after install.

See the full ecosystem map: Library → Fabric → Power BI → governance

Honesty grade

Instrumentation is asymmetric, and shown as such.

Each harness carries an honesty grade so a reader always knows the fidelity behind a signal. The grades are full, partial, occupancy, derived, and none.

Honesty grade legend, from full instrumentation down to none.
Grade | What it means
full | Directly instrumented end to end. This is Claude Code today.
partial | Measured where the logs allow, with the gaps marked.
occupancy | Presence and activity are observable; full outcomes are not.
derived | Inferred from adjacent signals, and labeled as inference.
none | Not measurable today. Shown as a gap, never invented.

Claude Code is fully instrumented. The other harnesses are summarized at the fidelity their logs allow, each tagged with the grade that reflects its real coverage.

The product never fakes a number it cannot measure.

Grades are illustrative labels for fidelity, not scores or rankings.
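
One way to keep "a gap, never a zero" honest in code is to make the grade part of the metric's type. The sketch below is illustrative only: the five grade names come from the legend above, while the `GradedMetric` shape, the `formatMetric` helper, and the "~" prefix for derived values are assumptions, not the Vault's API.

```typescript
// The five grades are from the legend; everything else here is hypothetical.
type HonestyGrade = "full" | "partial" | "occupancy" | "derived" | "none";

interface GradedMetric {
  harness: string;
  grade: HonestyGrade;
  value: number | null; // null means "not measured", which is different from 0
}

function formatMetric(m: GradedMetric): string {
  // A missing measurement is rendered as a gap, never coerced to zero.
  if (m.grade === "none" || m.value === null) {
    return `${m.harness}: no telemetry (${m.grade})`;
  }
  // Derived values are marked as inference (assumed "~" convention).
  const prefix = m.grade === "derived" ? "~" : "";
  return `${m.harness}: ${prefix}${m.value} (${m.grade})`;
}
```

Under these assumptions, a measured zero from a fully instrumented harness renders as "0 (full)", while an unmeasured metric renders as "no telemetry", so the two cases cannot be confused on screen.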

Illustrative honesty grades by harness (not live data)
Harness | Typical grade | What you get
Claude Code | full | Per-asset invocations, successes, errors, latency from transcript metadata
Codex | partial | Session-level signals from rollout logs; gaps marked
Copilot | occupancy | Presence and activity observable; full outcomes not always available
Grok | partial | Session summaries at the fidelity local logs allow
Surfaces

Eight primary surfaces, plus two deep routes and a review queue.

Each surface leads with status and action. Together they cover the asset from inventory to dependency graph.

Primary nav (8)

Overview
Fleet status at a glance
Effective rate, quadrant counts, stale assets, telemetry freshness
Monitor
Cross-harness live state
Data-health strip, exception queue, asset topology, harness usage panels
Catalogue
The editable record
Add, edit, retire, and filter assets
Effectiveness
Which assets earn it
Usage and success per asset; unobserved assets show no telemetry
Cost
Spend per asset
Estimated from token usage at list prices, not metered billing
Coverage
Where assets reach
Pillar, domain, capability, maturity, duplicate analysis
Dependencies
What relies on what
Requires and required-by graph, bundle composition
Model
The /model canvas
Table-grain schema with operational signal and weighted edges
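
The Cost surface above estimates spend from token usage at list prices. A minimal sketch of that arithmetic follows, assuming usage is split into input and output tokens and priced per million tokens. The type names and the prices used in any call are placeholders, not real rates and not the Vault's schema.

```typescript
// Hypothetical shapes; prices are USD per million tokens and are placeholders.
interface ListPrice {
  inputPerMillion: number;
  outputPerMillion: number;
}

interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
}

const TOKENS_PER_MILLION = 1000000;

// Estimated cost = tokens / 1M * list price, summed over input and output.
// This is an estimate from list prices, not a metered bill.
function estimateCostUsd(usage: TokenUsage, price: ListPrice): number {
  const inputCost = (usage.inputTokens / TOKENS_PER_MILLION) * price.inputPerMillion;
  const outputCost = (usage.outputTokens / TOKENS_PER_MILLION) * price.outputPerMillion;
  return inputCost + outputCost;
}
```

For example, with placeholder prices of 3 and 10 USD per million input and output tokens, one million input tokens plus half a million output tokens comes to an estimated 8 USD.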

Deep routes and team workflow

Asset detail
A deep route
Full per-asset inspection outside primary nav
Settings
Configuration
Fabric connection, telemetry, and glossary
Submissions
Review queue
Contributors submit projections; maintainers accept or reject at /submissions
Privacy and telemetry

Metadata only. Your content stays local.

  • Collectors read structural metadata only: event types, tool and skill names, counts, durations, token totals, ids, and timestamps.
  • Never read: prompt text, model responses, command text, tool arguments, file contents, or auth tokens. Enforced in parser unit tests.
  • Real usage stays local: gitignored *.local.json snapshots read only by the dev server. Demo builds ship synthetic data only.
  • Four harness sources: Claude Code transcripts, Codex rollouts, Grok session logs, Copilot event streams.

Local-first means harness logs are read on your machine, not inside Fabric. You choose when to push metadata-only snapshots into a live deploy. The product names that tension rather than hiding it.
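
A metadata-only collector is easiest to reason about as an allowlist: anything not explicitly named is dropped. The sketch below illustrates that idea under stated assumptions. The field names are invented for the example (the source lists categories of metadata, not an exact schema), and `toMetadataOnly` is a hypothetical helper, not the shipped parser.

```typescript
// Assumed field names covering the categories listed above:
// event types, tool and skill names, counts, durations, token totals, ids, timestamps.
const ALLOWED_FIELDS = [
  "eventType",
  "toolName",
  "skillName",
  "count",
  "durationMs",
  "tokenTotal",
  "id",
  "timestamp",
] as const;

type AllowedField = (typeof ALLOWED_FIELDS)[number];
type MetadataValue = string | number;
type MetadataEvent = Partial<Record<AllowedField, MetadataValue>>;

// Copy only allowlisted scalar fields. Prompt text, responses, tool arguments,
// and anything else not on the list never reach the output object.
function toMetadataOnly(raw: Record<string, unknown>): MetadataEvent {
  const out: MetadataEvent = {};
  for (const field of ALLOWED_FIELDS) {
    const value = raw[field];
    if (typeof value === "string" || typeof value === "number") {
      out[field] = value;
    }
  }
  return out;
}
```

An allowlist fails closed: a new field added by a harness is excluded until someone deliberately adds it, which is the property a unit test can pin down.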

Full field exclusion list: Telemetry and privacy guide

How telemetry works

Harness logs in. Metadata out. Your content never leaves.

The public demo runs on synthetic effectiveness data. Live Fabric can ingest real Claude Code usage today: curl the standalone collector from the deployed app, run it on your Mac, and upload the metadata-only JSON in Settings. No private repo checkout required.

Harness logs

Claude, Codex, Grok, Copilot write local session files

Collectors

Metadata-only parsers; no prompts or responses

Local export

./utilization.upload.json on your disk until you import

Settings upload

Reconciles rows to catalog asset ids in Fabric

Monitor

Effectiveness, Cost, harness panels with honesty grades

What the public demo shows

  • Synthetic utilization and harness-usage JSON baked into the build
  • Demo auth; zero Fabric cost; no access to your machine
  • Full UI loop: Monitor, Effectiveness, Cost, Model canvas

What live Fabric adds today

  • Standalone collector at /demo/living-vault-collect.mjs (curl, no repo checkout)
  • The command "node living-vault-collect.mjs telemetry" writes ./utilization.upload.json
  • Settings telemetry wizard uploads into the Utilization table
  • Unmatched assets are reported, not silently dropped or faked to zero
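
The reconcile step in the list above can be sketched as a function that returns unmatched rows alongside matched ones instead of discarding them. This is an assumption-laden illustration: matching by asset name, the `UploadRow` fields, and the `reconcile` helper are hypothetical, not the wizard's actual logic.

```typescript
// Hypothetical row shape for an uploaded utilization file.
interface UploadRow {
  assetName: string;
  harness: string;
  invocations: number;
}

interface MatchedRow extends UploadRow {
  assetId: string;
}

interface ReconcileResult {
  matched: MatchedRow[];
  unmatched: UploadRow[]; // reported to the user, not dropped or zeroed
}

// `catalog` maps an asset name to its catalog asset id (assumed lookup key).
function reconcile(rows: UploadRow[], catalog: Map<string, string>): ReconcileResult {
  const result: ReconcileResult = { matched: [], unmatched: [] };
  for (const row of rows) {
    const assetId = catalog.get(row.assetName);
    if (assetId === undefined) {
      result.unmatched.push(row);
    } else {
      result.matched.push({ ...row, assetId });
    }
  }
  return result;
}
```

Because every input row lands in exactly one of the two lists, the counts always add up to the upload size, which makes silent loss easy to detect.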

Try the demo now; read the full telemetry guide for per-harness sources and excluded fields. No repository access required.

Team governance

Submit for review before assets enter the catalog.

Contributors upload a scanned projection JSON. Maintainers triage in /submissions. Accepted items flow through the same importProjection path as admin sync. Pending submissions stay outside the live catalog until accepted.

STEP 01

Scan locally

A local CLI step produces a projection JSON from the contributor's repo. The browser app never scans disk.

STEP 02

Submit for review

Upload lands in a submission store separate from VaultDataset. Dedup verdicts: new, duplicate, or conflict.

STEP 03

Maintainer triage

Role-gated /submissions page: accept, request changes, or reject per item or in bulk.

STEP 04

Accept into catalog

Accepted rows run through importProjection and runDataSourceContract, the same gate as admin sync.
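
The three dedup verdicts from Step 02 can be illustrated with a small classifier. The verdict names come from the text; how the Vault actually decides them is not stated, so comparing by id plus a content hash is an assumption, as are the `ProjectionItem` shape and the `dedupVerdict` helper.

```typescript
// Verdict names are from the text; the decision rule below is an assumption.
type DedupVerdict = "new" | "duplicate" | "conflict";

interface ProjectionItem {
  id: string;
  contentHash: string; // hypothetical fingerprint of the asset definition
}

// `catalog` maps an existing asset id to its current content hash.
function dedupVerdict(item: ProjectionItem, catalog: Map<string, string>): DedupVerdict {
  const existingHash = catalog.get(item.id);
  if (existingHash === undefined) {
    return "new"; // id not seen before
  }
  // Same id and same content is a duplicate; same id with different content conflicts.
  return existingHash === item.contentHash ? "duplicate" : "conflict";
}
```

A verdict like this only informs the maintainer's triage; acceptance still goes through the same import gate described in Step 04.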

Review queue

Maintainers gate what enters the catalog

Contributors submit a scanned projection JSON. Pending items stay outside VaultDataset until a maintainer accepts. Accepted rows run through the same importProjection and contract gate as admin sync.

[Figure: /submissions on live Fabric (2026-06-19), showing the review queue with a pending item and the accept flow.]
Honest limitation

Local-first collection; push to Fabric when you choose.

The demo runs synthetic data at zero Fabric cost. Live Fabric already accepts real Claude Code usage: download the collector from the deployed app, run it locally, upload the metadata-only JSON in Settings. Fabric never reads your home directory. Org-wide automatic ingestion (every contributor, attributed rollups, RBAC) is still on the roadmap.

Privacy by design

Collectors read local harness logs. Real usage never ships inside a static demo build, and it reaches Fabric only when you explicitly upload it.

Demo vs live Fabric

The SWA demo uses seed data at zero Fabric cost. Live Fabric read and write is deployed; per-contributor RBAC and org rollups remain gated work.

Telemetry ingestion

Harness logs stay on local disk. The deployed app serves a standalone collector and accepts Settings uploads into Fabric. Team-wide scheduled sync and per-contributor attribution remain future work.

Try the demo. No repo access required.

The interactive demo runs on synthetic data at zero Fabric cost. Live Fabric deploys accept real telemetry through the in-app collector and upload wizard.