A standing product commitment. This document is public at dacard.ai/commitments/no-vanity-metrics.


The commitment

Every number Dacard displays to a user will be attached to a specific, measurable outcome it predicts, or it will not be displayed at all.

A number that cannot answer the question "so what does this let me do differently?" has no place on screen. A number that looks impressive but predicts nothing actively misleads. We treat both as product bugs.

This is a product boundary, not a roadmap item. It will not move.


What we will never build

We will not build, ship, enable via API, or allow as a customer-configured feature:

  • Hero counters of activity volume. No "signals processed," "events analyzed," "teams scored," "recommendations generated," or equivalent totals, displayed without context or action.
  • Composite scores as the primary surface in any view. The dScore exists as an anchor for conversation, not as a headline number. It never appears alone.
  • Leaderboards, benchmark rankings, or "top X" lists presented as intrinsically meaningful without the T+90d outcome data that would make them falsifiable.
  • Trending-up arrows, percentage-change badges, or velocity indicators for metrics where movement does not correspond to an outcome we can verify.
  • Achievement badges, streaks, or gamified counters tied to usage volume rather than to outcome improvement.
  • "Insights delivered" or "questions answered" counters. The unit of output is behavior change, not conversation volume.
  • Engagement metrics surfaced to customers in product analytics views. Time-in-app, sessions, queries, and click counts tell us nothing about whether the product worked.
  • Third-party-style growth charts (weekly active teams, monthly enrollment) shown to customers as evidence of Dacard's value. Our growth is our problem, not theirs.
  • Any number whose primary function is to make the product feel busier or more consequential than the outcome data supports.

If a customer asks for any of the above, the answer is no. The answer stays no under commercial pressure. The answer stays no if a larger competitor ships it first.


Why

Goodhart's law applies to the display layer too. The moment a team sees a vanity number climb, they start optimizing it, even unconsciously. The number stops measuring anything real and starts measuring the optimization effort. Every vanity surface we ship is a small calibration instrument we have just broken.

Vanity metrics launder uncertainty. A big confident number ("47,000 signals analyzed this week") obscures how few of those signals were actually predictive. This is the exact failure mode Cutler flags as "false confidence through flattening." We are, at root, an instrument vendor. Instrument vendors who inflate their readings get bought once and never again.

The display contract is the product contract. What we choose to render is what we are implicitly telling customers matters. A dashboard full of velocity counters tells them velocity matters. A dashboard anchored on per-dimension calibration tells them calibration matters. We choose the second, and we hold the line on the first.

The skeptics are right about this, across eras. Tufte on chartjunk. Cutler on RAG-status theatre. Cagan on output-over-outcome. Torres on happy-ears metrics. Every careful voice in our neighborhood has warned against numbers-for-the-sake-of-numbers. We would rather look quieter than our competitors and be load-bearing, than look busy and be hollow.

The data does not support most of what we would be tempted to display. Our own /science page publishes the dimensions where predictive power is weak. Surfacing those as confident headline numbers would contradict the commitment we made there. The two commitments reinforce each other.


What we build instead

  • Per-dimension profiles with confidence intervals. Every score carries its evidence count and interval. Users see both the estimate and how certain we are, in the same visual unit (a rough data-shape sketch follows this list).
  • Change that corresponds to a predicted outcome. When a metric moves, we show what outcome that movement is predicting (or that the movement is too small to predict anything). "This dimension moved +0.4 stages; our calibration data says this typically precedes a 12% reduction in cycle time over 90 days, n=47."
  • The anomaly feed. Raw operational reality, not aggregated theatre. Things that are unusual get surfaced; things that are normal do not.
  • Evidence ledgers. Instead of counting how many signals we processed, we show which specific signals contributed to a given score, how much each mattered, and how that weight was learned.
  • Negative results, visibly. When a dimension is not predicting anything, we say so in the primitive itself, not only on /science. "This dimension is currently uncalibrated (n=8, CI too wide to act on). Treat as directional only."
  • Outcome-anchored summaries. Any summary metric we do publish is defined in terms of the outcome it's meant to correspond to, with the correspondence itself falsifiable. No free-floating composites.
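
A minimal sketch of the reading shape described above, assuming a TypeScript surface. The type and field names (DimensionReading, evidenceCount, anchor, and so on) are illustrative placeholders, not Dacard's actual schema.

```typescript
// Illustrative only: hypothetical field names, not Dacard's real schema.
interface ConfidenceInterval {
  lower: number;
  upper: number;
  level: number; // e.g. 0.95
}

interface PredictedOutcome {
  description: string; // e.g. "12% reduction in cycle time over 90 days"
  horizonDays: number; // e.g. 90
  sampleSize: number;  // the n behind the calibration claim
}

// A reading either names the outcome it predicts or states plainly that it
// is uncalibrated; a bare estimate with no anchor is not representable.
type OutcomeAnchor =
  | PredictedOutcome
  | { status: "uncalibrated"; reason: string };

interface DimensionReading {
  dimension: string;
  estimate: number;
  interval: ConfidenceInterval;
  evidenceCount: number;
  anchor: OutcomeAnchor;
}

// Example reading, mirroring the sample sentence in the list above.
const example: DimensionReading = {
  dimension: "handoff-clarity",
  estimate: 3.4,
  interval: { lower: 2.9, upper: 3.9, level: 0.95 },
  evidenceCount: 47,
  anchor: {
    description: "12% reduction in cycle time over 90 days",
    horizonDays: 90,
    sampleSize: 47,
  },
};
```
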

How you can hold us to this

1. The primitive kit enforces it. Every primitive in the kit (ScoreCard, RankedList, ComparisonPanel, and the rest) has a required outcome or evidence prop. The primitive does not render without one. You cannot ship a Dacard number without naming what outcome it corresponds to. This is code-level enforcement, not policy; a type-level sketch follows this list.

2. Design review gates it. Any new user-facing surface must pass the "so what test" before ship: for every number on the surface, a reviewer must be able to complete the sentence "this number matters because it lets the user do X differently." If the sentence does not complete cleanly, the number comes off or gets reframed.

3. The commitment is public. This document lives at dacard.ai/commitments/no-vanity-metrics and in the repo. If we ever change it, we change it publicly, with reasoning, and we give customers 90 days to leave with their data.

4. We publish violations. Every quarter we review shipped surfaces and publish, in our transparency note, any number that slipped through and failed the so-what test. We name the surface, the number, and the fix. If we ever find ourselves with nothing to publish, that is a sign we have stopped looking hard enough, and we escalate the audit standard.
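
To make point 1 concrete, here is a minimal sketch of how the required-prop enforcement could look, assuming a TypeScript primitive kit. ScoreCard is named in the commitment; the prop names and types below are assumptions, not the kit's real API.

```typescript
// Hypothetical prop contract: the union forces every rendered number to
// carry either an outcome claim or an evidence reference.
type OutcomeProp = {
  outcome: { description: string; sampleSize: number };
};

type EvidenceProp = {
  evidence: { signalId: string; weight: number }[];
};

type ScoreCardProps = {
  label: string;
  value: number;
} & (OutcomeProp | EvidenceProp);

declare function ScoreCard(props: ScoreCardProps): unknown;

// Compiles: the number is anchored to a falsifiable outcome claim.
ScoreCard({
  label: "Handoff clarity",
  value: 3.4,
  outcome: { description: "12% cycle-time reduction over 90 days", sampleSize: 47 },
});

// Does not compile: a free-floating total with no outcome or evidence prop.
// ScoreCard({ label: "Signals processed", value: 47000 });
```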


What this commitment does not prevent

  • Diagnostic numbers for our own team. Internal dashboards for engineering, customer success, and research can use whatever numbers help us do our jobs. The commitment is about what we display to customers, not what we use to run the company.
  • Transparency metrics that make us worse-looking, not better-looking. If a number shows a limit (small sample size, wide confidence interval, failing calibration), surfacing it to users is explicitly fine. The commitment rules out numbers that flatter us, not numbers that constrain us.
  • Progress indicators for in-flight tasks. A sync status, a crawl-in-progress spinner, a "3 of 24 dimensions still computing" progress bar. These are honest UX, not vanity.
  • Cost and consumption metrics. Credits used, credits remaining, and cost-per-score are operational, and the customer needs them to manage spend.
  • Cohort benchmarks with context. Percentile bands against anonymized cohorts of similar teams, with sample sizes and limits disclosed, are allowed. These are not rankings (see no-ranking.md), and they are not vanity; they contextualize a reading.

The line is clear: we display numbers that help a user decide something, and we do not display numbers that help Dacard look busy.


Signed on behalf of Dacard, 2026-04-24. Revisions, if any, will be logged here with date, reason, and 90-day notice to customers.
