Technical Appendix & Methodology | Investor Materials

01 How to read this document

This appendix is a reference manual, not a pitch. It defines the vocabulary, the three frameworks and 88 dimensions, the composite stage ladder, the named pattern library, the substrate the product fuses, the signal taxonomy, the scoring pipeline, the observability contract, the eval framework, the pattern-discovery engine, the AI economics, and the tech stack. It is intended for technical due diligence and for engineers, product leaders, and operators who want to validate the methodology behind the diagnostic.

Voice is declarative and neutral throughout. Where the memo argues, this document specifies. Where the memo claims, this document cites file paths and counts. Anything that is opinion has been pushed to the memo and the strategy documents. Anything that is fact lives here.

02 Vocabulary reference

Terms used across the product, the codebase, and the investor materials. The right column is the canonical definition. Where a term has been deprecated, the deprecation is noted in the relevant section, not here.

Term	Definition
Diagnostic	The full Dacard scoring run across all three frameworks for a single product. Combines URL crawl, adapter signals, and tribal-knowledge inputs into per-dimension scores, composite stage, and ranked actions.
Score	A 0 to 100 number on a single dimension, framework, or composite. Calibrated against archetype-conditioned benchmarks. Lower is earlier maturity, higher is more compounding.
Dimension	One of the 88 atomic measurement units. Each dimension belongs to exactly one framework. Each has its own rubric, signal mapping, and stage ladder.
Framework	One of three measurement systems: Team Operations, Development Lifecycle, Product Assessment. Each framework groups its dimensions into functions or phases and rolls them into a per-framework stage.
Pattern	A named cross-framework tension. Detected when specific dimensions across two or more frameworks fire in a defined configuration. Three are live: Translation Gap, Fragility Signal, Compound Ready. Patterns are the surface vocabulary; dimensions are the substrate.
Archetype	A team shape used to condition scoring weights and ranked-action priorities. Reflects company stage, function mix, and operating model. Archetype is detected automatically and adjustable.
Stage (composite)	The cross-framework five-stage verb ladder. React, Augment, Orchestrate, Lead, Compound. Used in board summaries and dashboard headers to collapse all three frameworks into one read.
Stage (per-framework)	The framework-specific maturity ladder, calibrated to the function the framework measures. Each framework has its own ladder and its own canonical names.
Substrate	The data layer beneath the diagnostic. Three substrates fuse: operational adapters (54 providers), public web (scraped competitor and peer signals), tribal knowledge (Teach DAC capture, Slack channel ingestion).
Integration	An adapter that pulls signals from a third-party tool. 54 providers in 12 categories. Each implements the ProviderAdapter interface and normalizes to the shared signal taxonomy.
Signal	A typed observation extracted from an integration, the public web, or a tribal-knowledge capture. Approximately 178 predefined signal types in the codebase. Signals are the atomic input; dimensions are the atomic output.
Ranked action	A prioritized recommendation with archetype-conditioned LNO classification. Tells the customer what to fix first given the pattern firing. Pushed into Linear, Slack, and the agent fleets the team already runs.
Calibration	The process of testing dimension predictive validity against captured outcome data. Recalibrates archetype weights, retires invalidated dimensions, surfaces candidate patterns.
Observability trace	A row in the llm_traces table capturing one production LLM call. Links to recommendation feedback and score outcomes through trace_id. The single source of truth for prompt drift, judge drift, and outcome attribution.
Eval	An automated quality gate. The Dacard eval framework runs 16 categories and 75+ checks against scoring, coaching, and primitive output. Categories 15 and 16 gate coaching quality.
Judge	An LLM-as-judge call that scores coaching output against rubrics. Cached results live in evals/golden/judge-cache.json. Refreshed when the DAC system prompt changes.
Golden fixture	A canonical scoring scenario in packages/core/src/eval/golden-fixtures.ts. Each new primitive type requires at least one fixture exercising it.

03 The three frameworks

Three frameworks, 88 dimensions total. Each framework has its own dimension count, scope, stage ladder, and canonical sample dimensions. The three are designed to compose, not stack. Team Operations measures how the team works. Development Lifecycle measures how the team builds. Product Assessment measures what the team ships. The pattern library reads across the three to surface tensions that no single framework can see on its own.

Team Operations · TO

Team Operations (24 dimensions)

Measures cross-functional team effectiveness across six functions: product, engineering, design, data, operations, marketing.

Stage ladder: Foundation / Building / Scaling / Leading / Compounding. Calibrated to organizational maturity, not product maturity.

Sample dimensions: feedback loop quality, decision quality, role clarity, cross-function rituals, prioritization rigor, retro discipline, escalation cadence, OKR hygiene, planning horizon discipline, knowledge capture, onboarding ritual completeness.

Development Lifecycle · DL

Development Lifecycle (34 dimensions)

Measures the AI-native build cycle from intent through ship and back to learning.

Stage ladder: Specify / Context / Orchestrate / Validate / Ship / Compound. Six phases, each scored independently and rolled into a per-framework stage.

Sample dimensions: spec rigor, context-loading discipline, agent orchestration patterns, validation gate coverage, regression suite health, ship cadence, post-ship learning capture, eval cadence, prompt-drift monitoring, observability trace coverage, golden-fixture freshness.

Product Assessment · PA

Product Assessment (27 dimensions)

Measures the product itself: how AI-native the surface is, how the data substrate compounds, how the workflow integrates.

Stage ladder: Wrapper / Augmented / Integrated / Native / Compounding. Note: "Augmented" here is a per-framework stage name and does not conflict with the deprecated composite "Augmented" (which referred to the old composite ladder). Context disambiguates.

Sample dimensions: AI surface depth, data substrate compounding, workflow embeddedness, distribution mechanic, agent invocability, outcome instrumentation, retention loop coverage, expansion vector clarity, monetization fit.

Framework labels in tight UI contexts (rail rows, triad labels, table headers) collapse to Team / Operation / Product. Full names appear in prose. The TO / DL / PA abbreviations are reserved for diagrams, table cells, and parentheticals.

04 Composite stage ladder

The composite ladder collapses all three frameworks into one cross-framework summary. Used in board summaries, dashboard headers, /board-story output, and anywhere a single stage answers the question "where is this team overall." Verb forms only. Adjective forms (Reactive, Augmented, Operating, Leading, Compounding) are deprecated as composite labels.

Stage

Name

Color

Score band

React

red

0 to 40

Augment

gold

40 to 55

Orchestrate

gold

55 to 70

Lead

green

70 to 85

Compound

green-bright

85 to 100

The two-tier vocabulary is deliberate. Composite stages compress; per-framework stages preserve resolution. A team can be Lead overall while sitting at Specify in Development Lifecycle. The board summary uses the verb ladder. The framework drill-down uses the per-framework ladder. The same scoring data drives both.

05 The three-layer architecture

The product is one diagnostic on the surface, three layers of moat underneath. The architecture is documented end-to-end here so technical reviewers can verify the composition. Each layer has a defined defensibility argument; the layers compound, and replicating any one in isolation does not replicate the company.

Surface

Pattern Library

Pattern detection logic operates on the cross-framework score vector. Each pattern has a typed precondition (which dimensions fire and in which configuration) and a typed payload (the named pattern, the explanation, the suggested next moves). Patterns are not heuristics in prose; they are typed predicates over the 88 dimensions plus archetype context. Three patterns ship today: Translation Gap, Fragility Signal, Compound Ready. Library expansion is gated by the validation pipeline, not by editorial judgment.

Workflow

Ranked Actions

LNO classification (Leverage / Neutral / Overhead) plus archetype conditioning. Ranking is generated per-customer per-score, not assembled from a static catalog. Ranked actions are pushed to Linear (issue creation with full coaching context), Slack (digest pulses on cadence), and the agent fleets the customer already runs (via MCP server, agent skill, REST API). Every ranking decision is captured against the customer's outcome telemetry. Acceptance, rejection, and modification all feed the calibration engine.

Engine

Calibration Pipeline

Outcome data capture from connected stack signals (Linear, GitHub, Stripe, PostHog), predictive-validity testing on each dimension against captured outcomes, pattern-discovery analytics on co-occurrence across the 88 dimensions, archetype-weight recalibration on a quarterly cadence, dimension retirement when predictive validity collapses. The engine is the only part of the product that is not visible to customers; it is the part that compounds.

Adjacent platforms can ship the middle (a scoring dashboard). The left edge (outcome-calibrated pattern library) requires a customer base and outcome telemetry. The right edge (agent-workflow embeddedness) requires the integration footprint and the distribution mechanics. The composition is the moat.

06 Pattern definitions

Three patterns ship today. Each is a typed predicate over the cross-framework score vector. Each has a defined definition, a defined cross-framework signal, and a defined coaching response. The library is designed to expand by validated discovery, not editorial publication. Target: 8+ validated patterns by Q4 2026, continuous expansion thereafter.

Pattern 01

Translation Gap

Definition. High efficiency, low impact. The team is shipping fast but not shipping right.

Cross-framework signal. High Team Operations score on velocity, ritual, and cycle dimensions paired with low Product Assessment score on outcome-coupled dimensions (workflow embeddedness, retention loop coverage, expansion vector clarity).

Coaching response. Push outcome instrumentation into the Development Lifecycle. Move ranked actions toward outcome capture and feedback-loop wiring rather than process optimization. The team does not need more cadence; it needs more signal.

Pattern 02

Fragility Signal

Definition. Cycle health drift. Team velocity is dropping or rework rate is climbing while operating cadence holds steady.

Cross-framework signal. Development Lifecycle deterioration (Validate, Ship phase scores trending down across consecutive scoring runs) coupled with Team Operations communication breakdown (feedback loop quality, escalation cadence, retro discipline drifting from prior baseline).

Coaching response. Surface the drift early. Push ranked actions toward debugging the build cycle (prompt drift, eval cadence, regression suite health) and toward repairing communication channels before the operating cadence catches up to the underlying drift.

Pattern 03

Compound Ready

Definition. The team is poised to multiply. High Team Operations, Development Lifecycle, and Product Assessment scores across the board, with at least one prior pattern detection retiring.

Cross-framework signal. All three frameworks above 70. Composite stage at Lead or higher. At least one previously firing pattern (Translation Gap, Fragility Signal, or any pattern added to the library after) now retired by the most recent score.

Coaching response. Shift ranked actions from defense to offense. Push toward distribution mechanics, agent invocability, and substrate compounding. The team is not in repair mode; it is in multiplication mode.

07 Substrate thesis

Three substrates fuse beneath the diagnostic. The fusion is the moat. Single-source competitors (a Linear-only dashboard, a GitHub-only analytics tool, a manual maturity quiz) cannot reproduce a fused substrate by adding more of their own substrate; the fusion requires sources that are categorically different from each other.

Substrate 01

Operational adapters

54 integration providers across 12 categories. Each implements the ProviderAdapter interface (packages/core/src/integrations/). Each normalizes to the shared signal taxonomy. Live in production today: GitHub, Linear. OAuth registered: Slack, Jira, PostHog, Figma, Attio, Sentry, Vercel. Roadmap targets the long tail. The scoring engine is provider-agnostic; it operates on signal types, not provider identities.

Substrate 02

Public web

URL-anchored crawls of competitor moves, public roadmaps, peer framework publications, hiring trends, and community signal. Fed through public_web_extract trace capture, with full prompt text stored for replay. The same product surface that scores a team also scores its public footprint, which feeds Product Assessment dimensions directly.

Substrate 03

Tribal knowledge

Teach DAC capture (in-product mechanism for tribal notes, decisions-not-in-tickets, team history). Slack channel ingestion (in flight, three-tier hybrid: MCP v1, opt-in ingestion, enterprise full Events API). The substrate that no public-API-only competitor can reach.

Why the fusion is non-replicable: a competitor with operational adapters alone misses the public-web context that explains what the market is doing around the team. A competitor with public-web crawl alone has no per-team signal. A competitor with tribal-knowledge capture alone has no operational signal. The three together, scored against the same dimensional vocabulary, is the company.

08 Signal taxonomy

The signal taxonomy is the contract between the substrate and the scoring engine. Every adapter, every public-web extractor, and every tribal-knowledge capture normalizes to a typed signal in the shared taxonomy. The scoring engine never sees provider-specific shapes; it sees signals.

SIGNAL_TYPES. Approximately 178 predefined signal types. Source of truth: packages/shared/src/integrations/integrations-signals.ts. Each type has a stable string identifier, a category label, and a free-text description. New types are added by extending the union; never inline.
SIGNAL_DIMENSION_MAP. Approximately 347 entries mapping signal_type to one or more of the 27 Team Operations dimensions. Single source of truth. Core re-exports from shared. Never define a new mapping anywhere else.
Orphan signals. Eight signal types are not mapped to any dimension. Flagged with inline ORPHAN: comments in the taxonomy file and exported as ORPHAN_SIGNAL_TYPES. Surfaced in the user-facing /signals catalog so customers can see what is captured but not yet scored.
Categories. The 178 types are grouped into 11 categories surfaced in the /signals catalog: cycle metrics, feedback loops, ritual hygiene, knowledge capture, planning discipline, escalation patterns, observability, build-cycle health, ship cadence, distribution mechanics, retention loops.
Cross-source lineage. cross-source-signals.ts exports feedback_to_feature_lineage, a pure-logic compute that traces a feedback signal to its corresponding feature signal (conservative substring-keyword heuristic). The lineage layer is what allows Translation Gap to fire on the cross-framework signal rather than within a single framework.

Provider-agnostic scoring is what makes the engine portable. Adding a new provider does not require touching the scorer; it requires implementing ProviderAdapter and wiring outputs into existing signal types. New signal types require extending the shared union and the dimension map; the scorer adapts automatically.

09 Scoring methodology

The full diagnostic runs as a six-step pipeline. End-to-end target latency is under 60 seconds. End-to-end COGS per score is approximately $0.17 (inference plus adapter compute plus storage). The pipeline is deterministic in its structure; LLM calls are the only non-deterministic step, and every LLM call is captured in the observability trace.

URL crawl plus adapter enrichment fetch. The product URL is crawled. Connected adapters fetch their signal payloads. Public-web context is layered over the crawl output. All three substrates produce a unified payload for the scorer.

Signal extraction. Adapter payloads, crawl text, and tribal-knowledge captures are normalized to typed signals in the shared taxonomy. The scorer never sees raw provider shapes.

Dimension scoring per framework. Each of the 88 dimensions is scored independently. Model: claude-sonnet-4-5. Each call uses captureTrace() with callType: "score". Prompts are tight and rubric-anchored; full prompt text is stored for replay.

Composite stage computation. Per-framework stages roll up to the composite verb ladder (React, Augment, Orchestrate, Lead, Compound). Composite scoring uses fixed bands (0 to 40, 40 to 55, 55 to 70, 70 to 85, 85 to 100) modulated by archetype.

Pattern detection. Pattern predicates run against the cross-framework score vector. Translation Gap, Fragility Signal, Compound Ready, plus any newly validated patterns. Each firing pattern emits its named identifier, its supporting dimension list, and its archetype-conditioned coaching payload.

Coaching generation. Ranked actions are generated per-customer per-score with LNO classification and archetype conditioning. callType: "dac_structured" for primitive emission, callType: "dac_chat" for the streaming coach. Both capture traces. Both flow trace_id through to recommendation_feedback for outcome attribution.

Latency target: under 60 seconds end-to-end. COGS per score: approximately $0.17. Pro tier blended COGS per month: approximately $32.50 on a $299 list price. Blended gross margin target: 78 to 82 percent at current cost structure; 92 percent or better post-fine-tune at the 2,000-customer inflection.

10 Observability contract

The llm_traces table is the single source of truth for every production LLM call. The contract is documented in .claude/rules/observability.md and enforced at PR review. Every LLM call site captures a trace; no exceptions.

Capture helper

captureTrace() from @dacard/core/telemetry. Never throws. Safe to call from any code path. Adds zero perceived latency (fire-and-forget). Required fields: userId, productId, callType, model, provider, inputTokens, outputTokens, latencyMs, promptText, responseJson. Optional: accountId.

Call type vocabulary

callType	Use for
`dac_chat`	Streaming DAC coach (/api/chat SSE path)
`dac_structured`	Structured output (/api/chat with primitive blocks)
`score`	Full-framework scoring (scorer/llm.ts)
`pa_assessment`	Product Assessment LLM
`lifecycle`	Lifecycle assessment LLM
`judge`	LLM-as-judge calls (so we track judge drift)
`public_web_extract`	Public-web crawl signal extraction
`slack_classify`	Slack message tribal-knowledge classification
`milestone`	Milestone artifact generators (day-1 read, week-1, day-30/60/90)
`other`	Fallback. Flagged in review if it shows up in production.

Outcome chain

When a customer takes action on a coaching response (thumbs up/down, accept, reject, regenerate), traceId flows through to saveRecommendationFeedback. The results.trace_id column ties score rows to the LLM call that produced them. computeTraceOutcome() in packages/core/src/intelligence/trace-outcome.ts joins trace, feedback, and score-delta. Dashboard aggregates use the helper, never direct SQL.

Privacy posture

Full prompt text is stored for replay. llm_traces is scoped by product_id and account_id; user data follows the same tenant boundary as scores. Never log Clerk tokens, API keys, or secret material in promptText or requestJson. Suspect fields are redacted before captureTrace.

What breaks if the contract is skipped: cannot replay failed or low-quality calls against improved prompts; cannot judge coaching quality (nothing to feed the judge); cannot attribute outcomes (was the accepted recommendation actually correlated with score movement?); cannot debug prompt drift when model versions change. Observability is the foundation of the eval loop.

11 Eval framework

The eval framework runs at scripts/eval.ts. 16 categories, 75+ checks. Categories 15 and 16 specifically gate coaching output quality. The full quality gate runs before every ship: typecheck, vitest, eval, build, lint. All five must pass.

Category 15: Golden Fixture Set. Structural validation against canonical scoring scenarios in packages/core/src/eval/golden-fixtures.ts. Every new primitive type requires at least one fixture exercising it. Catches regressions in scoring shape, primitive emission, and ranked-action structure.
Category 16: LLM-as-judge. Cached judge results live in evals/golden/judge-cache.json. The cache is refreshed when the DAC system prompt changes (npx tsx scripts/refresh-judge-cache.ts). Refresh cost: approximately $1.20 per full set; per-sample cost approximately $0.02. Nightly cron caps at 50 judgments (approximately $1 per day max).

Quality gate (run before every ship)

npx pnpm -r --parallel typecheck
npx vitest run
npx tsx scripts/eval.ts
npx pnpm --filter @dacard/web build
npx pnpm --filter @dacard/web lint

Rules of engagement. Never adjust Category 16 thresholds to mask a regression. If judge scores drop, fix the prompt or accept the truth. If a new primitive ships without a golden fixture, the build fails. If the system prompt changes without a judge-cache refresh, the build fails.

12 Pattern-discovery engine

Plan documented in plans/pattern-discovery-instrumentation.md. The engine analyzes co-occurrence across the 88 dimensions, archetype context, and captured outcome data. New patterns are validated empirically; they are not assembled by editorial committee.

Pipeline

Co-occurrence analysis. For each pair (and triple) of dimensions across frameworks, compute joint distribution of scores conditioned on archetype.
Outcome correlation. Test whether a co-occurrence predicts a captured outcome (NRR delta, ranked-action acceptance rate, reported team performance).
Power thresholds. Candidate co-occurrences must clear a defined statistical-power bar before they enter the candidate queue. Threshold scales with customer-base size; current threshold is conservative.
Naming review. Validated candidates enter naming review. Naming uses the same vocabulary discipline as the existing pattern library: trademarkable thought-territory, peer-voice, framework-anchored.
Publication cadence. Quarterly pattern-validity reports. Q3 2026 internal report. Q4 2026 first public report. Continuous public publication thereafter.

The engine is the empirical moat. The customer base produces the calibration data. The calibration data trains the engine. The engine surfaces patterns. The patterns expand the surface buyer-job count. Each loop tightens the others. This is the part of the architecture that competitors cannot replicate by reading the architecture document.

13 AI economics

Pricing tiers are documented in the memo. This section is the cost side. Internal accounting uses a credit system that is never exposed to customers. The credit system exists for capacity planning, COGS forecasting, and per-customer profitability tracking.

Operation	Credits	Notes
Score (full diagnostic across 88 dimensions)	10	URL crawl plus adapter enrichment plus per-dimension scoring plus pattern detection plus coaching generation.
Chat (DAC coaching turn)	1	One streaming response, traced.
API (MCP tool call)	0.1	Programmatic invocation, lighter prompts, cached where deterministic.

Cost guardrails

Prompt caching. Signal bundles are cached aggressively across scoring runs for the same product. Repeated dimension prompts with stable signal payload hit cache.
Batch API. Overnight calibration runs and judge cache refreshes use the Anthropic Batch API where latency is not user-facing.
Sampling. Call types expected to exceed 100k per day are reviewed for sampling at capture time. Today, none meet that threshold.

Margin curve

Window	Blended gross margin	Mechanism
Year 1	78 to 82 percent	Inference in COGS, no fine-tune yet, integrations cost expanding linearly.
Year 2 (post fine-tune)	92 percent or better	Inflection at approximately 2,000 customers. Fine-tune payback in 3 to 6 months.
Year 3 (envisioned)	94 percent or better	Sub-linear calibration cost on snapshot data, prompt caching maturity, Batch API saturation on overnight runs.

Pricing reference (full bridge in the memo and the model): Free, Pro $299, Business $1,200, Enterprise $2,500+. The cost structure documented here is what makes those tiers operable at the stated margin curve.

14 Tech stack reference

Production stack as of May 2026. Status column reflects production reality, not roadmap. "In queue" means scoped and prioritized but not yet shipped. "When admitted" means dependent on partner program eligibility.

Layer	Tech	Status
App framework	Next.js 14 (App Router)	Live
Auth	Clerk	Live
Database	Turso (libSQL)	Live
Inference	`claude-sonnet-4-5` primary, fallback tested	Live
Billing	Stripe	Live
Errors	Sentry	Live
Deploy	Vercel	Live
MCP server	Distribution surface	Live
Agent skill	Distribution surface	Live
REST API	Programmatic access	Live
Claude Code plugin	Distribution surface	In queue
Apps on Claude.ai	Distribution surface	When admitted

Package boundaries (never violated): shared imports nothing (leaf package). core imports only @dacard/shared. web imports @dacard/core, @dacard/shared, and @/lib. Client components never import from @dacard/core (causes webpack native binary bundling).

15 Glossary of acronyms

DAC: The Dacard coach. Peer surface in the product. Refers to the coaching system as a whole, not a specific endpoint.
ICP: Ideal Customer Profile. VP Product / Head of Product / CPO at post-Series A B2B SaaS, 50 to 200 employees, Linear user.
PLG: Product-Led Growth. Self-serve acquisition motion seeded by product usage and graduating to seat-based expansion.
NRR: Net Revenue Retention. Cohort revenue this period divided by cohort revenue last period including expansion, contraction, and churn.
CAC: Customer Acquisition Cost. Fully loaded sales and marketing spend divided by new customers acquired in the same period.
LTV: Lifetime Value. Expected total revenue from a customer over the relationship.
MRR: Monthly Recurring Revenue.
ARR: Annual Recurring Revenue.
COGS: Cost of Goods Sold. Inference, adapter compute, storage, and other variable costs of serving a customer.
TO: Team Operations framework. 24 dimensions. Cross-functional team effectiveness.
DL: Development Lifecycle framework. 34 dimensions. AI-native build cycle.
PA: Product Assessment framework. 27 dimensions. The product itself.
SAM: Serviceable Addressable Market. The slice of TAM Dacard can credibly reach with current GTM motions.
TAM: Total Addressable Market. The full economic surface for the category.
SOM: Serviceable Obtainable Market. Near-term beachhead capture.
MCP: Model Context Protocol. Anthropic's open protocol for tool invocation. Dacard ships an MCP server.
OAuth: Open Authorization. Industry-standard delegated access protocol used by every adapter integration.
OTel: OpenTelemetry. Vendor-neutral observability standard. Dacard's llm_traces table mirrors OTel conventions where applicable.
JTBD: Jobs-to-Be-Done. Outcome-anchored framing for buyer motivation; central to the ICP definition.
LNO: Leverage / Neutral / Overhead. Action-classification scheme used by ranked actions. Leverage actions multiply other work; overhead actions consume capacity without compounding.

Questions on methodology? Reach out to darren@dacard.ai. The investment memo lives at investor-memo.html; the technical architecture deep-dive lives at investor-tech.html.