The Development Lifecycle Framework

34 tasks across 6 stages. The lifecycle for building AI-native products with AI-native teams. Each stage names a discipline, each task names a concrete practice, each subtask is testable.

Locked Q2 2027

✦Six stages of the lifecycle

Stage 01

Specify & Constrain

The spec IS the implementation instruction

Structured specs, harness constraints, and measurable acceptance criteria.

Stage 02

Build the System of Context

Your context is your moat

Curated knowledge, multi-model routing, and architectural constraints as living context.

Stage 03

Orchestrate & Generate

Type less. Think more.

Parallel agent delegation, mission control patterns, scope boundaries, and token budgets.

Stage 04

Validate, Eval & Craft

Truth metrics over vanity metrics

Eval pipelines before generation pipelines. Counter-metrics. Craft review.

Stage 05

Ship & Manage Economics

Token budgets alongside sprint budgets

Cost-per-action tracking, model routing, pricing alignment, and version pinning.

Stage 06

Learn & Compound

Every cycle makes the next one faster

Retros for AI workflow, emergence rate measurement, and cognitive debt tracking.

✦Cross-cutting concerns

Three concerns run across all six stages. They are not stages themselves, but if you ignore them, every stage degrades.

Token Economics

Inference cost impacts every stage. Architecture decisions have cost implications. Sprint planning needs token budgets. Production needs cost monitoring. Pricing needs margin visibility.

Impacts: Architecture · Sprint Planning · Production · Pricing

Role Fluidity

PMs write specs that are prompts. Engineers orchestrate agents instead of writing code. Designers validate craft quality. Traditional role boundaries blur. Context and judgment matter more than titles.

Impacts: Hiring · Team Structure · Performance Reviews · Career Ladders

Cognitive Debt

The hidden cost of poorly managed AI interactions: time spent re-prompting, assembling context manually, and reviewing low-quality output. It may exceed technical debt in impact by 2027.

Impacts: Developer Productivity · Context Quality · Team Velocity · Burnout

✦Stage 01 of 06

Specify & Constrain

The spec IS the implementation instruction

Write structured specs with explicit acceptance criteria, preconditions, and examples. Define harness constraints - what agents can and cannot touch, and the patterns they must follow.

Evidence

Martin Fowler and OpenAI both confirm: harness engineering keeps agents productive. The constraint layer is where humans add the most value.

✦All 6 tasks at the Specify & Constrain stage

01 · Specify & Constrain

Create structured spec template

Design a spec format that doubles as an agent prompt: preconditions, acceptance criteria, input/output examples, anti-examples. Machine-readable, not narrative prose.
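A minimal sketch of such a template in Python, assuming a plain dataclass is enough to carry the spec. The `Spec` and `Example` classes, their field names, and the `to_prompt` layout are illustrative assumptions, not a prescribed format.

```python
from dataclasses import dataclass, field


@dataclass
class Example:
    """One input/output pair the agent can pattern-match against."""
    input: str
    output: str


@dataclass
class Spec:
    """Hypothetical structured spec; every field name here is illustrative."""
    title: str
    preconditions: list[str] = field(default_factory=list)
    acceptance_criteria: list[str] = field(default_factory=list)
    examples: list[Example] = field(default_factory=list)
    anti_examples: list[str] = field(default_factory=list)

    def to_prompt(self) -> str:
        """Render the spec as a sectioned plain-text agent prompt."""
        lines = [f"TASK: {self.title}"]
        sections = [
            ("PRECONDITIONS", self.preconditions),
            ("ACCEPTANCE CRITERIA", self.acceptance_criteria),
            ("EXAMPLES", [f"{e.input} -> {e.output}" for e in self.examples]),
            ("DO NOT", self.anti_examples),
        ]
        for heading, items in sections:
            if items:  # skip empty sections so the prompt stays compact
                lines.append(f"{heading}:")
                lines.extend(f"- {item}" for item in items)
        return "\n".join(lines)


spec = Spec(
    title="Add rate limiting to the login endpoint",
    preconditions=["Auth service is reachable"],
    acceptance_criteria=["p95 latency < 200ms", "6th attempt within 60s returns HTTP 429"],
    examples=[Example("5 attempts in 60s", "all processed")],
    anti_examples=["Do not store counters in process memory"],
)
prompt = spec.to_prompt()
```

Because the spec is data, the same object can feed the agent prompt, the review checklist, and the eval harness.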

02 · Specify & Constrain

Define harness constraints document

Document what agents can/cannot modify, architectural patterns they must follow, file boundaries, and safety rails. The harness matters more than the prompt.
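One way to make file boundaries enforceable is a small path check that runs before an agent's change is accepted. This is a sketch under assumptions: the glob patterns, the `ALLOWED` and `FORBIDDEN` lists, and the `may_modify` name are all hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical harness rules: globs an agent may edit, and globs it must never touch.
ALLOWED = ["src/billing/*", "tests/billing/*"]
FORBIDDEN = ["src/billing/migrations/*", "*.env"]


def may_modify(path: str) -> bool:
    """Forbidden patterns win over allowed ones; anything unlisted is denied."""
    if any(fnmatchcase(path, pattern) for pattern in FORBIDDEN):
        return False
    return any(fnmatchcase(path, pattern) for pattern in ALLOWED)
```

Deny-by-default is the design choice here: a path the harness does not mention is treated as off-limits.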

03 · Specify & Constrain

Set measurable acceptance criteria per feature

Every spec needs quantifiable pass/fail criteria. 'Improved performance' is not a criterion. '< 200ms p95 latency' is. Vague specs produce vague outputs.
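A criterion like '< 200ms p95 latency' can be checked mechanically. The sketch below assumes latency samples in milliseconds and uses the nearest-rank percentile definition; the function names are illustrative.

```python
def p95(samples_ms: list[float]) -> float:
    """Nearest-rank 95th percentile: smallest sample with at least 95% of samples at or below it."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n), 1-based
    return ordered[rank - 1]


def meets_latency_criterion(samples_ms: list[float], limit_ms: float = 200.0) -> bool:
    """Pass/fail check for a '< limit_ms p95 latency' acceptance criterion."""
    return p95(samples_ms) < limit_ms
```

Other percentile definitions (for example, interpolated ones) give slightly different values on small samples, so the spec should name the one it means.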

04 · Specify & Constrain

Write anti-examples for critical paths

Define what NOT to do for each major feature area. Anti-examples prevent agents from taking common wrong paths and reduce iteration cycles.

05 · Specify & Constrain

Version specs alongside code

Store specs in version control next to the code they describe. Specs are living artifacts that evolve with the product, not static documents in a wiki.

06 · Specify & Constrain

Establish spec review process

Specs get reviewed before generation begins. A 30-minute spec review can prevent 3 days of agent rework. Include the PM, the engineering lead, and a domain expert.

✦Stage 02 of 06

Build the System of Context

Your context is your moat

Context engineering replaces architecture docs. Curate what agents know, select models per task, and define architectural constraints as living documentation.

Evidence

ICONIQ research: 49% of AI companies differentiate through application-layer innovation, only 14% through proprietary models. Context is the leverage point.

✦All 5 tasks at the Build the System of Context stage

07 · Build the System of Context

Define context hierarchy

Establish three tiers: project-level context (architecture, conventions), feature-level context (domain rules, dependencies), and task-level context (specific requirements, examples).
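A sketch of how the three tiers might combine, assuming each tier is a flat key/value mapping and that the more specific tier should win on a clash. The `assemble_context` name and the sample keys are hypothetical.

```python
def assemble_context(
    project: dict[str, str],
    feature: dict[str, str],
    task: dict[str, str],
) -> dict[str, str]:
    """Merge the tiers; later (more specific) tiers override earlier (broader) ones."""
    return {**project, **feature, **task}


project = {"language": "Python 3.12", "style": "black"}
feature = {"domain": "billing", "style": "black, line length 100"}
task = {"requirement": "Add proration to invoices"}
context = assemble_context(project, feature, task)
```

Here the feature-level `style` entry replaces the project-level one, while untouched project keys pass through.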

08 · Build the System of Context

Build context indexing pipeline

Set up embeddings and vector storage for your codebase, docs, and domain knowledge. Agents should retrieve relevant context automatically, not rely on manual copy-paste.

09 · Build the System of Context

Implement multi-model routing rules

Not every task needs the most expensive model. Define routing: fast models for boilerplate, frontier models for architecture decisions, specialized models for domain tasks.

10 · Build the System of Context

Document architectural constraints as context

Replace static architecture docs with living constraints that agents consume directly. Include: tech stack decisions, naming conventions, file structure rules, API patterns.

11 · Build the System of Context

Establish context pruning schedule

Context degrades over time. Set a cadence (weekly or per-sprint) to review, update, and prune stale context. Outdated context is worse than no context.
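A pruning cadence is easier to keep when staleness is computed rather than remembered. The sketch below assumes each context entry records the date it was last reviewed; the 14-day default and the entry names are placeholders.

```python
from datetime import date, timedelta


def stale_entries(
    last_reviewed: dict[str, date],
    today: date,
    max_age_days: int = 14,
) -> list[str]:
    """Return the names of context entries not reviewed within max_age_days, sorted by name."""
    cutoff = today - timedelta(days=max_age_days)
    return sorted(name for name, reviewed in last_reviewed.items() if reviewed < cutoff)


entries = {
    "api-conventions": date(2027, 4, 10),
    "old-orm-guide": date(2027, 1, 5),
}
to_review = stale_entries(entries, today=date(2027, 4, 15))
```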

✦Stage 03 of 06

Orchestrate & Generate

Type less. Think more.

Orchestrate agents so output is coherent and architecturally sound. The developer's job shifts from writing code to directing agents while maintaining architectural judgment.

Evidence

Cursor CEO Michael Truell warns against 'shaky foundations' - structure matters more, not less, when agents generate the code.

✦All 5 tasks at the Orchestrate & Generate stage

12 · Orchestrate & Generate

Set up parallel agent delegation workflow

Define how to decompose features into parallel agent tasks. Each task should be independently executable with clear boundaries. Resolve merge conflicts as a human-in-the-loop step.

13 · Orchestrate & Generate

Establish mission control pattern

One human orchestrator manages multiple agent threads. Track what each agent is working on, what's blocked, and what needs human architectural decisions.

14 · Orchestrate & Generate

Define scope boundaries per agent task

Each agent task gets explicit file/module boundaries. Overlapping scope between agents creates merge hell. Scope isolation is non-negotiable.
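Scope isolation can be checked before any agent starts. This sketch assumes each agent task declares the set of files it may touch; the task names and the `overlapping_scopes` function are illustrative.

```python
from itertools import combinations


def overlapping_scopes(scopes: dict[str, set[str]]) -> list[tuple[str, str, set[str]]]:
    """Return (task_a, task_b, shared_files) for every pair of tasks whose scopes intersect."""
    conflicts = []
    for (a, files_a), (b, files_b) in combinations(sorted(scopes.items()), 2):
        shared = files_a & files_b
        if shared:
            conflicts.append((a, b, shared))
    return conflicts


scopes = {
    "agent-checkout": {"src/cart.py", "src/checkout.py"},
    "agent-pricing": {"src/pricing.py", "src/cart.py"},
    "agent-emails": {"src/mailer.py"},
}
conflicts = overlapping_scopes(scopes)
```

An empty result is the precondition for launching the tasks in parallel; any conflict means the decomposition needs another pass.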

15 · Orchestrate & Generate

Set token budgets per task type

Establish token spend limits by task category. Bug fix: X tokens. New feature: Y tokens. Refactor: Z tokens. Track actual vs. budget to calibrate over time.
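A sketch of the actual-vs-budget comparison. The budget figures below are made-up placeholders standing in for X, Y, and Z; calibrate them from your own history.

```python
# Placeholder budgets per task category (tokens). These numbers are invented for illustration.
BUDGETS = {"bug_fix": 50_000, "new_feature": 400_000, "refactor": 150_000}


def budget_report(task_type: str, tokens_used: int) -> dict:
    """Compare actual token spend for one task against its category budget."""
    budget = BUDGETS[task_type]
    return {
        "task_type": task_type,
        "budget": budget,
        "used": tokens_used,
        "ratio": tokens_used / budget,  # > 1.0 means over budget
        "over_budget": tokens_used > budget,
    }


report = budget_report("bug_fix", 60_000)
```

Logging the ratio per task, rather than only the overage, shows which categories are consistently under- or over-budgeted.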

16 · Orchestrate & Generate

Reserve architectural decisions for humans

Agents generate code. Humans make architectural calls: when to abstract, when to duplicate, when to take on tech debt intentionally. Document these decisions.

✦Stage 04 of 06

Validate, Eval & Craft

Truth metrics over vanity metrics

AI-generated code has 1.7x more major issues and 2.74x more security vulnerabilities than human-written code. Validation is where you earn quality. Distinguish functional correctness from craft quality.

Evidence

CodeRabbit analysis of 1M+ PRs: AI code has 1.7x more major issues, 2.74x more security vulnerabilities. Validation isn't optional - it's the bottleneck.

✦All 6 tasks at the Validate, Eval & Craft stage

17 · Validate, Eval & Craft

Build eval pipeline before generation pipeline

You cannot improve what you cannot measure. Stand up automated eval (correctness, security, performance, style) before scaling agent-generated output.

18 · Validate, Eval & Craft

Define truth metrics per feature area

Truth metrics measure real outcomes, not activity. Shipping velocity means nothing if regression rate climbs. Pair every speed metric with a quality counter-metric.

19 · Validate, Eval & Craft

Implement the Intercom counter-metric pattern

For every metric you optimize, track its counter-metric. Ship faster → track regression rate. Reduce cost → track quality score. This prevents optimizing into a corner.
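The pairing can be made explicit in code so that a win on one metric is never reported without its counter-metric. This is a sketch; the `MetricPair` fields and the sign convention (positive change means the counter-metric got worse) are assumptions.

```python
from dataclasses import dataclass


@dataclass
class MetricPair:
    """One optimized metric and the counter-metric that guards it (illustrative names)."""
    metric: str
    counter_metric: str
    metric_improved: bool       # did the metric you optimize move the right way?
    counter_change_pct: float   # positive = the counter-metric got worse


def flag_bad_trades(pairs: list[MetricPair], tolerance_pct: float = 0.0) -> list[str]:
    """Return warnings for pairs where the metric improved but the counter-metric degraded."""
    return [
        f"{p.metric} improved, but {p.counter_metric} worsened by {p.counter_change_pct:.1f}%"
        for p in pairs
        if p.metric_improved and p.counter_change_pct > tolerance_pct
    ]


pairs = [
    MetricPair("ship speed", "regression rate", True, 12.0),
    MetricPair("inference cost", "quality score", True, 0.0),
]
warnings = flag_bad_trades(pairs)
```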

20 · Validate, Eval & Craft

Separate functional review from craft review

Functional: does it work correctly and securely? Craft: is it elegant, maintainable, consistent with product quality bar? Both matter, but they're different review passes.

21 · Validate, Eval & Craft

Set up security scanning for AI-generated code

AI code has 2.74x more security vulnerabilities. Run SAST/DAST on every agent-generated PR. Flag patterns like hardcoded secrets, SQL injection, XSS, and auth bypasses.

22 · Validate, Eval & Craft

Establish design review for craft differentiation

AI can match roughly 80% of your quality bar. The last 20%, the craft that differentiates your product, requires human design review. Schedule these reviews as a deliberate practice.

✦Stage 05 of 06

Ship & Manage Economics

Token budgets alongside sprint budgets

This stage did not exist in the traditional SDLC. Inference costs can jump from $200/month in development to $10,000/month in production. Economics are a first-class engineering concern.

Evidence

Kyle Poyar documented 1,800+ pricing changes among top 500 SaaS/AI companies in 2025. Credit-based models jumped 126% YoY. Pricing is product strategy.

✦All 6 tasks at the Ship & Manage Economics stage

23 · Ship & Manage Economics

Implement cost-per-action tracking

Measure what each AI feature costs per user action, not just aggregate token spend. A $0.03 action at 10K DAU is $300/day even if each user triggers it only once a day. Make cost visible to engineering and product.
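The arithmetic is simple enough to keep in a helper that product and engineering can both read. The function name and the `actions_per_user_per_day` parameter are assumptions added for illustration.

```python
def daily_cost(
    cost_per_action: float,
    daily_active_users: int,
    actions_per_user_per_day: float = 1.0,
) -> float:
    """Daily inference cost in dollars for one AI feature."""
    return cost_per_action * daily_active_users * actions_per_user_per_day


once_a_day = daily_cost(0.03, 10_000)          # about $300/day
five_a_day = daily_cost(0.03, 10_000, 5.0)     # about $1,500/day
```

The per-user action count is the multiplier that aggregate token dashboards tend to hide.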

24 · Ship & Manage Economics

Build inference cost dashboard

Real-time visibility into token spend by feature, model, user tier, and action type. Shared across engineering, product, and finance. No surprises in the cloud bill.

25 · Ship & Manage Economics

Implement tiered model routing for production

Route requests to the cheapest model that meets quality requirements. For example: GPT-4 for complex reasoning, GPT-3.5 for classification, local models for formatting. This can save 60-80% on inference.
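"Cheapest model that meets quality requirements" translates directly into a filter followed by a minimum. In this sketch the tier names, prices, and quality scores are invented placeholders; real scores would come from your own evals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    cost_per_1k_tokens: float  # dollars, placeholder values
    quality: float             # 0..1 score from your own evals, placeholder values


TIERS = [
    ModelTier("local-small", 0.0002, 0.70),
    ModelTier("mid-tier", 0.002, 0.85),
    ModelTier("frontier", 0.03, 0.95),
]


def route(required_quality: float, tiers: list[ModelTier] = TIERS) -> ModelTier:
    """Pick the cheapest tier whose eval quality meets the requirement."""
    eligible = [t for t in tiers if t.quality >= required_quality]
    if not eligible:
        raise ValueError(f"no tier reaches quality {required_quality}")
    return min(eligible, key=lambda t: t.cost_per_1k_tokens)
```

For example, `route(0.8)` selects `mid-tier` here, and a requirement no tier can meet raises instead of silently falling back.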

26 · Ship & Manage Economics

Align inference costs with pricing tiers

Your AI features have marginal cost. Price accordingly: credit-based, usage-based, or tier-gated. Don't eat inference cost on your lowest tier.

27 · Ship & Manage Economics

Pin model versions in production

Model updates can silently change behavior. Pin specific model versions, test before upgrading, and maintain rollback capability. Treat model changes like dependency updates.

28 · Ship & Manage Economics

Set per-customer token budgets

Heavy users can blow through your margin. Set per-customer or per-tier token budgets with graceful degradation when limits are reached. Rate-limit expensive operations.
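Graceful degradation can be as small as a three-state decision per request. The state names and the 80% soft limit below are assumptions; what "degraded" means (a cheaper model, shorter outputs, batching) is a product decision.

```python
def service_level(tokens_used: int, budget: int, soft_limit: float = 0.8) -> str:
    """Map a customer's token usage for the period to a service level."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    usage = tokens_used / budget
    if usage < soft_limit:
        return "full"          # normal routing
    if usage < 1.0:
        return "degraded"      # e.g. route to a cheaper model
    return "rate_limited"      # budget exhausted: throttle expensive operations
```

The soft limit gives the customer a visible warning zone before anything is throttled.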

✦Stage 06 of 06

Learn & Compound

Every cycle makes the next one faster

The flywheel stage. Feed outcomes back into context, harness constraints, and delegation patterns. Teams that compound learn faster than teams that just ship faster.

Evidence

Dan Shipper at Every: 15 people, 5+ products, 7-figure revenue, 100% AI-written code - via compounding engineering, not heroic effort.

✦All 6 tasks at the Learn & Compound stage

29 · Learn & Compound

Implement post-cycle retrospective for AI workflow

After each sprint/cycle, ask: Which specs produced the best agent output? Where did context fail? Which harness constraints need updating? This is your compounding mechanism.

30 · Learn & Compound

Measure Emergence Rate

Track cycle velocity improvements over time. If each cycle isn't getting faster or higher-quality, your compounding loop is broken. This is the single most important AI-native metric.
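The text does not give a formula for Emergence Rate, so the sketch below assumes one plausible definition: the fractional reduction in cycle time from one cycle to the next. The function name and the choice of cycle time as the input are assumptions; a quality measure could be tracked the same way.

```python
def emergence_rates(cycle_times_days: list[float]) -> list[float]:
    """Fractional cycle-time improvement between consecutive cycles.

    Positive = the cycle got faster; zero or negative = the compounding loop stalled.
    Assumes every cycle time is greater than zero.
    """
    return [
        (previous - current) / previous
        for previous, current in zip(cycle_times_days, cycle_times_days[1:])
    ]


rates = emergence_rates([10.0, 8.0, 8.0])
```

A run of zero or negative values is the signal that retros are not feeding back into specs, context, or constraints.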

31 · Learn & Compound

Build library of proven spec templates

When a spec produces excellent agent output, templatize it. Over time, your spec library becomes your competitive advantage - new features start from proven patterns, not blank pages.

32 · Learn & Compound

Track and reduce cognitive debt

Cognitive debt (Karpathy): the hidden cost of poorly managed AI interactions. Track: time spent re-prompting, context assembly time, review cycles per feature. Reduce systematically.

33 · Learn & Compound

Update harness constraints from production data

Production incidents and quality issues feed back into harness constraints. Every bug prevented by a constraint is a cycle saved in the future.

34 · Learn & Compound

Prune and refresh context quarterly

Context degrades. Stale embeddings, outdated conventions, deprecated patterns - all poison agent output. Schedule quarterly context audits like you schedule dependency updates.
