What Anthropic's Engineering Research Tells You About Your Own Team
Anthropic studied their own engineers. 27% net-new work, personal delegation heuristics with no shared framework, supervision paradox. The data makes the case for measurement.
Anthropic published internal research in December 2025 on how AI is transforming work at Anthropic. They studied their own engineers. The findings map directly to what every AI-native product team is navigating. They did not frame it that way, but the data makes the case.
Here is what they found, and why it matters.
Finding 1: 27% More Work That Would Not Exist Without AI
Anthropic engineers report that 27% of their current work is net-new, not existing work done faster. That is work that would not have been attempted without AI assistance.
More work means more decisions. Not just more of the same decisions, but genuinely new decisions in new territory. No historical precedent. No established team judgment. No shared framework for evaluating quality.
This is not a speed problem. It is a decision-volume problem.
Finding 2: Personal Delegation Heuristics With No Shared Framework
Engineers at Anthropic have developed individual heuristics for what to delegate to AI. Which tasks go to Claude. Which they hold. When to verify. When to trust.
These heuristics are personal, uncalibrated, and invisible to the rest of the team. Six engineers on the same team have six different frameworks. None of them have been validated against outcomes.
This is judgment debt forming in real time.
Finding 3: The Supervision Paradox
This one is underappreciated. The more you delegate to AI, the less equipped you become to evaluate its output.
Anthropic's own engineers flagged this. As they handed off more complex tasks, their ability to catch errors in AI-generated output decreased. The skills that atrophy fastest are exactly the ones needed to supervise AI effectively.
Calibrated scoring is not a nice-to-have for AI-native teams. It is organizational infrastructure.
Finding 4: Mentorship Is Declining
Senior engineers at Anthropic report that junior engineers no longer come to them with questions. They go to Claude. The informal knowledge transfer that has always carried institutional judgment is quietly disappearing.
What replaces it? Nothing, unless you build it deliberately. One answer: codify the judgment, make it queryable, track outcomes. That is measurement infrastructure, not tooling.
Finding 5: Everyone Is Full-Stack, Making Decisions Outside Their Expertise
With AI assistance, engineers are doing more product work. PMs are doing more technical scoping. The org chart is blurring. The decision-making responsibility is widening faster than the expertise to support it.
People are making consequential calls in domains where they do not have calibrated judgment. They need a framework. They are not going to build one themselves.
The Measurement Layer
Each finding points to the same gap: teams that are accelerating without a system for calibrating the quality of the decisions they are making at higher volume.
| Research Finding | What the measurement layer addresses | |---|---| | 27% net-new work (more uncalibrated decisions) | Score decision quality across 88 scoring criteria | | Personal delegation heuristics | Calibration dimensions: Spec quality, Decision quality, AI adoption maturity | | Supervision paradox | Track review quality and outcome correlation over time | | Mentorship gap | Codify scoring as shared institutional knowledge | | Everyone full-stack | Framework applies to PMs, engineers, and leads alike |
Why this matters now
AI is not an equalizer. It is a multiplier. Teams that are aligned compound faster. Teams that are misaligned drift faster. The gap is accelerating.
Every person on a team is a vector with magnitude (talent and energy) and direction (alignment to strategy). Anthropic's findings show that magnitude is increasing, but direction is becoming harder to calibrate. More decisions, more delegation, less shared judgment. The resultant vector shortens even as individual capability grows.
The measurement layer for this era is not optional. It is the infrastructure that turns AI-multiplied velocity into AI-multiplied outcomes. That is what DAC is built to be: not a productivity tool, but the calibration layer underneath everything else.
---
*Start DAC's first day at app.dacard.ai. No credit card required.*
*Source: Anthropic internal research, December 2025, n=132 engineers, 53 interviews, 200K Claude Code transcripts.*
Darren Card
Founder, Dacard.ai
See your diagnostic
Free to start. Day-1 read in your inbox.