Engineer Burnout Metrics: How to Detect and Prevent Engineering Team Exhaustion
Burned-out engineers are 2.6× more likely to resign — yet most engineering leaders notice the problem only after the resignation letter lands. That's because burnout in software engineers does not look like low output. It looks like high output, until it suddenly stops. This guide covers the observable engineer burnout metrics you can measure from your existing GitHub data, and what to do when you find them.
What this guide covers
Why burnout is uniquely invisible in engineering, the six observable data signals that precede it, why DORA metrics miss burnout entirely, how to tell crunch from chronic exhaustion, an operational playbook for intervention, and how Koalr surfaces burnout risk automatically from GitHub and Slack activity.
The Hidden Cost of Engineering Burnout
McKinsey's 2023 research found that burned-out employees are 2.6× more likely to leave their employer. In software engineering — where replacing a mid-level engineer costs 50–200% of annual salary once you factor in recruiting, onboarding, and the months of reduced productivity on both sides — that attrition multiplier is not a soft people-issue. It is a balance-sheet event.
Gallup's research on active disengagement puts the cost at roughly $3,400 per $10,000 of salary: a $180,000 senior engineer who is burned out and quietly disengaged is costing the organization approximately $61,200 per year in lost productivity before they have even handed in their notice.
What makes engineering burnout uniquely difficult is that its early stages are indistinguishable from high performance. A burned-out engineer typically increases their commit volume, works later hours, and pushes larger changes in the weeks before the cliff. They are trying harder, not less hard — compensating for a growing sense of falling behind. The throughput drop comes after the sustained surge, not before it. By the time output drops visibly, the resignation decision has often already been made.
Most engineering leaders notice burnout only at the point of resignation. The data to detect it weeks earlier already exists in your GitHub commit logs and PR history. The problem is knowing what patterns to look for.
What Engineering Burnout Actually Looks Like in Data
Burnout measurement does not require a feelings survey, though surveys have their place. The behavioral precursors of burnout leave observable traces in version control. Here are the six signals to watch for:
Commit pattern change
A developer's commit volume spikes 40% or more above their personal 90-day baseline and sustains that elevation for three or more weeks. Alongside the volume spike, late-night commits (after 9pm) and weekend commits appear with increasing frequency. This is the compensatory surge phase — the engineer is working more hours to maintain their output under growing cognitive load. The spike is followed, weeks later, by a throughput cliff where output drops below their own pre-surge baseline.
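The surge pattern described above can be sketched as a simple check against a trailing personal baseline. This is a minimal illustration, not Koalr's actual implementation; the function name, the weekly-bucketed input shape, and the 1.4× ratio (the "40% or more" from the text) are assumptions for the sketch.

```python
from statistics import mean

def sustained_surge(weekly_commits, baseline_weeks=13, spike_ratio=1.4, min_weeks=3):
    """Flag a compensatory surge: weekly commit counts at least spike_ratio
    times the developer's own trailing baseline for min_weeks straight.

    weekly_commits: list of per-week commit counts, oldest first. The first
    baseline_weeks entries (roughly 90 days) form the personal baseline.
    """
    if len(weekly_commits) < baseline_weeks + min_weeks:
        return False
    baseline = mean(weekly_commits[:baseline_weeks])
    streak = 0
    for count in weekly_commits[baseline_weeks:]:
        # Reset the streak the moment a week falls back under the threshold.
        streak = streak + 1 if count >= spike_ratio * baseline else 0
        if streak >= min_weeks:
            return True
    return False
```

Note that the comparison is against the developer's own history, never a team-wide number, which matters for the crunch-vs-burnout distinction discussed later in this guide.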
PR size explosion
PRs get larger and less careful. Lines of code per PR increase while review request breadth decreases — the engineer stops tagging multiple reviewers and starts requesting the minimum. PR descriptions become shorter or disappear entirely. Larger, sloppier PRs under time pressure are both a signal of burnout and an amplifier of it: they take longer to review, accumulate feedback, and create the rework cycles that deepen the exhaustion.
Review quality drop
When engineers are overloaded, code review is the first thing to go. Watch for a drop in inline comments per review — a developer who normally leaves 8–12 comments per review dropping to 1–2 — and a dramatic reduction in time-to-first-review. Paradoxically, faster reviews are a warning sign here: they indicate rubber-stamping rather than genuine engagement. A burned-out engineer is approving PRs to clear the queue, not to ensure code quality.
Rework rate spike
PRs failing CI on first push, PRs with three or more revision cycles before merge, and an increase in reverted deployments from a specific author are all rework signals. Rework rate measures the ratio of revision effort to initial effort — a rising rework rate indicates the developer is producing less coherent changes and catching fewer errors themselves before requesting review.
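The "ratio of revision effort to initial effort" can be computed directly once commits per PR are split into pre-review and post-review batches. A minimal sketch, assuming you have already classified each merged PR's commits into (initial, revision) counts; the tuple shape and function name are illustrative:

```python
def rework_rate(prs):
    """Rework rate across a set of merged PRs.

    prs: list of (initial_commits, revision_commits) tuples, where
    revision_commits are pushes made after the first review request.
    Returns revision effort divided by initial effort.
    """
    initial = sum(i for i, _ in prs)
    revision = sum(r for _, r in prs)
    return revision / initial if initial else 0.0
```

A rising value over successive 30-day windows is the signal; the absolute number matters less than its trend against the developer's own baseline.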
After-hours activity normalization
Occasional late commits are normal. A pattern where more than 25–30% of a developer's commits occur after 8pm or before 7am, sustained across multiple weeks, indicates that after-hours work has become the norm rather than the exception. This is both a signal of current overload and a structural risk — the normalization of after-hours work is one of the primary mechanisms through which individual burnout becomes team culture.
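The after-hours share is straightforward to compute from commit timestamps, provided they are normalized to the author's local time zone first (GitHub stores author timestamps with the author's offset). A minimal sketch with illustrative names:

```python
from datetime import datetime

def after_hours_rate(commit_times, start_hour=20, end_hour=7):
    """Share of commits made after 8pm or before 7am, author local time.

    commit_times: list of datetime objects already converted to the
    author's local time zone.
    """
    if not commit_times:
        return 0.0
    late = sum(1 for t in commit_times if t.hour >= start_hour or t.hour < end_hour)
    return late / len(commit_times)
```

Sustained readings above roughly 0.25 to 0.30 across several weeks correspond to the normalization threshold described above.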
Throughput cliff
After a sustained above-normal velocity period, a developer's productivity drops below their own pre-surge baseline — not just returning to normal, but undershooting it. The cliff is the most recognizable signal, but it is also the latest: by the time the drop is visible, the damage to engagement and retention is already done. Leading indicators matter precisely because they precede the cliff by weeks.
5 Leading Indicators You Can Measure from GitHub Data
The following five signals can be derived from commit timestamps and PR metadata without any additional tooling. Each has an approximate burnout threshold — not a hard cutoff, but a level at which the pattern warrants a conversation.
| Signal | What to measure | Burnout threshold |
|---|---|---|
| Late-night commit rate | % of commits pushed after 9pm (author local time) | >30% for 2+ weeks |
| Weekend commit rate | % of commits on Saturday or Sunday | >20% for 3+ weeks |
| PR rework cycles | Avg request-changes rounds per PR (vs personal baseline) | >2.5× personal baseline |
| Review turnaround drop | Median time-to-first-review given by this author | Drops >50% vs 90-day avg |
| After-hours PR opens | % of PRs opened between 8pm and 7am | >25% sustained over 2+ weeks |
These thresholds are relative, not absolute. A developer who has always committed 25% of their work on weekends — because that is genuinely when they prefer to work — is not showing a burnout signal. The detection logic must compare each developer against their own historical baseline, not against a team average or a fixed rule. Deviation from personal norm is the signal; the norm itself is not.
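The weekend-commit row of the table illustrates how a baseline-relative check works in practice: a signal should fire only when the rate exceeds both the absolute threshold and a multiple of the developer's own historical rate. A minimal sketch, with the 1.5× personal-norm margin as an assumed parameter rather than a documented Koalr value:

```python
from datetime import datetime

def weekend_rate(commit_times):
    """Share of commits on Saturday (weekday 5) or Sunday (weekday 6)."""
    if not commit_times:
        return 0.0
    return sum(1 for t in commit_times if t.weekday() >= 5) / len(commit_times)

def flag_weekend_signal(recent, baseline, threshold=0.20, margin=1.5):
    """Fire only if the recent weekend rate clears both the table's
    absolute threshold (>20%) and a margin over the developer's own norm,
    so habitual weekend workers are not flagged."""
    r, b = weekend_rate(recent), weekend_rate(baseline)
    return r > threshold and r > margin * b
```

The same two-condition pattern (absolute floor plus deviation from personal baseline) applies to every row in the table above.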
Why DORA Metrics Alone Miss Burnout
DORA metrics — Deployment Frequency, Lead Time, Change Failure Rate, MTTR — are lagging indicators of team system health. They are excellent for measuring delivery pipeline efficiency, but they are structurally blind to individual engineer sustainability. A burned-out team can maintain DORA numbers while accruing catastrophic unseen debt.
Here is the sequence: a team under pressure maintains Deployment Frequency by rushing changes through (the engineer is working more, not less). Lead Time stays low because the same overloaded engineers are also rubber-stamping reviews faster. Change Failure Rate begins to creep up as review quality drops and code coherence deteriorates — but CFR is a trailing indicator that typically takes weeks of degraded quality to register. By the time CFR is visibly elevated, three engineers may have already submitted their resignations.
The failure mode is that DORA numbers can look better than ever in the weeks immediately before a burnout-driven attrition wave: high frequency, low lead time, CFR still within normal range. The throughput cliff and CFR spike happen after the damage is done. Leading indicators that operate at the individual developer level — not the team aggregate — are what catch this in time to intervene.
Burnout vs. Crunch: How to Tell the Difference
The key distinction
Crunch is time-bounded and team-agreed. Burnout is chronic and individual. Crunch resolves when the sprint ends. Burnout compounds with every additional week. The data pattern that distinguishes them is whether the whole team moves together and returns to baseline — or whether one developer diverges from the team and keeps going.
In a genuine crunch period — a launch sprint, a critical incident response, a regulatory deadline — the data looks like this: the whole team's commit velocity and after-hours activity rise together, and within one to two weeks after the deadline passes, the metrics return to each person's personal baseline. The signal is synchronized across the team and self-limiting.
Burnout looks different in data. A specific developer's metrics diverge from the team pattern. While the rest of the team returns to baseline after the sprint, this developer's after-hours commits continue or increase. Their rework rate keeps climbing while the team's stabilizes. The divergence is individual, not collective.
The statistical threshold that Koalr uses internally: a developer who sits two standard deviations above their personal 90-day commit baseline for four or more consecutive weeks, while the team average remains within one standard deviation of its own baseline, triggers a burnout-risk flag rather than a crunch flag. The duration and the individual divergence are both required to trigger the signal.
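That rule can be expressed in a few lines. This is a reconstruction from the description above, not Koalr's actual code; the weekly-bucketed inputs and function name are assumptions:

```python
from statistics import mean, stdev

def burnout_vs_crunch(dev_weekly, team_weekly, baseline_weeks=13, min_weeks=4):
    """Flag burnout risk when the developer sits 2+ standard deviations
    above their own ~90-day baseline for min_weeks consecutive weeks
    while the team average stays within 1 standard deviation of its own.

    Both lists are per-week commit counts, oldest first; the first
    baseline_weeks entries form each baseline window.
    """
    d_base, t_base = dev_weekly[:baseline_weeks], team_weekly[:baseline_weeks]
    d_mu, d_sd = mean(d_base), stdev(d_base)
    t_mu, t_sd = mean(t_base), stdev(t_base)
    streak = 0
    for d, t in zip(dev_weekly[baseline_weeks:], team_weekly[baseline_weeks:]):
        dev_elevated = d > d_mu + 2 * d_sd      # individual divergence
        team_normal = abs(t - t_mu) <= t_sd     # team still at baseline
        streak = streak + 1 if (dev_elevated and team_normal) else 0
        if streak >= min_weeks:
            return "burnout-risk"
    return "no-flag"
```

A synchronized team-wide spike never satisfies `team_normal`, so a genuine crunch period produces no flag even if the individual condition holds.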
What to Do When You Detect Burnout Signals
Detection without action is worse than no detection — it creates the impression that the organization sees the problem and has chosen not to address it. When burnout signals appear in your data, the following playbook gives engineering managers a structured response.
Burnout intervention playbook
Do not confront with data
The commit patterns are a signal to have a conversation, not evidence for a performance review. Lead with genuine curiosity about workload and energy levels. Presenting the data directly often triggers defensiveness and damages psychological safety. The metrics told you when to have the conversation — they should not be the content of it.
Redistribute review load
Burned-out developers frequently accumulate review queues that nobody else has cleared. Check the review load distribution across the team. If one developer is handling 40%+ of all review requests while others handle 10%, rebalance the ownership explicitly — not as a quiet process change, but as an acknowledged reduction in load.
Enforce PR size limits
Large PRs under time pressure are a burnout amplifier. A 1,200-line PR accumulates review feedback over days, generates multiple revision cycles, and creates mounting context-switching cost. Set an explicit team norm — 400 lines as a soft ceiling, 600 as a hard ceiling — and enforce it in code review culture, not just tooling.
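A soft/hard ceiling like the one suggested above is easy to encode as a pre-merge check, for example in a CI step that reads the PR's additions and deletions from the GitHub API. The function below is a hypothetical sketch of that check; the 400/600-line ceilings are the norms named in the text:

```python
def pr_size_verdict(additions, deletions, soft=400, hard=600):
    """Classify a PR by total changed lines against team ceilings:
    'ok' under the soft ceiling, 'warn' between soft and hard,
    'block' above the hard ceiling."""
    changed = additions + deletions
    if changed > hard:
        return "block"
    if changed > soft:
        return "warn"
    return "ok"
```

The cultural half still matters: a "warn" should prompt a conversation about splitting the change, not just a label on the PR.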
Audit after-hours work culture
After-hours commit normalization is structural, not individual. If your Slack shows the team lead posting at 11pm and engineering management responding within minutes, the cultural signal is that after-hours responsiveness is expected. Audit this explicitly: review Slack activity heatmaps and commit timestamps across the whole team, not just the individual flagged.
Track a 30-day recovery metric
After any intervention, track the developer's commit velocity, after-hours rate, and rework rate for 30 days. Recovery — returning to their personal pre-surge baseline — is the goal, not a return to the elevated pace. If metrics do not improve within 30 days of a load reduction, the structural cause has not been addressed and the intervention needs to go deeper.
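The recovery check amounts to asking whether the tracked metric has settled back near its pre-surge level. A minimal sketch, assuming weekly samples of one metric (here, the after-hours rate); the 4-week settling window and 10% tolerance are illustrative parameters, not prescriptions:

```python
def recovered(post_intervention_rates, baseline_rate, tolerance=0.1):
    """30-day recovery check for one metric (e.g. after-hours rate).

    post_intervention_rates: weekly samples since the intervention,
    oldest first. Recovery means the most recent ~4 weeks all sit
    within tolerance of the pre-surge personal baseline.
    """
    recent = post_intervention_rates[-4:]
    return all(r <= baseline_rate * (1 + tolerance) for r in recent)
```

Running the same check over commit velocity and rework rate gives the three-metric picture described above; if any of them fails after 30 days, the load reduction did not reach the structural cause.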
How Koalr Surfaces Burnout Risk
Koalr surfaces engineer burnout metrics through three interconnected features designed for engineering managers who want signal, not noise.
The Well-Being Tracker sends anonymous Slack micro-surveys — one to three questions, weekly — covering five pillars: workload, autonomy, recognition, growth, and psychological safety. Responses are aggregated at the team level so no individual response is identifiable, giving engineers genuine safety to be honest. Trends over time surface whether a team's well-being is improving or degrading week over week.
The Work Log heatmap renders a contribution graph — similar to GitHub's contribution calendar — but filtered by author, time of day, and day of week. An engineering manager can immediately see whether a developer's commits are concentrated in off-hours, whether the pattern has changed recently, and how an individual compares to the rest of the team. Overconcentration is visually obvious in the heatmap in a way that aggregate statistics obscure.
The rework rate per developer view shows how each engineer's revision cycle frequency has trended over the past 30, 60, and 90 days — and flags individuals whose rework rate has increased more than one standard deviation above their personal baseline. This is the data signal most directly correlated with the "trying harder but producing less coherently" pattern that precedes burnout.
For engineering managers who want to ask direct questions, Koalr's AI chat can answer queries like "who on the team is working the most unsustainable hours?" against real commit and PR timestamp data — returning a ranked list with the supporting evidence. The same query can be scoped to a team, a time window, or a specific repository.
These features sit inside the broader engineering manager use case in Koalr — alongside DORA tracking, PR flow metrics, and deployment risk scoring — so burnout signals are visible in the same context as delivery performance, not siloed in a separate HR tool.
Monitor your team's sustainability with Koalr
Koalr surfaces engineer burnout metrics automatically — commit pattern deviations, after-hours activity trends, rework rate spikes, and anonymous well-being surveys — so you catch exhaustion before it becomes resignation.