Theory of Constraints for software delivery: find your bottleneck before hiring more people

The speedup instinct is usually wrong

When delivery slows down, the instinct is to do more of everything. Add a developer. Run more standups. Break tickets smaller. Push the team harder. If some is good, more is better.

Eliyahu Goldratt spent a career arguing this was exactly backwards. His book The Goal, published in 1984 and set in a factory, makes one point over and over: a system's throughput is limited by exactly one constraint at a time. Every resource that isn't the constraint has surplus capacity. Speeding up a non-constraint doesn't make the system faster. It just builds up a pile of work in front of the actual bottleneck.

Software teams rediscover this constantly. They invest in faster CI pipelines, then the backlog sits in code review for a week. They hire a second QA engineer, but the bottleneck moves to staging environment availability. The constraint shifts. The improvement doesn't stick. The team is baffled.

Goldratt's remedy was the Five Focusing Steps: identify the constraint, exploit it, subordinate everything else to it, elevate it if you have to, then repeat because the constraint will move. You don't need a factory floor to apply this. You need to know where work gets stuck.

What a constraint looks like in software

In a physical factory, the bottleneck is obvious. One machine runs at full capacity while every other machine waits on it. You can see the queue building on the floor.

Software work is invisible. Items sit in "In Progress" or "In Review" and nobody has a clear view of how long they've been there, which person is overloaded, or whether the team is blocked on decisions more than execution.

The most common constraints in software delivery aren't what teams think they are. Code review is a frequent one. Not because developers are slow to review, but because too many items are open at once and reviews happen in batches rather than continuously. Staging environment contention is another. So is product approval, architecture sign-off, or the one senior engineer who's a required reviewer on every security-sensitive change.

Research on software team performance consistently shows that teams with the highest throughput aren't the ones working the hardest. They're the ones with the shortest queues and the fewest handoffs. Recent DORA research found that organizations investing in AI coding tools without first addressing their deployment or review bottlenecks often made things worse: more code was generated, but it piled up in the same constrained review and integration stages, making lead times longer, not shorter.

Why optimizing non-bottlenecks is waste

This is the part that feels counterintuitive until you've seen it a few times.

Imagine a team where the bottleneck is code review. Developers are fast. Testing is fast. Deployment is automated. But there are only two senior engineers who do meaningful reviews, and they're also the ones running the architecture discussions, the oncall rotation, and the stakeholder demos.

You decide to hire two more developers. Now there are more pull requests waiting for the same two reviewers. Lead time goes up, not down. Throughput stays flat. The two new developers are busy, but the system isn't faster.

Goldratt called this "local optimization." You made one part of the system more efficient. But the system's output is determined by its slowest stage, not by the average of all stages.
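The hiring scenario above can be sketched as a toy two-stage pipeline. This is a minimal illustration, not a model of any real team: the function name and the weekly rates (8 or 12 pull requests opened, 6 reviewed) are made-up assumptions chosen only to show the shape of the effect.

```python
def simulate(dev_rate, review_rate, weeks):
    """Toy two-stage pipeline: developers open PRs, reviewers merge them.

    Rates are items per week (hypothetical numbers).
    Returns (total merged, PRs still waiting for review).
    """
    queue = 0
    merged = 0
    for _ in range(weeks):
        queue += dev_rate                # new PRs land in the review queue
        done = min(queue, review_rate)   # reviewers can only clear so many
        queue -= done
        merged += done
    return merged, queue


before_hiring = simulate(dev_rate=8, review_rate=6, weeks=10)
after_hiring = simulate(dev_rate=12, review_rate=6, weeks=10)

print(before_hiring)  # (60, 20)
print(after_hiring)   # (60, 60)
```

Under these assumptions, both runs merge 60 items in ten weeks. The only thing the extra developers change is the size of the review queue, which is why lead time rises while throughput stays flat.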

Exploiting the constraint, before elevating it, means squeezing more out of what you already have. If code review is the bottleneck, you don't immediately hire another senior engineer. You first ask: are reviews waiting because of process, not capacity? Could you review smaller diffs? Could you batch reviews into dedicated morning slots and stop context-switching reviewers? Could you reduce the number of things in progress so fewer reviews are outstanding at once?

Subordinating the rest of the system to the constraint means the developers should probably slow down their output rate to match what the reviewers can absorb. That's also counterintuitive. But starting new work when there's a queue in front of reviewers just grows the queue. It doesn't help.
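One way to picture subordination is a cap on the review queue: developers only open new pull requests while there is room in front of the reviewers. The sketch below is a toy model under invented numbers (12 PRs per week possible, 6 reviewed per week, a queue limit of 8), and the function name is hypothetical.

```python
def simulate_with_queue_limit(dev_rate, review_rate, queue_limit, weeks):
    """Toy pipeline where developers stop opening PRs once the review
    queue reaches queue_limit. Returns (total merged, PRs still waiting)."""
    queue = 0
    merged = 0
    for _ in range(weeks):
        # Developers only open as many PRs as the limit leaves room for.
        opened = min(dev_rate, max(0, queue_limit - queue))
        queue += opened
        done = min(queue, review_rate)   # reviewers remain the constraint
        queue -= done
        merged += done
    return merged, queue


# A very large limit behaves like having no limit at all.
unlimited = simulate_with_queue_limit(12, 6, queue_limit=10**9, weeks=10)
limited = simulate_with_queue_limit(12, 6, queue_limit=8, weeks=10)

print(unlimited)  # (60, 60)
print(limited)    # (60, 2)
```

In this toy run the limit costs nothing in throughput, because the reviewers were never idle. It only removes the pile of waiting work, which is the point of matching the start rate to the constraint.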

The Friday status call that hides the bottleneck

In most teams, the ritual that's supposed to surface this information is the weekly status call or the sprint retrospective. "How are we doing? Any blockers? What's slow?"

The problem is that people report on what they feel, not on where the work actually is. A developer who is fast but feeding into a stuck review queue feels productive. A reviewer who is overloaded but managing to work through the queue heroically doesn't want to say "I'm the bottleneck." The retrospective produces observations like "we need better communication" rather than "eight items have been in review for over two weeks."

That's what flow data is for.

Where GoalPath makes the constraint visible

GoalPath's Velocity Analytics page has a section called Flow Metrics. It reports four metrics that, read together, tell you a lot about where your constraint probably is.

The first is Flow Time: median lead time (from creation to done), median cycle time (from started to done), and wait time (how long items sit before anyone touches them). If lead time is long but cycle time is short, items are spending most of their life waiting, either waiting to be started or sitting in review. That gap is the signal.
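To make the three definitions concrete, here is a small sketch that computes them from timestamps. It is not GoalPath's implementation: the item data, the tuple layout, and the function names are assumptions for illustration, and wait time is taken here as the gap between creation and start.

```python
from datetime import datetime
from statistics import median

# Hypothetical items as (created, started, done) timestamps.
items = [
    (datetime(2024, 1, 1), datetime(2024, 1, 8), datetime(2024, 1, 10)),
    (datetime(2024, 1, 2), datetime(2024, 1, 12), datetime(2024, 1, 13)),
    (datetime(2024, 1, 3), datetime(2024, 1, 9), datetime(2024, 1, 12)),
]


def days(delta):
    """Convert a timedelta to fractional days."""
    return delta.total_seconds() / 86400


def flow_time(items):
    """Median lead, cycle, and wait time in days."""
    lead = [days(done - created) for created, started, done in items]
    cycle = [days(done - started) for created, started, done in items]
    wait = [days(started - created) for created, started, done in items]
    return {"lead": median(lead), "cycle": median(cycle), "wait": median(wait)}


print(flow_time(items))  # {'lead': 9.0, 'cycle': 2.0, 'wait': 7.0}
```

With this sample data the median item takes nine days end to end but only two days of that is after someone started it. That is the long-lead, short-cycle gap described above.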

The second is Flow Efficiency: the ratio of active time to total lead time. Industry benchmarks put the average software team somewhere between 15% and 25%. Below 15% means more than 85% of an item's life is spent waiting, not being worked on. That's not a people problem. That's a queue problem, which is a constraint problem.
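The ratio itself is simple arithmetic. The sketch below assumes you already know an item's active days and total lead days; the function name and the sample values (3 active days out of 24) are hypothetical.

```python
def flow_efficiency(active_days, lead_days):
    """Share of an item's lead time that was spent being actively worked on."""
    if lead_days <= 0:
        raise ValueError("lead_days must be positive")
    return active_days / lead_days


efficiency = flow_efficiency(active_days=3, lead_days=24)
waiting_share = 1 - efficiency

print(f"{efficiency:.1%} active, {waiting_share:.1%} waiting")
# 12.5% active, 87.5% waiting
```

An item that was worked on for three days but took 24 days to finish lands at 12.5%, below the 15% mark, even though nobody involved was slow.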

The third is Flow Load, which shows total items in progress and breaks them into aging buckets: fresh (under 3 days), normal (up to 7 days), aging (up to 14 days), stale (up to 30 days), and stuck (over 30 days). Stuck items are the most direct signal. An item that's been "in progress" for over a month isn't in progress. It's blocked on something, and that something is probably your constraint. If you have five stuck items and they all belong to the same person, you've found the overloaded node.
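The aging buckets can be expressed as a short classifier. The thresholds come from the bucket descriptions above; whether each boundary day belongs to the lower or upper bucket is an assumption here, as are the function name and the sample ages.

```python
from collections import Counter


def aging_bucket(days_in_progress):
    """Map days in progress to an aging bucket.

    Boundary handling (for example, exactly 7 days counting as "normal")
    is an assumption of this sketch.
    """
    if days_in_progress < 3:
        return "fresh"
    if days_in_progress <= 7:
        return "normal"
    if days_in_progress <= 14:
        return "aging"
    if days_in_progress <= 30:
        return "stale"
    return "stuck"


# Hypothetical ages, in days, of the items currently in progress.
ages = [1, 5, 9, 20, 31, 45]
counts = Counter(aging_bucket(age) for age in ages)

print(counts["stuck"])  # 2
```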

Flow Load also shows WIP per person. If one team member has seven items in flight while others have one or two, the work isn't actually flowing to the constraint. It's piling up with one person. That's a distribution problem, but the fix often reveals the constraint: that person is overloaded because they're the only one who can do a specific thing.
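A per-person WIP count is easy to reproduce from any list of in-progress items. In this sketch the item IDs, the names, and the rule for flagging someone as overloaded (more than twice the team's median WIP) are all illustrative assumptions, not something GoalPath is documented to do.

```python
from collections import Counter
from statistics import median

# Hypothetical in-progress items as (item id, owner) pairs.
in_progress = (
    [(f"ITEM-{n}", "ana") for n in range(1, 8)]   # seven items in flight
    + [("ITEM-8", "ben"), ("ITEM-9", "ben")]
    + [("ITEM-10", "chen")]
)


def wip_per_person(items):
    """Count in-progress items per owner."""
    return Counter(owner for _, owner in items)


def overloaded(wip, factor=2):
    """People whose WIP exceeds factor times the team median (arbitrary rule)."""
    typical = median(wip.values())
    return sorted(person for person, count in wip.items() if count > factor * typical)


wip = wip_per_person(in_progress)
print(dict(wip))        # {'ana': 7, 'ben': 2, 'chen': 1}
print(overloaded(wip))  # ['ana']
```

The flag is only a starting point. The useful follow-up question is what that person can do that nobody else can.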

The fourth metric, Flow Distribution, breaks down what type of work is flowing (features, bugs, tasks). A team where bugs are crowding out features is often one where quality issues are the hidden constraint: defects from earlier work are consuming the capacity that should be going to new delivery.

[Figure: GoalPath board view showing items across Not Started, Started, Finished, Delivered, and Accepted columns with team assignments]

A practical way to find your constraint

Open the Flow Metrics view and look at the stuck count in Flow Load. If stuck is more than zero, start there. Click through to the items themselves and ask: why is this item still in progress? Who last touched it? What's it waiting for?

Often you'll find a pattern. All the stuck items are waiting for a specific person. Or they're all in the same milestone. Or they all require a specific environment that isn't available. That's your constraint.

Then look at Flow Efficiency. If it's below 15%, items are spending most of their life waiting rather than being worked on. The question is: waiting on what? Cross-reference with Flow Load per person to see if the wait is concentrated around a specific bottleneck.

If wait time is high but no single person stands out, look at the Velocity Analytics section for coordination impact. GoalPath calculates a multitasking penalty when the team is spread across many active milestones simultaneously. A 30% velocity reduction from coordination overhead means the constraint might not be a person or a stage. It might be attention, spread too thin across too many parallel tracks of work.

[Figure: GoalPath unplanned work analysis and triage health showing WIP distribution across milestones]

What to do once you find it

The constraint dictates the system's throughput. Your job is first to exploit it: get more out of it without spending more. Then subordinate: don't start work faster than the constraint can absorb it.

If code review is the constraint, that means:

Keep the review queue short, even if it means developers start fewer things
Clear the oldest reviews first, not the newest (FIFO, not LIFO)
Make the reviewer's time protected and uninterrupted
Reduce diff sizes so reviews take less time per item
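
The oldest-first rule from the list above amounts to sorting the review queue by when each review was requested. A minimal sketch, with made-up pull request IDs and dates:

```python
from datetime import date

# Hypothetical open reviews: (pull request id, date the review was requested).
open_reviews = [
    ("PR-104", date(2024, 3, 11)),
    ("PR-098", date(2024, 3, 4)),
    ("PR-101", date(2024, 3, 7)),
]


def review_order(reviews):
    """Return reviews oldest-first, so the longest-waiting item is cleared first."""
    return sorted(reviews, key=lambda review: review[1])


for pr_id, requested in review_order(open_reviews):
    print(pr_id, requested.isoformat())
# PR-098 2024-03-04
# PR-101 2024-03-07
# PR-104 2024-03-11
```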

If architecture sign-off is the constraint, document the decision patterns so not every PR needs senior involvement. Pull the architect upstream to the design stage, not downstream to the review stage.

If staging availability is the constraint, shift work-in-progress limits upstream so fewer things compete for staging slots.

Goldratt's last step, repeat, is the one teams most often skip. When you break one constraint, a new one emerges somewhere else in the system. That's expected and fine. It means the system is improving. The point isn't to find a permanent bottleneck. It's to keep finding the current one and focusing effort there, instead of spreading it everywhere.

The ritual this replaces

Most teams answer "where are we?" with a weekly status call where people report on how busy they feel. GoalPath replaces that call with data. Flow metrics are calculated from actual item movement: when things were started, how long they sat, whether they're still moving. The weekly progress report GoalPath generates draws on the same underlying data.

The mechanism is straightforward: when your team moves work through GoalPath's guided workflow, every status transition is recorded. That data aggregates into the flow metrics, which are recalculated weekly. Instead of asking "is anyone blocked?" in a meeting, you look at the stuck count and the per-person WIP. The constraint is visible before it becomes a crisis.