Reading your forecast: what the probability bands actually mean

Someone asks "when will this ship?" and you feel the familiar pull to just give them a date. Pick something that sounds reasonable. A date that won't cause immediate alarm. You know in the back of your head it's optimistic, but the alternative (explaining uncertainty, talking about velocity variance, asking them to hold two dates in their head at once) feels harder than it's worth.
So you say "end of the quarter" and hope for the best.
Most teams I've talked to have been doing some version of this for years. Not because they're dishonest, but because the tooling trained them to think in single dates. You open your project management tool, there's a due date field, you put a date in it. Forecast delivered.
The problem is that a single date is not a forecast. It's a guess dressed up as a commitment.
What actually drives delivery time
Before getting into what the numbers mean, it's worth being precise about what determines when something ships.
The short version: remaining work divided by the rate at which the team completes work. That's it. The math is not complicated.
The hard part is that both inputs have variance. Remaining work changes. Scope creep is not an exception, it's the norm on most software projects. And team velocity fluctuates week to week. Some weeks the team ships 12 points, some weeks they ship 6. Bugs, sick days, an unexpected architecture decision, a dependency that wasn't ready. Real teams are not machines that run at constant throughput.
When you divide a changing numerator by a variable denominator, you don't get a date. You get a distribution. Some outcomes cluster around a central value, some skew later, occasionally things go faster than expected. Pretending otherwise doesn't make the uncertainty go away. It just hides it until the date passes.
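You can see this distribution emerge with a small simulation. The sketch below is illustrative, not GoalPath's internals: the velocity history, scope-growth rate, and percentile cutoffs are all made-up assumptions, and the Monte Carlo approach is one common way (see the further reading) to turn variable inputs into a range of outcomes.

```python
import random

# Hypothetical inputs: remaining work in points, plus a sample of
# recent weekly velocities (the team's actual delivery history).
remaining_points = 40
weekly_velocities = [12, 6, 9, 11, 7, 10]  # illustrative numbers
scope_growth_per_week = 1.5                # assumed average scope creep

def weeks_to_finish(rng: random.Random) -> int:
    """Simulate one possible future by resampling historical velocity."""
    work = remaining_points
    weeks = 0
    while work > 0:
        work += scope_growth_per_week          # the numerator keeps changing
        work -= rng.choice(weekly_velocities)  # the denominator varies too
        weeks += 1
    return weeks

rng = random.Random(42)
outcomes = sorted(weeks_to_finish(rng) for _ in range(10_000))

# The result is a distribution, not a date: read off percentiles.
optimistic = outcomes[len(outcomes) * 10 // 100]   # 10th percentile
likely = outcomes[len(outcomes) * 50 // 100]       # median
pessimistic = outcomes[len(outcomes) * 90 // 100]  # 90th percentile
print(optimistic, likely, pessimistic)
```

Run it a few times with different seeds and the percentiles barely move, even though any single simulated future bounces around. That's the point: the range is stable information; the single date never was.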
Research on the planning fallacy (Kahneman & Tversky) confirms that humans systematically underestimate how long things will take, even when they have prior experience with similar tasks. For software projects specifically, research suggests underestimates often run 40–60%. This is not a failure of individual competence. It's how human cognition works. You focus on the task ahead, not on everything that went sideways last time.
A good forecast system doesn't fight human nature by asking people to estimate better. It uses actual delivery history to build the range automatically.
The three scenarios GoalPath shows you
When you open the forecast panel on a milestone in GoalPath, you see three completion dates, not one. Optimistic, Most Likely, and Pessimistic. Each one represents a different assumption about how the remaining work will go.

Optimistic is not "best case if everything goes perfectly." It's the best case supported by your actual velocity history: GoalPath uses the upper range of your team's recent performance, the kind of output you've actually hit in good weeks, not a projection of what you could theoretically do. It's an achievable outcome, not a fantasy.
Most Likely is the central estimate, built from your average (effective) velocity adjusted for real-world factors: how many milestones the team is running in parallel (multitasking reduces effective velocity), the historical rate of scope additions to this type of work, and how many items in the milestone still lack estimates. This is the number you'd bet on if you had to pick one.
Pessimistic is not "what if everything goes wrong." It's what happens when the team runs at the lower end of its demonstrated velocity, accounting for the same adjustment factors. A realistic bad-week scenario, not a catastrophe.
The width of the range tells you something important: how predictable this milestone is. A forecast that shows Optimistic on March 12 and Pessimistic on March 13 means the remaining work is small and well-understood. A forecast showing March 12 and April 28 means there's real uncertainty: too many unestimated items, high velocity variance, or both. The range is not a bug in the display. It's the most useful information in it.
GoalPath shows you the calculation behind these numbers if you want to dig in. Click "Calculation Details" and you can see the effective velocity (post-penalty), the standard deviation, the multitasking penalty if one applies, the reactive planning buffer derived from your historical scope additions, and the exact formula used. This is not magic. It's arithmetic applied consistently to your actual delivery data.
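To make the three scenarios concrete, here is a minimal sketch of the kind of arithmetic the Calculation Details panel describes. Every parameter value and formula shape below is an assumption for illustration (a halving penalty for two parallel milestones, a 15% scope buffer, one standard deviation for the good-week and bad-week velocities); GoalPath's actual coefficients come from your delivery data.

```python
from statistics import mean, stdev

# Illustrative inputs; not GoalPath's real model.
weekly_velocities = [12, 6, 9, 11, 7, 10]
remaining_points = 40

avg = mean(weekly_velocities)
sd = stdev(weekly_velocities)

parallel_milestones = 2
multitask_penalty = 1 / parallel_milestones  # assumed penalty model
scope_buffer = 1.15                          # assumed +15% historical scope growth

effective = avg * multitask_penalty          # "effective velocity (post-penalty)"
adjusted_work = remaining_points * scope_buffer

# Good-week / average / bad-week velocities, all post-penalty.
optimistic = adjusted_work / ((avg + sd) * multitask_penalty)
most_likely = adjusted_work / effective
pessimistic = adjusted_work / ((avg - sd) * multitask_penalty)

print(f"{optimistic:.1f} / {most_likely:.1f} / {pessimistic:.1f} weeks")
```

Notice how the multitasking penalty widens every scenario at once, while high velocity variance (a large `sd`) spreads the optimistic and pessimistic dates apart. That's why the width of the range is diagnostic, not decorative.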
The delivery probability lines in your item list
Inside a milestone's item list, you'll notice thin colored horizontal lines appearing between items. These are delivery probability lines, and they're the most useful tool in GoalPath for having scope conversations.

The concept is straightforward. GoalPath takes your three velocity scenarios and asks: if the team works through this list in priority order, at optimistic/probable/pessimistic velocity, which item will be the last one delivered by the target week?
The green line is Best Case. Items above it will likely ship by the optimistic date.
The amber line is Probable. Items above it will likely ship if the team performs at its expected rate.
The red line is Worst Case. Items above it will likely ship even if things go slower than usual.
Items that fall below the red line have a meaningful probability of not shipping in the current cycle at all.
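The placement rule for these lines can be sketched in a few lines of code. The item names, point values, and per-scenario velocities below are invented for illustration; the logic (walk the prioritized list, accumulate points, stop when the scenario's budget runs out) is the general idea, not GoalPath's exact implementation.

```python
# Hypothetical prioritized item list: (title, story points).
items = [("auth flow", 5), ("billing API", 8), ("audit log", 3),
         ("export CSV", 5), ("dark mode", 8), ("i18n pass", 13)]

weeks_to_target = 3
# Assumed points-per-week for the three scenarios.
scenarios = {"green (best case)": 12, "amber (probable)": 9, "red (worst case)": 6}

lines = {}
for name, velocity in scenarios.items():
    budget = velocity * weeks_to_target
    done = 0
    line_after = None  # the line sits below the last item that fits
    for i, (title, points) in enumerate(items):
        if done + points > budget:
            break
        done += points
        line_after = i
    lines[name] = line_after
    if line_after is None:
        print(f"{name}: no items fit by the target")
    else:
        print(f"{name}: line after item #{line_after} ({items[line_after][0]})")
```

With these sample numbers, the green line lands after "dark mode", amber after "export CSV", and red after "audit log", so "i18n pass" sits below all three lines: exactly the kind of item the scope conversation should be about.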
This changes the conversation with stakeholders. Instead of "we think we'll ship all of this," you can point to the screen and say: "these six items are above the amber line, which means they'll ship in the expected scenario. These three are between amber and red, which means they'll ship if things go reasonably well but might slip. These two below the red line are uncertain, and we should talk about whether to cut them or push the timeline."
That conversation takes about two minutes. The alternative, where you guess at a date and then explain a slip two weeks before delivery, takes much longer and damages trust in a way that's hard to recover from.
What confidence level means
The confidence badge in the forecast panel (High / Medium / Low) reflects how variable your team's velocity has been over the measurement window. It tells you how much to trust the forecast range. It doesn't add a separate buffer on top of it.
High confidence means velocity has been consistent: the standard deviation is low relative to the average (under 50%). The gap between optimistic and pessimistic dates will be narrow, and the range is fairly reliable.
Medium confidence means moderate fluctuation. Velocity variance is in the 50–70% range. You'll see a wider spread between the three scenarios, and there's more room for surprise in either direction.
Low confidence means high variance, or that GoalPath fell back to global or historical velocity data instead of team-specific data. When the system doesn't have enough dedicated team velocity (fewer than 5 recent completed items from a dedicated team), it falls back to project-wide or historical averages, which caps confidence at Medium or Low regardless of variance. The uncertainty factors in the Calculation Details panel will tell you exactly why.
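The badge rules described above reduce to a small decision function. This is a sketch of those rules as stated in this section; the function name, signature, and the exact cap behavior on fallback are assumptions.

```python
def confidence(std_dev: float, avg_velocity: float,
               dedicated_items: int) -> str:
    """Sketch of the confidence badge rules described above."""
    cv = std_dev / avg_velocity  # velocity variance relative to the average
    if cv < 0.5:
        level = "High"
    elif cv <= 0.7:
        level = "Medium"
    else:
        level = "Low"
    # Fallback to global/historical data (fewer than 5 recent completed
    # items from a dedicated team) caps confidence, regardless of variance.
    if dedicated_items < 5 and level == "High":
        level = "Medium"
    return level

print(confidence(3.0, 10.0, 12))  # variance 30% -> High
print(confidence(6.0, 10.0, 12))  # variance 60% -> Medium
print(confidence(3.0, 10.0, 2))   # low variance, but fallback caps at Medium
```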
When you see a low-confidence forecast, the answer is not to distrust GoalPath. The answer is to ask why the variance is high. Usually it's one of two things: items in the milestone lack estimates (easy fix: add them), or the team is spread too thin across parallel work (harder fix, but worth addressing). The confidence badge is a diagnostic, not just a label.
How to use this in stakeholder conversations
The ritual GoalPath replaces here is the "when will this ship?" conversation where you improvise a date, stakeholders write it down, and everyone moves on hoping for the best. That ritual generates false precision and erodes trust when reality doesn't match the improvised date.
The replacement is not more complex. Pull up the milestone forecast panel or the item list and walk through the three scenarios together. It takes about two minutes. Something like:
"Our expected delivery is [Most Likely date], assuming normal velocity. Best case it could be as early as [Optimistic date]. If we hit the kind of delays we sometimes see, such as scope creep or a tricky dependency, it might push to [Pessimistic date]. The high-priority items are expected to ship even in the pessimistic scenario. These lower-priority ones are at risk if things go slowly. Do you want to talk about scope, or should we plan around the pessimistic date?"
Notice what this does: it replaces "we'll ship on X" with "here's the range and here's what determines where we land." It invites stakeholders into a trade-off conversation rather than a date negotiation. It makes scope the variable, not your credibility.
Most stakeholders respond well to this once you do it a few times. The initial discomfort ("why can't you just give me a date?") is mostly habit. When they see that the range comes from real data and that the tool updates automatically as the team delivers, the trust actually goes up, not down.
The key is to be consistent. Use the range every time. Don't give single dates in Slack and then pull out the range only when things are going badly. If the range is your communication standard from the beginning, it becomes the shared reality the team and stakeholders work from.
When forecasts should change your plan, not your date
One thing teams sometimes miss: GoalPath recalculates forecasts whenever something significant changes. Items completed, new items added, velocity changes over the measurement window. The forecast is not set at the start of a milestone and then compared to reality at the end. It's a live reading of where you stand.
If you check the forecast today and it says the pessimistic date is after a commitment you've made to a customer, that's not a forecast failure. That's the forecast working exactly as intended. Not knowing is worse.
The right response to a forecast that shows a date you don't like is not to question the forecast. It's to look at the item list, find what's above and below the probability lines, and decide whether to cut scope, add capacity, or have a timeline conversation now rather than in three weeks when there's no room to maneuver.

The math is not the hard part
I want to be honest about where this actually gets difficult. It's not the math. GoalPath handles the math. The hard part is the organizational habit change.
Teams that have been doing date-picking for years will initially feel that probability bands are evasive, even when they're more honest. Some stakeholders will push back. Some PMs will feel exposed when the range is wide because it makes visible the uncertainty they were previously hiding with a single number.
The discomfort passes. What doesn't pass is the slow erosion of credibility from repeatedly promising dates you can't keep. Three-point forecasts, used consistently, build a different relationship with stakeholders over time. One where they trust the numbers because the numbers have been right about the range, even when they weren't right about the single date.
GoalPath is opinionated about this. There's no single-date forecast, no due date field you fill in manually. The forecast comes from the data, not from what you think will make a stakeholder happy this week. That's a choice, and it reflects a belief that honest uncertainty is more useful than false precision.
A range is not a hedge. It's the most accurate thing you can say.
Further reading
- Using Monte Carlo Simulations to Predict Delivery Timelines: Agile Seekers
- Forecasting software project completion dates through Monte Carlo simulation: Towards Data Science
- Probabilistic forecasting in agile: Agile Digest
- Planning fallacy: Wikipedia (good summary of the research lineage from Kahneman/Tversky forward)
Ready to plan your roadmap with data?
Create an account on GoalPath and start tracking velocity, forecasting milestones, and delivering predictably.
Create an Account