Key Assumptions Check / Assumption Mapping

Status: Documented, not shipped · Evidence: P · Family: Assumption and belief challenge · Verdict: reject (2026-06-03)

What it is

Key Assumptions Check (KAC) is an assumption-ledger move: before trusting an analysis, plan, or bet, make a complete inventory of the things it silently takes for granted, then go down the list and rate how well each one actually holds. The durable cognitive operation is “surface the implicit, then sort it by load and confidence.” In the canonical intelligence-analysis form, the team lists every working supposition it can think of, then tags each one supported, supported-with-caveats, or unsupported; the unsupported ones are not discarded but reclassified as key uncertainties that drive further collection or research. The deliverable is a ranked register of assumptions, with the most load-bearing and least-supported ones flagged for testing first.
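
The ledger described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Heuer and Pherson procedure itself: the field names, the 1-5 load scale, the numeric shakiness weights, and the sample assumptions are all hypothetical. Only the three tags and the rule that unsupported items become key uncertainties come from the text.

```python
from dataclasses import dataclass

# The three tags come from the text. The numeric weights are an assumption of
# this sketch, added only so the register can be sorted (higher = shakier).
SHAKINESS = {"supported": 1, "supported-with-caveats": 2, "unsupported": 3}

@dataclass
class Assumption:
    text: str
    load: int      # hypothetical 1-5 scale: how much the conclusion rests on it
    support: str   # one of the keys of SHAKINESS

def priority(a: Assumption) -> int:
    """Load-bearing times shaky: a judgment score, not a probability."""
    return a.load * SHAKINESS[a.support]

def ranked_register(assumptions):
    """Return the ledger with the highest-priority assumptions first."""
    return sorted(assumptions, key=priority, reverse=True)

def key_uncertainties(assumptions):
    """Unsupported assumptions are kept and reclassified, not discarded."""
    return [a for a in assumptions if a.support == "unsupported"]

# Hypothetical sample ledger for a launch plan.
ledger = [
    Assumption("Support costs stay flat", load=2, support="supported"),
    Assumption("The sales channel stays open", load=5, support="supported-with-caveats"),
    Assumption("Customers will pay the planned price", load=5, support="unsupported"),
]

for a in ranked_register(ledger):
    print(f"{priority(a):>2}  {a.support:<22}  {a.text}")
```

The product of load and shakiness is one possible ordering among several; the inputs remain analyst judgments, so the resulting rank is a prompt for what to test first rather than a measurement.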

The same move travels under a second brand, “Assumption Mapping,” from the lean-startup and design-sprint world. There the assumptions behind a business idea are made explicit and plotted on an Importance by Evidence 2x2; the top-right quadrant (high importance, little evidence) is where the riskiest assumptions live and where experiments should run. The framing question used to generate the list is, verbatim, “What are all the things that need to be true for this idea to work?”
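
As a rough illustration of the 2x2, the following Python sketch sorts assumptions into quadrants. The 0-to-1 scores, the 0.5 midpoint, the quadrant labels, and the sample ideas are assumptions of this sketch rather than anything specified by the Assumption Mapping materials; in practice the placement is a team judgment made on a whiteboard.

```python
def quadrant(importance: float, evidence: float, midpoint: float = 0.5) -> str:
    """Place one assumption on an Importance by Evidence 2x2.

    Scores in [0, 1] and the midpoint are hypothetical conventions of this
    sketch; the labels returned are descriptive, not official names.
    """
    important = importance >= midpoint
    evidenced = evidence >= midpoint
    if important and not evidenced:
        return "test first"  # high importance, little evidence
    if important and evidenced:
        return "important, evidenced"
    if not evidenced:
        return "unimportant, little evidence"
    return "unimportant, evidenced"

# Hypothetical assumptions behind a business idea: (importance, evidence).
ideas = {
    "Restaurants want same-day delivery": (0.9, 0.2),
    "Drivers are available at peak hours": (0.8, 0.7),
    "Users care about the logo colour": (0.1, 0.1),
}

to_test = [name for name, (imp, ev) in ideas.items() if quadrant(imp, ev) == "test first"]
print(to_test)
```

Only the first sample idea lands in the high-importance, little-evidence quadrant, which is where the text says experiments should run.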

So under two different names the operation is one thing: turn a confident conclusion back into the conditions it rests on, then prioritize those conditions by how much weight they carry and how shaky they are. That priority structure - load-bearing crossed with uncertain - is the same importance-by-certainty plot that Mason and Mitroff formalized in 1981, and it is the same step the catalog already ships in what-would-have-to-be-true. The brand packaging differs (a three-way supported/caveated/unsupported tag in the intelligence version, a 2x2 grid in the lean version); the underlying ledger is identical.

When it helps / when it misleads

It helps wherever a plan or judgment has quietly hardened around premises nobody has stated out loud: a strategy resting on an unexamined demand assumption, an intelligence estimate carrying an inherited belief about an adversary, a launch plan that assumes a channel or a price point will hold. Forcing the implicit to become explicit, and then sorting the explicit by load and confidence, is a real discipline against the mind’s tendency to treat its own background beliefs as facts. It pairs naturally with a downstream step that actually tests the flagged assumptions rather than just listing them.

It misleads or wastes effort when:

The ledger is mistaken for the test. Listing and rating an assumption is not validating it. The most common failure mode is a tidy register that names the risky assumptions, then stops - the work that changes the decision is running the cheap test on the killer assumption, which the inventory itself does not do.
The rating is treated as measurement. “Supported / unsupported,” or a dot’s position on an Importance by Evidence grid, is a judgment call, not a calibrated probability. Dressing it as a quadrant can manufacture false precision - the grid looks quantitative but the placements are the same fallible analysts’ opinions that produced the assumptions.
It is run as ceremony on a reversible decision. For a cheap, easily-undone choice the assumption-ledger ritual costs more than the decision is worth.
It is pointed at a problem a sharper shipped method already owns. If the job is to reverse-engineer one option into the specific conditions under which it would be the best choice and name the one or two to test first, that is exactly what-would-have-to-be-true. If the job is to reconstruct how a conclusion was built from data, that is the ladder-of-inference-check. Reaching for a generically-named “list your assumptions” exercise in those cases yields a flatter version of a tool the catalog already has.

What the evidence says

The honest governing grade is P (practitioner). KAC and Assumption Mapping are real, named, widely-taught methods with clear provenance, but neither has controlled evidence that performing the move produces better decisions than skipping it.

What the record supports. The intelligence-community version is one of the core Structured Analytic Techniques (SATs) codified by Heuer and Pherson, taught across the U.S. Intelligence Community and adapted into other fields. The lean version is documented in Bland and Osterwalder’s Testing Business Ideas (2019) and is in wide practitioner use, including inside Google’s Design Sprint. Adoption and face-validity are genuine. That is the extent of the directly supported claim: respectable, long-lived practitioner methods for surfacing and prioritizing assumptions.

What the record does NOT support, and the laundering trap. There is no controlled or comparative study isolating the Key Assumptions Check itself and showing it improves analytic accuracy or decision outcomes. The most-cited empirical anchor is the RAND pilot study by Stephen Artner, Richard S. Girven, and James B. Bruce (RR1408, 2016), which found that intelligence publications using SATs generally addressed a broader range of potential outcomes and implications than analyses that did not. Read precisely, that is a finding about coverage and breadth, not about accuracy or hit-rate, and it is a small pilot, not a controlled trial; the same report notes that the Intelligence Community does not systematically evaluate SAT effectiveness. Stephen Coulthart’s separate evidence-based review of twelve core SATs (2017) reports positive but qualified signals and explicitly flags how thin the rigorous base is. None of this lifts the move above P, and treating “SATs broadened the outcome set” as “KAC makes you right more often” would be exactly the transferred-evidence laundering this library exists to prevent.

Excluded figure (required). Pherson’s writing repeats that “about one in four assumptions collapses upon careful examination.” That is a practitioner observation stated on Pherson’s own materials, not a controlled measurement with a traceable study behind it; under this library’s evidence rule it is recorded as an attributed claim, not counted as evidence, and it does not influence the grade. Any unattributed “Assumption Mapping improves success by N%” framing is likewise excluded - no such sourced quantity exists in this literature.

Transfer caveat (required). All of the adjacent evidence comes from human analysts and human teams in intelligence and business settings. None of it studies a Key Assumptions Check or an Assumption Map produced by or with an AI agent. The evidence is transferred from human contexts and is not validated for AI-augmented use; the conservative governing grade is therefore P.

Why it is / is not a skill here

Verdict: Reject (excluded), folded in substance into the shipped what-would-have-to-be-true. This overturns an earlier, short-lived [next] proposal to build it as a standalone skill; the concrete reason for the overturn is overlap, not lack of merit.

The Build burden is to name one distinct, durable cognitive move that no shipped skill already produces, with overlap against shipped skills staying under the roughly 20% ceiling. KAC fails that burden because its move - inventory the implicit assumptions, mark which are load-bearing, rate confidence, and surface the riskiest for testing - is the same assumption-ledger that what-would-have-to-be-true (WWHTBT) already produces. WWHTBT’s own instructions enumerate the conditions a choice rests on, mark which are load-bearing (the choice fails if false), rate current confidence high/medium/low, say how each could be tested, and name the “killer conditions” that are both most load-bearing and least certain. That is KAC’s surface-and-rate step and Assumption Mapping’s Importance by Evidence prioritization, step for step. The overlap is far above that ceiling; the registry records the relationship directly, noting that WWHTBT “absorbs Key Assumptions Check,” because both reduce to its killer/testable conditions.

The two brands’ only distinguishing assets do not survive inspection as separable mechanisms. The intelligence version’s three-way tag (supported / supported-with-caveats / unsupported) is a relabeling of WWHTBT’s confidence rating. The lean version’s Importance by Evidence 2x2 is the importance-by-certainty plot that Mason and Mitroff formalized in 1981 - which the registry already identifies as the canonical prior-art structure for WWHTBT’s prioritization step, the prose form of “rank conditions by load-bearing times uncertain, test the killer few.” A preset rating scheme or a 2x2 rendering of an existing skill’s prioritization is a configuration of that skill, not a new move.
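
The "configuration, not a new move" argument can be made concrete with a small Python sketch in which one record carries the load-bearing flag and the confidence rating, and both brand renderings are derived from it. Everything here is hypothetical: the text does not show WWHTBT's actual data format, and the mapping from high/medium/low to the three intelligence tags is an assumed correspondence used only for illustration.

```python
from dataclasses import dataclass

@dataclass
class Condition:
    text: str
    load_bearing: bool  # the choice fails if this condition is false
    confidence: str     # "high", "medium", or "low"

# Assumed correspondence between the two vocabularies (not from any source).
KAC_TAG = {
    "high": "supported",
    "medium": "supported-with-caveats",
    "low": "unsupported",
}

def kac_tag(c: Condition) -> str:
    """Intelligence-style rendering: a relabeled confidence rating."""
    return KAC_TAG[c.confidence]

def grid_cell(c: Condition) -> tuple:
    """Lean-style rendering: the same two fields read as 2x2 coordinates."""
    importance = "high importance" if c.load_bearing else "low importance"
    evidence = "little evidence" if c.confidence == "low" else "some evidence"
    return (importance, evidence)

def is_killer(c: Condition) -> bool:
    """Killer condition: load-bearing and least certain."""
    return c.load_bearing and c.confidence == "low"

# Hypothetical condition behind a launch decision.
c = Condition("The price point holds", load_bearing=True, confidence="low")
print(kac_tag(c), grid_cell(c), is_killer(c))
```

Neither rendering needs a field the record does not already have, which is the sense in which the text treats both as presets of an existing skill.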

Why reject rather than ship a thin variant. WWHTBT carries one extra discipline that makes it the better home for this move: it forces the framing into the conditional (“what would have to be true,” not “what we assume is true”), which depersonalizes a hardened disagreement and separates would-have-to-be-true from is-true. A bare “list your assumptions” exercise can collapse into asserting the assumptions as facts and arguing about them - the very trap WWHTBT’s conditional framing avoids. So the library ships the stronger framing once and documents the famous brands here rather than shipping a second, weaker assumption-ledger under two more recognizable names. The learning value of the NO: fame under two brands (a core intelligence SAT, a staple lean-startup canvas) does not earn a standalone skill when the underlying move is already shipped, better-framed, as what-would-have-to-be-true. KAC and Assumption Mapping are where the reader meets that move under its famous names; the capability they want lives in WWHTBT.

Lineage and who to read

The move has two independent modern lineages and an older common ancestor.

The intelligence-analysis line runs through Richards J. Heuer Jr. (1927-2018), the CIA analyst whose Psychology of Intelligence Analysis (1999) argued that analytic error is largely mindset error, and Randolph H. Pherson, who together codified the Key Assumptions Check as one of the core Structured Analytic Techniques in Structured Analytic Techniques for Intelligence Analysis (CQ Press, 2010; 2nd ed. 2014). For the honest empirical record, read the RAND pilot evaluation by Stephen Artner, Richard S. Girven, and James B. Bruce (RR1408, 2016) and Stephen Coulthart’s 2017 review of twelve SATs - neither shows accuracy gains, and both point to how little rigorous effectiveness evaluation exists.

The lean-startup / design line runs through David J. Bland and Alexander Osterwalder, Testing Business Ideas: A Field Guide for Rapid Experimentation (Wiley, 2019), which packages Assumptions Mapping as a team exercise plotting desirability/feasibility/viability hypotheses on an Importance by Evidence 2x2; Strategyzer’s materials generate the list with the question “What are all the things that need to be true for this idea to work?”

Both lines descend from Richard O. Mason and Ian I. Mitroff, Challenging Strategic Planning Assumptions: Theory, Cases, and Techniques (Wiley, 1981) - Strategic Assumption Surfacing and Testing (SAST), with Jim Emshoff - which formalized surfacing a plan’s assumptions and rating them on importance and certainty. “Key Assumptions Check,” “Assumption Mapping,” and “Assumptions Map” are generic descriptive method names in common use - no trademark, attribution required only to the originators - so this entry is documented descriptively and is not flagged as branded.

Named sources

Richards J. Heuer Jr. and Randolph H. Pherson, Structured Analytic Techniques for Intelligence Analysis (CQ Press, 2010; 2nd ed. 2014). Codifies the Key Assumptions Check as a core SAT: list working assumptions, then rate each supported / supported-with-caveats / unsupported, converting the unsupported ones into key uncertainties. Foundational / practitioner. (P)
Stephen Artner, Richard S. Girven, and James B. Bruce, Assessing the Value of Structured Analytic Techniques in the U.S. Intelligence Community (RAND RR1408, 2016). Pilot study: SAT-using publications addressed a broader range of outcomes and implications; the IC does not systematically evaluate SAT effectiveness. A coverage/breadth finding, not an accuracy finding; small pilot. (P, pilot)
Stephen J. Coulthart, “An Evidence-Based Evaluation of 12 Core Structured Analytic Techniques,” International Journal of Intelligence and CounterIntelligence (2017). Reviews the evidence for the core SATs; positive but qualified, and explicit that the rigorous base is thin. (P, review)
David J. Bland and Alexander Osterwalder, Testing Business Ideas: A Field Guide for Rapid Experimentation (Wiley, 2019). Packages Assumptions Mapping with the Importance by Evidence 2x2 and the framing question “What are all the things that need to be true for this idea to work?” Practitioner / foundational for the lean lineage. (P)
Richard O. Mason and Ian I. Mitroff, Challenging Strategic Planning Assumptions: Theory, Cases, and Techniques (Wiley, 1981). Strategic Assumption Surfacing and Testing (SAST); the canonical importance-by-certainty prior art for the prioritization step both brands and WWHTBT share. Foundational / practitioner. (P)

Excluded under the evidence rule: Pherson’s repeated “about one in four assumptions collapses upon careful examination” is an attributed practitioner observation with no traceable controlled source and is not counted toward the grade; any unattributed “Assumption Mapping improves outcomes by N%” framing is likewise excluded. The only sourced signals in this literature are the SAT coverage-and-breadth findings from the RAND pilot (RR1408), which measure breadth, not accuracy, alongside the positive but qualified signals in Coulthart’s 2017 review.

Thinking Framework Skills v0.8.0 · 56 frameworks