Counterfactual reasoning

Status: Folded · Evidence: P · Family: Assumption and belief challenge · Verdict: fold (2026-06-09)

Use instead: After Action Review

What it is

Counterfactual reasoning is thinking about alternatives to what is or was the case: “if X had been different, then Y.” The candidate’s gloss - “examine what if X had been different” - is the everyday retrospective sense: you take an event that already happened and mentally re-run it with one antecedent changed, reading off what the changed antecedent would have produced. Kahneman and Tversky named the underlying cognitive operation the simulation heuristic (we judge how an event could have turned out by how easily we can mentally undo it), and Neal Roese’s functional theory organized what these “might-have-been” thoughts do.

The honest description has to separate the durable move from the word, because “counterfactual reasoning” names at least three different operations the catalog treats as different jobs:

Retrospective learning counterfactuals (the dominant, candidate sense): having seen an outcome, ask “what would I have had to do differently to get a better outcome?” - an upward counterfactual - and convert the answer into a lesson for next time. The product is a diagnosis-plus-change: the same object a structured retro emits.
Prospective / prefactual counterfactuals (“what might be if I did X?”): imagine a future action and its alternative outcomes before acting. Pointed at failure (“imagine it failed, what differed?”) this is the premortem move; pointed at the intention-action gap (contrast the wished outcome against the obstacle) it is mental contrasting.
Formal causal counterfactuals (Judea Pearl’s sense): given a structural causal model, compute “what would Y have been had X been x” by abduction (infer the hidden noise from what was observed), action (surgically set X = x), prediction (recompute Y). This is a precise inference algorithm over a model, not a lightweight stance, and it is a different animal from the psychological move the candidate names.
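The three steps of the formal sense can be made concrete with a minimal sketch. Everything in it is an illustrative assumption rather than anything taken from Pearl's text or from the catalog: the class name LinearSCM, the single linear equation Y = slope * X + U_y, and the numbers are all hypothetical. It shows only that the formal counterfactual is a computation over a specified model, which is what separates it from the psychological move.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearSCM:
    """Hypothetical toy model: Y = slope * X + U_y, where U_y is unobserved noise."""

    slope: float

    def outcome(self, x: float, u_y: float) -> float:
        # Structural equation for Y.
        return self.slope * x + u_y

    def abduct(self, x_obs: float, y_obs: float) -> float:
        # Step 1, abduction: infer the hidden noise from what was observed.
        return y_obs - self.slope * x_obs

    def counterfactual(self, x_obs: float, y_obs: float, x_new: float) -> float:
        u_y = self.abduct(x_obs, y_obs)
        # Step 2, action: replace the observed X with x_new.
        # Step 3, prediction: recompute Y with the same inferred noise.
        return self.outcome(x_new, u_y)


model = LinearSCM(slope=2.0)
# Observed X = 1 and Y = 5; ask what Y would have been had X been 0.
y_cf = model.counterfactual(x_obs=1.0, y_obs=5.0, x_new=0.0)
print(y_cf)  # 3.0
```

The answer depends entirely on the assumed structural equation: with a different slope or functional form, the same observation yields a different counterfactual, which is why this operation needs a specified model and cannot be run as an armchair "what if."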

These share only the abstract instruction “vary an antecedent and read off the consequence.” Each one, made concrete, lands on a different artifact - a retro’s sustain/change list, a risk register, a commitment card, a recomputed causal estimate - and, as the verdict section argues, each artifact is already produced by an existing method or is out of the library’s lightweight scope. That split is the central fact about counterfactual reasoning: it is a cognitive stance, broadly useful and deeply studied, that does not itself emit one distinct deliverable.

When it helps / when it misleads

As a stance, counterfactual reasoning helps after a controllable outcome you want to learn from: an upward, self-focused, controllable counterfactual (“if we had load-tested, the launch would not have fallen over”) names a lever you can actually pull next time, which is exactly the input a retro needs. It also helps as a debiasing prompt - forcing yourself to imagine how a confident judgment could have been wrong corrects for overconfidence - and as the imaginative engine behind a premortem.

It misleads or wastes effort when:

It is treated as a method rather than a stance, so it produces a feeling instead of a change. Spontaneous counterfactuals overwhelmingly fixate on uncontrollable antecedents (“if only the market had not turned”) and on self-blame, neither of which yields an action. Left unstructured, the exercise generates regret, not a lesson; the value only appears when an external structure forces the counterfactual to be controllable and converts it into an owned change - which is what a retro supplies.
It tips into rumination. Upward counterfactual thinking is reliably associated with depressive symptoms (a pooled meta-analytic correlation of r = .26, modest but not trivial): the same “if only I had” move that can prepare you can also become repetitive, self-critical brooding. This is a genuine downside the cheerful “always ask what you could have done better” framing hides.
The changed antecedent is not actually informative. Re-running a past decision on an antecedent nobody could have known or controlled at the time (“if only we had foreseen the pandemic”) teaches nothing and invites hindsight bias - judging the past decision by information that did not exist when it was made.
It is pointed at a problem a sharper method already owns. For “imagine it failed,” the disciplined version is premortem; for closing an intention-action gap, it is WOOP; for a real causal estimate, it is a structural causal model, not an armchair “what if.” Reaching for generic counterfactual reasoning in those cases gets you a fuzzier version of a tool the catalog already has.

What the evidence says

The honest grade for the candidate’s stated move - “examine what if X had been different,” used as a deliberate reasoning method - is P (practitioner), and the dossier has to be unusually careful here, because counterfactual thinking is one of the most heavily researched topics in cognitive and social psychology, and it is tempting to borrow that mountain of evidence for a move the mountain did not test.

What the record robustly supports - but about a different claim. There is a large, rigorous, well-replicated literature on counterfactual thinking as a spontaneous mental phenomenon: when it arises (after negative and especially controllable outcomes), which direction it takes (upward after controllable, downward after uncontrollable), and what it does to emotion and blame. Kahneman and Tversky’s simulation heuristic and Roese’s functional theory are foundational and sound. As descriptive cognitive science this is S-grade work. But it describes how counterfactuals occur to people and what they do to feeling and attribution - not whether deliberately running a counterfactual analysis, as a procedure, improves a decision or yields a reusable artifact. Grading the candidate “S” on this basis would be laundering the robustness of the descriptive science onto a prescriptive method it never measured - the exact failure this library exists to prevent.

What the record actually says about the deliberate move - thin and contested. The closest thing to an intervention is Roese (1994, Experiment 3): participants wrote task-specific counterfactuals between two blocks of anagrams, and upward-and-additive counterfactuals produced greater improvement on the second block. That is a real experimental result, but it measures a content-neutral mind-set effect on a puzzle task, not a decision artifact, and it is M-grade for that narrow effect. The generality of even this “preparative function” is directly contested: Briazu, Walsh and colleagues (2016) found people generate very few controllable counterfactuals unless explicitly prompted, and modify uncontrollable features regardless of whether they reflect on success or failure - questioning the dominant view that counterfactuals naturally serve preparation. And the downside is meta-analytic: Broomhall, Phillips and colleagues (2017) pooled 42 effect sizes (N = 13,168) and found upward counterfactual thinking positively associated with depression at r = .26. So the directly relevant evidence for the candidate’s move is one supportive lab-task induction, one study questioning its generality, and a meta-analysis documenting a real harm - which is a P, not an M: a recognized practitioner heuristic with mixed and partly cautionary direct evidence, with the S-grade descriptive science explicitly not counted toward it because it measures occurrence and affect, not the method.
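For a rough sense of scale, the reported correlation can be squared to give the share of variance the two measures have in common. This is a back-of-envelope reading added here, not a figure the meta-analysis reports, and r squared is only one imperfect gauge of effect size; the variable names are illustrative.

```python
# Pooled correlation between upward counterfactual thinking and depression,
# as cited above from Broomhall, Phillips and colleagues (2017).
r = 0.26

# Squared correlation: proportion of variance shared by the two measures.
shared_variance = r ** 2

print(f"{shared_variance:.4f}")  # 0.0676
```

On this reading the association is modest in size, and as a correlation it does not by itself say which way any influence runs.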

The formal sense does not rescue the grade. Pearl’s structural-causal-model counterfactuals are genuinely rigorous, but they are a different operation (an algorithm over a specified model, not a lightweight stance) and they are out of the catalog’s “apply this in five minutes to a decision” scope. Moreover, the one place the formal sense has been tested on AI agents cuts the other way: CounterBench (Yu and colleagues, 2025) finds large language models perform near chance on formal counterfactual-reasoning questions, and CausalProbe-2024 (Chi and colleagues, 2024) finds models largely stuck at associational (level-1) reasoning. So for an AI-agent library, the formal counterfactual is a capability the agent is bad at, not a validated method to bolt on.

Transfer caveat (required). All of the supportive psychological evidence is from human subjects in lab and field settings; none of it studies counterfactual reasoning, in any of its three senses, performed by or with an AI agent as a decision aid. The evidence is transferred from human contexts and not validated for AI-augmented use - and the only direct AI-context evidence (CounterBench, CausalProbe-2024) is about model capability and is negative.

Excluded under the evidence rule. The popular productivity framing that “counterfactual / ‘what if’ thinking improves decisions” carries no primary source measuring decision quality; the genuine preparative-function result traces to Roese (1994)’s anagram task (a content-neutral performance effect), not to any decision-outcome study, and any unsourced “improves decisions by N%” figure is excluded and does not move the grade.

Why it is / is not a skill here

Verdict: Fold into after-action-review. This overturns the catalog’s prior cand / build / P tag (“examine what if X had been different … clears the bar but lower priority”); the concrete reason follows, and it is the same structural reason the library used to fold inversion into premortem.

The Build burden is to name one distinct, durable cognitive move that no shipped skill produces, and to show no existing skill (or chain of skills) already produces it. Counterfactual reasoning fails that burden because, exactly like inversion, it is a stance, not a procedure with its own artifact - and each concrete thing the stance produces is already owned:

The dominant retrospective-learning reading is the after-action-review engine. “After an outcome, ask what you would have had to do differently to get a better result, and turn that into a change” and AAR’s “compare what was expected to what actually happened, diagnose why the gaps occurred in both directions, and convert that into what to sustain and what to change” are the same move at the same target producing the same artifact-class. AAR’s “diagnose the why of each gap” step is structured upward counterfactual reasoning applied to a finished event - it just adds the recorded-expectation baseline (so the counterfactual is anchored, not hindsight-driven), the blameless discipline (which counters the self-blame failure mode the evidence warns about), and the conversion into owned sustain/change actions (which counters the produces-a-feeling-not-an-action failure mode). This is well above the ~20% overlap ceiling: deliberate retrospective counterfactual reasoning is after-action-review minus the structure that makes it safe and actionable. The closest shipped skill is therefore after-action-review, and it subsumes the candidate’s headline use. (after-action-review is status: shipped, so the fold target resolves.)
The prospective readings are already owned, so they cannot rescue a standalone skill. “Imagine the future action failed and read off what differed” is premortem (which already absorbed inversion’s failure-imagination move); “contrast the wished future outcome against the obstacle and bind an if-then” is woop (mental contrasting). The catalog has already carved these prospective flavors into dedicated skills; counterfactual reasoning adds no new prospective move.
The debiasing reading is consider-the-opposite, which lives in red-team-light. “Force yourself to imagine how the judgment could have been wrong” is, mechanically, constructing the strongest counter-case - the core of red-team-light (built on steelmanning). The inversion dossier already located this same operation there. Not a separable new skill.
The formal causal reading is out of scope and, on agents, unreliable. Pearl’s SCM counterfactual is an algorithm over a specified model, not a lightweight thinking move a skill applies to a prose decision; and the AI-context evidence shows agents perform near chance on it. It is neither a fold target nor a buildable lightweight skill here.

So there is no separable artifact that is uniquely “counterfactual reasoning.” Splitting it shows that every instantiation duplicates a shipped move, with the dominant (retrospective-learning) instantiation duplicating after-action-review most directly. That is a fold, not a build. Fold it into after-action-review as the canonical retrospective home for the counterfactual stance, and let the dossier record that the prospective flavor lives in premortem / woop and the debiasing flavor in red-team-light.

Why fold rather than recipe or reject: it is not a clean fixed chain (it is one stance that maps onto a different existing move depending on whether it points backward, forward, or at a judgment, not a sequence like first-principles), so it is not a recipe. And reject would be less informative than fold - the move is real, deeply studied, and worth locating, so the honest service is to point the reader to where it already lives, exactly as the library did when it folded inversion into premortem and steelmanning into red-team-light. The learning value of the NO: a famous, genuinely researched cognitive phenomenon is not automatically a skill. Counterfactual reasoning is a way the mind re-runs reality; a library that ships artifacts, not stances, documents it and folds it rather than shipping a fuzzier, riskier after-action-review under a more famous name.

Lineage and who to read

The cognitive operation behind counterfactual reasoning was named by Daniel Kahneman and Amos Tversky as the simulation heuristic (lecture 1979; published as a chapter in Judgment under Uncertainty: Heuristics and Biases, 1982), with their norm theory (Kahneman and Dale Miller, 1986) explaining why some alternatives come to mind more easily and drive regret. The modern functional account - what these “might-have-been” thoughts are for - is Neal Roese’s: read his 1994 Journal of Personality and Social Psychology paper “The Functional Basis of Counterfactual Thinking” and the Kai Epstude and Neal Roese 2008 review “The Functional Theory of Counterfactual Thinking” (Personality and Social Psychology Review). For the honest other side - that the preparative function is weaker and more conditional than the slogan - read Raluca Briazu, Clare Walsh and colleagues, “Undoing the Past: Questioning the Preparatory Function of Counterfactual Thinking” (Memory & Cognition, 2016). For the cautionary downside, read Tom Broomhall, Wendy Phillips and colleagues, “Upward Counterfactual Thinking and Depression: A Meta-analysis” (New Ideas in Psychology, 2017). For the entirely separate formal sense - counterfactuals as computations over a structural causal model - read Judea Pearl, Causality (2009) and The Book of Why (Pearl and Dana Mackenzie, 2018). For where the formal move stands when an AI agent attempts it, read the CounterBench (2025) and CausalProbe-2024 evaluations. “Counterfactual reasoning” is a generic descriptive term in common scholarly use - no trademark, no attribution required beyond crediting Kahneman, Tversky, Roese, and (for the formal sense) Pearl - so this entry is documented descriptively and is not flagged as branded.

Named sources

Daniel Kahneman and Amos Tversky, “The Simulation Heuristic,” in Judgment under Uncertainty: Heuristics and Biases (eds. Kahneman, Slovic, Tversky), 1982, pp. 201-208. Foundational: introduced counterfactual mental simulation (judging an outcome by how easily it can be undone) and its link to regret. Robust descriptive cognitive science of the phenomenon - NOT a test of counterfactual reasoning as a deliberate decision method. (S as descriptive science; does not transfer to the candidate’s prescriptive move)
Neal J. Roese, “The Functional Basis of Counterfactual Thinking,” Journal of Personality and Social Psychology 66(5) (1994): 805-818. Experimental: across three studies, upward-and-additive counterfactuals (vs downward-and-subtractive) engendered greater improvement on a subsequent anagram task (Experiment 3 used a between-blocks counterfactual induction). The nearest direct evidence for a “preparative” benefit, but it measures a content-neutral mind-set effect on a puzzle, not decision quality or a reusable artifact. (M, for that narrow effect)
Kai Epstude and Neal J. Roese, “The Functional Theory of Counterfactual Thinking,” Personality and Social Psychology Review 12(2) (2008): 168-192. Canonical review framing counterfactuals as behavior regulation via content-specific and content-neutral pathways. Theory synthesis of the human phenomenon. (M/P)
Raluca A. Briazu, Clare R. Walsh, Catherine Deeprose and Giorgio Ganis, “Undoing the Past: Questioning the Preparatory Function of Counterfactual Thinking,” Memory & Cognition 44(7) (2016): 1098-1110. Experimental: people generate few controllable counterfactuals unless explicitly prompted and modify uncontrollable features regardless of reflecting on success or failure - directly questioning the generality of the preparative function. Counter-evidence to the deliberate-method case. (M)
Tom J. Broomhall, Wendy J. Phillips, Donald W. Hine and Natasha M. Loi, “Upward Counterfactual Thinking and Depression: A Meta-analysis,” New Ideas in Psychology 46 (2017): 12-23. Meta-analysis: 42 effect sizes, pooled N = 13,168, upward counterfactual thinking associated with depression at r = .26. The maladaptive-rumination downside - load-bearing for the “when it misleads” wall. (M, for the harm association)
Judea Pearl, Causality: Models, Reasoning, and Inference (2nd ed., Cambridge University Press, 2009); and Judea Pearl and Dana Mackenzie, The Book of Why (2018). The formal counterfactual: abduction-action-prediction over a structural causal model, level 3 of the causal hierarchy. Rigorous but a different operation (algorithm over a model) and out of the catalog’s lightweight scope. (foundational, formal; not the candidate’s psychological move)
Yujia Zheng / Wei Yu et al., “CounterBench: A Benchmark for Counterfactual Reasoning in Large Language Models,” arXiv:2502.11008 (2025); and CausalProbe-2024 (Chi et al., NeurIPS 2024). Direct AI-context evidence on the formal sense: LLMs perform near chance on counterfactual questions and remain largely at associational reasoning. The only agent-context data point, and it is negative on capability. (the agent-context check, not support for the method)

Excluded under the evidence rule: the popular “‘what if’ / counterfactual thinking improves decisions by N%” framing has no primary source measuring decision quality; the genuine preparative-function result traces to Roese (1994)’s anagram-performance task, not to any decision-outcome study, and is not counted toward this entry as evidence of better decisions.

Thinking Framework Skills v0.8.0 · 56 frameworks