The evidence behind thinking tools

Written for the skeptic who assumes “thinking tools” means recycled consultant decks and laundered statistics. The library bets on the opposite: every empirical claim traces to a graded source, the strong-evidence core is named, and where the evidence is thinner than people assume, the page says so. Use this path to trace the grading, not to run a method. Bring suspicion; follow each link to its grounding.

The path

Start at the grading model. Read The evidence model for the seven tiers (S/M/P/V/A/C/X) and the rule that a “P, useful anyway, here is when not to use it” beats a dressed-up “S”. The tier grades the framework’s mechanism, not a vendor’s confidence.
Go to the aggregated bibliography. The bibliography collects the graded sources across the library so any claim can be walked back to its primary source. No effect size appears unless a source records it.
Inspect the strong-evidence core. The S-tier skills are the anchor: replicated experimental or meta-analytic support, not practitioner lore. Each one carries its own dossier with sources and limits: Brainwriting, Far-Analogy Ideation, Stocks and Flows Reasoning, Authentic Dissent, Argument Mapping, Natural-Frequency Bayesian Framing, Linear-Model Aggregation, Reference Class Forecasting, WOOP (Mental Contrasting with Implementation Intentions), and After Action Review.
Read the honest part. A strong tier does not mean an unlimited claim. The caveats below come straight from those dossiers.

The honest part

The interesting thing about an S grade is what it does not cover. These limits are pulled from the dossiers, not invented.

Argument Mapping: the large measured gains in critical-thinking skill come from sustained practice, typically a course-length program that builds the skill. A single map does not carry the course-length effect, and a tidy map shows valid structure, not true premises.
Authentic Dissent: genuine minority dissent improves a group’s reasoning, but role-played or assigned devil’s advocacy does not replicate it. An AI cannot supply the authentic dissent the evidence is about; the skill engineers the conditions for real dissent rather than pretending to be the dissenter.
WOOP: the strong evidence is for closing the intention-action gap, and it is conditional. Positive fantasizing about the outcome alone, skipping the obstacle and the if-then plan, measurably reduces follow-through. Skip the hard steps and you invert the effect.
Stocks and Flows Reasoning: the strong result is that a specific accumulation error is real and widespread, and that making the stock-flow structure explicit removes it on a given problem. It is not a claim that the skill teaches general systems thinking, where the pedagogy evidence is mixed.
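
To make the accumulation error concrete, here is a minimal sketch in Python. The flow numbers are hypothetical, chosen only for illustration, and do not come from any dossier. The sketch writes the stock-flow structure out explicitly: the stock changes by inflow minus outflow at each step, so it keeps rising for as long as inflow exceeds outflow, even after inflow has peaked.

```python
from itertools import accumulate

# Hypothetical flows per time step (illustrative numbers, not from any source).
inflow = [4, 8, 12, 8, 6, 3, 2]
outflow = [5, 5, 5, 5, 5, 5, 5]
initial_stock = 10

# Explicit stock-flow structure: the stock changes by (inflow - outflow) each step.
net_flow = [i - o for i, o in zip(inflow, outflow)]

# Running total of net flow on top of the initial stock;
# drop the first element so stock[k] is the level after step k.
stock = list(accumulate(net_flow, initial=initial_stock))[1:]

peak_inflow_step = inflow.index(max(inflow))
peak_stock_step = stock.index(max(stock))

print("net flow:", net_flow)
print("stock:   ", stock)
print("inflow peaks at step", peak_inflow_step)
print("stock peaks at step ", peak_stock_step)
```

With these numbers the inflow peaks two steps before the stock does. Guessing that the stock peaks when the inflow peaks is the kind of mistake that laying out the structure is meant to remove on a given problem.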

A related caveat lives outside the S-tier and is worth naming because it is the most common overclaim in this whole field. The premortem reliably surfaces more risks, and more specific ones; that much is well supported. It has not been shown to improve final outcomes. The library grades the mechanism that is demonstrated and refuses the one that is not.

Thinking Framework Skills v0.3.0 · 38 frameworks