The evidence behind thinking tools
For the skeptic who assumes “thinking tools” means recycled consultant decks and laundered statistics. The bet of this library is the opposite: every empirical claim traces to a graded source, the strong-evidence core is named, and where the evidence is thinner than people assume, the page says so. This path is for tracing the grading, not for running a method. Bring suspicion; follow each link to its grounding.
The path
- Start at the grading model. Read The evidence model for the seven tiers (S/M/P/V/A/C/X) and the rule that a “P, useful anyway, here is when not to use it” beats a dressed-up “S”. The tier grades the framework’s mechanism, not a vendor’s confidence.
- Go to the aggregated bibliography. The bibliography collects the graded sources across the library so any claim can be walked back to its primary source. No effect size appears unless a source records it.
- Inspect the strong-evidence core. The S-tier skills are the anchor: replicated experimental or meta-analytic support, not practitioner lore. Each one carries its own dossier with sources and limits: Brainwriting, Far-Analogy Ideation, Stocks and Flows Reasoning, Authentic Dissent, Argument Mapping, Natural-Frequency Bayesian Framing, Linear-Model Aggregation, Reference Class Forecasting, WOOP (Mental Contrasting with Implementation Intentions), and After Action Review.
- Read the honest part. A strong tier does not mean an unlimited claim. The caveats below come straight from those dossiers.
The honest part
The interesting thing about an S grade is what it does not cover. These limits are pulled from the dossiers, not invented.
- Argument Mapping: the large measured gains in critical-thinking skill are for sustained practice, typically a course-length program building the skill. A single one-shot map does not carry the course-length effect, and a tidy map shows valid structure, not true premises.
- Authentic Dissent: genuine minority dissent improves a group’s reasoning, but role-played or assigned devil’s advocacy does not replicate it. An AI cannot supply the authentic dissent the evidence is about; the skill engineers the conditions for real dissent rather than pretending to be the dissenter.
- WOOP: the strong evidence is for closing the intention-action gap, and it is conditional. Positive fantasizing about the outcome alone, skipping the obstacle and the if-then plan, measurably reduces follow-through. Skip the hard steps and you invert the effect.
- Stocks and Flows Reasoning: the strong result is that a specific accumulation error is real and widespread, and that making the stock-flow structure explicit removes it on a given problem. It is not a claim that the skill teaches general systems thinking, where the pedagogy evidence is mixed.
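
The accumulation point in the Stocks and Flows bullet can be made concrete with a few lines of arithmetic. This is a minimal sketch, not taken from the dossier: the function name `simulate_stock` and the numbers are illustrative assumptions. It also assumes that the error in question is the common one of expecting a stock to follow the shape of its inflow. Writing the structure out explicitly (stock changes by inflow minus outflow each step) shows the stock still rising while the inflow is already falling.

```python
def simulate_stock(initial, inflows, outflows):
    """Return the stock level after each step, where stock += inflow - outflow."""
    levels = []
    stock = initial
    for inflow, outflow in zip(inflows, outflows):
        stock += inflow - outflow  # the stock accumulates the net flow
        levels.append(stock)
    return levels


# Hypothetical numbers: inflow declines from the first step, outflow is constant.
inflows = [10, 8, 6, 4, 2, 0]
outflows = [4, 4, 4, 4, 4, 4]
levels = simulate_stock(100, inflows, outflows)

# The inflow peaks at step 0, but the stock peaks later,
# once the inflow has dropped to the level of the outflow.
peak_step = levels.index(max(levels))
print(levels)     # [106, 110, 112, 112, 110, 106]
print(peak_step)  # 2
```

The stock tops out where inflow and outflow cross, not where inflow is highest. The sketch illustrates one problem only; it says nothing about whether working through it builds general systems thinking.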
A related caveat lives outside the S-tier and is worth naming because it is the most common overclaim in this whole field: the premortem reliably surfaces more risks, and more specific ones, which is well supported, but it is not proven to improve final outcomes. The library grades the mechanism that is demonstrated and refuses the one that is not.