Audit your reasoning

For anyone who has to trust a conclusion before acting on it: a recommendation, a contested call, a confident piece of model output. Bring one real argument or claim. By the end you will have separated what is known from what is inferred, exposed the leaps, laid the logic bare, and pressure-tested it. Each page is self-contained, so you can follow this by hand or hand the steps to an agent.

The path

Evidence vs Inference Sort - Start here because fluent text blends what is observed with what is deduced, and models present inference in the same confident register as fact. Sort each claim into evidence, inference, or assumption, record the basis, rate each inference’s confidence, and flag anything uncited. The artifact is an evidence/inference ledger ending in a short list of load-bearing unknowns. Note the boundary: this sorts claims by type; it does not verify that the evidence is true. That ledger tells you which inferences are worth climbing into next.
Ladder of Inference Check - Take a conclusion that the ledger marked as inference and feels too certain, and slow the climb back down. Reconstruct the rungs from the observable data, to the data actually selected, to the meaning and assumptions added, then flag the single riskiest leap and test one credible alternative reading of the same data. The artifact is an annotated reasoning trace. Where the sort labeled claims, this exposes how the selected ones became a conclusion, and surfaces what an alternative interpretation would imply.
Argument Mapping - Now lay the whole case out as a structure. Map the contention, the reasons that support it, the unstated co-premises each reason silently needs, and the objections and rebuttals, then flag the weak links and load-bearing premises that are unsupported. The artifact is an argument map. The prior steps fed it the assumptions and risky leaps to make explicit; the map shows whether the structure holds. Boundary: a valid structure does not make the premises true.
Red Team Light - With the structure visible, attack it. Build the strongest objections an intelligent adversary would raise (steelman, not strawman), rank them by force, and judge which are decisive and which are survivable. The artifact is an adversarial critique aimed at the weak links the map exposed. Honest limit: this is constructed, role-played dissent, so when the stakes are high it also flags whether a real dissenting view should be sought rather than relying on the model’s alone.
Authentic Dissent - Use this instead of (or after) the red team when the stakes are high and other people are involved. A model cannot be the dissent, because role-played devil’s advocacy gets discounted as performance and does not deliver the reasoning benefit that genuine minority dissent does. So this does not argue against the plan: it audits whether real dissent exists, labels any constructed dissent as constructed, and plans how to elicit and protect a genuine dissenter. The artifact is a dissent audit and plan.
Natural-Frequency Bayesian Framing - Reach for this step whenever a claim turns on a conditional probability or a base rate: a test result, a screening signal, a “given a positive, what is the real chance” question. Re-express the same facts as natural frequencies over a concrete population to compute the correct posterior and expose base-rate neglect. The artifact is a natural-frequency breakdown. It requires real input rates: the format makes correct reasoning tractable but does not invent the numbers.
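
To make the natural-frequency step concrete, here is a minimal sketch in Python. The function name, the dictionary keys, and every input rate below (a 1% base rate, 90% sensitivity, 9% false-positive rate, population of 10,000) are hypothetical, chosen only so the counts come out as whole numbers. They do not describe any real test; a real breakdown needs real input rates.

```python
from fractions import Fraction


def natural_frequency_breakdown(population, base_rate, sensitivity, false_positive_rate):
    """Re-express conditional probabilities as counts over a concrete population.

    All rates are Fractions so the arithmetic stays exact.
    Returns the counts plus the posterior P(condition | positive).
    """
    affected = population * base_rate                # people who have the condition
    unaffected = population - affected               # people who do not
    true_positives = affected * sensitivity          # affected and test positive
    false_positives = unaffected * false_positive_rate  # unaffected but test positive
    all_positives = true_positives + false_positives
    if all_positives == 0:
        raise ValueError("no positives in this population; posterior is undefined")
    return {
        "affected": affected,
        "unaffected": unaffected,
        "true_positives": true_positives,
        "false_positives": false_positives,
        "posterior": true_positives / all_positives,
    }


# Hypothetical inputs, for illustration only.
result = natural_frequency_breakdown(
    population=10_000,
    base_rate=Fraction(1, 100),
    sensitivity=Fraction(90, 100),
    false_positive_rate=Fraction(9, 100),
)

print(f"Of 10000 people, {result['affected']} have the condition; "
      f"{result['true_positives']} of them test positive.")
print(f"Of the {result['unaffected']} without it, "
      f"{result['false_positives']} also test positive.")
print(f"So a positive result means the condition is present in "
      f"{result['true_positives']} of "
      f"{result['true_positives'] + result['false_positives']} cases "
      f"(about {float(result['posterior']):.1%}).")
```

With these made-up inputs the positives are 90 true against 891 false, so a positive result indicates the condition in roughly 9% of cases, far below the 90% sensitivity that base-rate neglect tends to substitute for the answer. If the counts come out fractional, choosing a larger population keeps the breakdown in whole people.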

What you will be able to do

You will be able to take a confident conclusion apart and put it back together honestly: the known separated from the inferred, the risky leaps named, the logical structure laid bare with its weak links flagged, and the strongest case against it on the table. You will also know the difference between the model role-playing an objection and a real human holding a contrary view, and when a number in the argument needs the base rate dragged back into view. None of these moves proves a conclusion true. What they buy you is knowing exactly where a conclusion is load-bearing and unsupported, before you bet on it.

Thinking Framework Skills v0.3.0 · 38 frameworks