FMEA-lite

Status: Folded · Evidence: P · Family: Risk, failure, and resilience · Verdict: fold (2026-06-09)

Use instead: Premortem

What it is

Failure Mode and Effects Analysis (FMEA) takes a plan, product, or process, enumerates the ways each part can fail (the failure modes), and scores every mode on three bounded scales: how bad the consequence is if it happens (Severity), how likely it is to happen (Occurrence), and how likely current controls are to miss it before it reaches the user (Detection; by convention a higher score means the failure is harder to catch). In the classic form the three scores are multiplied into a Risk Priority Number (RPN = S x O x D), and the team works the highest-RPN modes first, adding mitigations or detection controls and then re-scoring. “FMEA-lite” is the stripped, meeting-sized version: skip the formal scoring sheets and standards machinery, list the failure modes, rate each roughly on the three axes, and triage.
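
The classic scoring loop can be sketched in a few lines. This is a minimal illustration, not a reference implementation: it assumes 1-to-10 scales on each axis with a higher Detection score meaning the failure is harder to catch, and the failure modes and their scores are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    severity: int    # assumed scale: 1 (negligible) .. 10 (catastrophic)
    occurrence: int  # assumed scale: 1 (rare) .. 10 (almost certain)
    detection: int   # assumed scale: 1 (sure to be caught) .. 10 (undetectable)

    @property
    def rpn(self) -> int:
        # Classic Risk Priority Number: S x O x D
        return self.severity * self.occurrence * self.detection


# Hypothetical failure modes for a launch; the scores are illustrative guesses.
modes = [
    FailureMode("payment webhook silently dropped", severity=8, occurrence=3, detection=9),
    FailureMode("signup page slow under load", severity=4, occurrence=6, detection=2),
    FailureMode("welcome email has broken link", severity=3, occurrence=5, detection=4),
]

# Work the highest-RPN modes first.
ranked = sorted(modes, key=lambda m: m.rpn, reverse=True)
for m in ranked:
    print(f"{m.rpn:4d}  {m.name}")
```

In this made-up sheet the silent webhook failure tops the list mainly because of its Detection score, which is the behavior the Detection axis is meant to produce.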

The durable move is “enumerate the discrete ways this could fail, then prioritize the failure modes by a combination of impact and likelihood, plus an explicit read on whether you would even notice.” Two of those three axes - impact and likelihood - are the standard risk-ranking pair. The one axis that is FMEA’s own signature is Detection: it forces the question “if this failure occurred, would our existing controls surface it in time, or is it a blind spot?” and pushes hard-to-detect failures up the priority list even when they seem unlikely.

When it helps / when it misleads

It helps when you have a concrete artifact with separable parts - a process with steps, a system with components, a launch with moving pieces - and you want a component-by-component sweep of how each piece fails, rather than a single narrative of how the whole thing fails. The Detection axis is its real contribution: it is genuinely useful to ask, mode by mode, “would we even catch this?”, because the failures that are both consequential and silent are the ones that hurt most, and ordinary risk lists tend to under-weight them.

It misleads or wastes effort when:

The RPN multiplication is taken literally. Severity, Occurrence, and Detection are ordinal ratings (a 6 is not “twice” a 3), and multiplying ordinal scales is mathematically invalid: Shebl, Franklin and Barber (2012) showed most of the possible RPN range is unreachable and that wildly different risk profiles collapse to the same number. The standards body itself abandoned RPN: the AIAG-VDA FMEA handbook (2019) replaced the multiply-and-threshold rule with an Action Priority lookup table precisely because a high-severity failure could hide behind a low RPN. Treating the product of three guesses as a precise priority is false precision.
The scoring is treated as the analysis. The numbers are subjective expert guesses; FMEA team severity estimates correlated only weakly (rs = 0.42) with real incident-database severity in the study above, and teams missed the failures that actually occurred most often. The value is the enumeration and the detection question, not the arithmetic.
The system is interactive rather than componentwise. FMEA decomposes a thing into independent failure modes and scores them one at a time; it does not represent failures that cause or compound each other. Where failures cascade or loop, a feedback model is the right tool, not a mode-by-mode sheet.
It is run as ceremony. A populated FMEA sheet that nobody converts into a mitigation or a new detection control is theatre, the same failure that besets any risk tool.
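
The arithmetic objections to RPN in the first item above can be checked directly. The sketch below assumes the common 1-to-10 scale on each axis (an assumption; scales vary between standards) and enumerates every possible product. The example profiles are invented for illustration.

```python
from itertools import product

SCALE = range(1, 11)  # assumed 1..10 ordinal scale on each axis


def rpn(s: int, o: int, d: int) -> int:
    return s * o * d


# Every RPN value that some (S, O, D) triple can actually produce.
reachable = {rpn(s, o, d) for s, o, d in product(SCALE, repeat=3)}
print(f"{len(reachable)} of the nominal 1..1000 values are reachable")

# Very different risk profiles collapse to the same number.
severe_rare = (10, 2, 3)      # (S, O, D): catastrophic but unlikely
minor_frequent = (2, 10, 3)   # trivial but near-certain
print(rpn(*severe_rare), rpn(*minor_frequent))

# A high-severity mode can rank below a moderate one on RPN alone.
severe_quiet = (10, 2, 2)
moderate_noisy = (4, 5, 5)
print(rpn(*severe_quiet), "<", rpn(*moderate_noisy))
```

Under these assumptions only 120 distinct values are reachable, and the maximum-severity mode in the last pair scores 40 against 100 for the moderate one.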

What the evidence says

The honest grade is P (practitioner), held there deliberately and not raised to M.

What the record supports. FMEA is one of the most established reliability methods in existence: codified by the US military (MIL-P-1629, 1949), carried into the NASA Apollo program in the early 1960s, and brought into the automotive industry by Ford in the mid-1970s. As a structured way to enumerate and organize failure modes, its mechanism is plausible and its adoption is enormous and durable. Healthcare systematic reviews report it is widely used and generally recommended: Asgari Dastjerdi et al. (2017) found “most of the studies recommended this technique and had considered it a useful and efficient method in reducing the number of risks,” and Anjalee, Rutter and Samaranayake (2021) found participants consistently rated it an effective group activity for surfacing errors. That is the supported claim: a useful, durable practitioner method for systematically listing and triaging failure modes.

What the record does NOT support. There is no body of controlled, comparative evidence that FMEA produces measurably better or fewer-error outcomes than an unstructured alternative. The positive healthcare reviews rest on subjective participant perception and uncontrolled before-after case studies. Both reviews note FMEA’s recognized limitations of subjectivity, poor reproducibility, and limited generalizability, and a minority of the reviewed studies showed no effect at all. The one rigorous, nameable validity study cuts against the method’s core arithmetic: Shebl, Franklin and Barber (2012) found poor criterion validity (severity correlation rs = 0.42; little relationship between perceived and reported failure frequency), found the team missed 7 of 9 of the most-reported real incidents (omitted doses), and demonstrated that the RPN calculation is mathematically flawed. Tellingly, the standards owners agreed: AIAG-VDA dropped RPN multiplication in 2019. So the specific mechanism in this candidate’s one-line description - “likelihood x severity x detection” multiplied into a priority - is the deprecated form, and the rigorous evidence does not back it.

Transfer caveat (required). All of this evidence is from human engineering and clinical teams. The recent work on AI-assisted FMEA (for example LLM-plus-RAG systems that auto-populate FMEA tables, 2024 to 2026) measures automation feasibility and throughput - can a model fill the sheet faster - not whether an AI-produced FMEA yields better decisions or fewer failures than an alternative. There is no controlled validation of this move when AI-augmented. The evidence is transferred from human contexts and not validated for AI-augmented use; the conservative governing grade is therefore P. There is no S- or M-tier research on the move’s effectiveness to borrow from - if anything the strongest study is critical - so there is no optimistic half to cap: the grade is P on its own merits, and it is bounded below M by the negative validity findings.

Excluded on the evidence rule: the general claim that FMEA “reduces errors” is reported only as practitioner perception and uncontrolled before-after data; it is not counted as a controlled effect, and no specific effect-size percentage with a traceable primary source was found to cite.

Why it is / is not a skill here

Verdict: Fold into premortem. This revises the catalog’s prior cand / build / P tag; the concrete reason follows.

The Build burden is to name a distinct, durable cognitive move that no shipped skill already produces, above the roughly 20% overlap ceiling. FMEA-lite’s move is “enumerate the ways this plan could fail and prioritize the failure modes by impact and likelihood, then attach responses.” That is, mechanically, what think-premortem already does. Premortem imagines the plan has already failed, generates the causes broadly, ranks them by likelihood and impact (High/Medium/Low each), and converts each top cause into a mitigation, a leading signal / tripwire, an owner, and a kill criterion - emitting a ranked risk register. FMEA-lite emits a ranked failure-mode register. Two of FMEA’s three scoring axes are identical to premortem’s two ranking axes (Severity = impact, Occurrence = likelihood), and the output artifact is the same shape. The shared working machinery - surface the failures, rate by severity and likelihood, keep the vital few, attach a response to each - is far above the 20% ceiling, not below it.

The only genuinely FMEA-specific ingredient is the Detection axis: scoring, per failure mode, whether current controls would catch it, and bumping the silent failures up the list. The adversarial question is whether that axis is a new move or a column on a register premortem already produces. It is a column. Premortem’s mandatory conversion step already asks, for each top risk, “what is the early sign this is happening?” (the leading signal / tripwire) and “what control reduces it?” (the mitigation). Detection is the same concern stated as a score - how poor is our current ability to notice this? - and the natural response to a hard-to-detect mode is exactly premortem’s existing output: add a tripwire / detection control. So Detection is a scoring lens and an optional column on premortem’s register, not a separable mechanism. It does not even add an orthogonal benefit premortem lacks; it sharpens a question premortem already asks. And as a numeric score it exists to feed the weakest-evidenced part of FMEA (the invalid RPN multiplication, deprecated by the standards body in 2019), so shipping a skill around that axis would be building on the method’s worst-supported element. Fold the Detection axis into premortem as an optional “detectability” column on the risk register (score how likely each top risk is to be noticed; for low-detectability risks the mitigation is to install a tripwire), rather than shipping a near-twin.
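
As a rough sketch of what the fold could look like, the register below carries detectability as an optional column alongside likelihood and impact. The field names, the High/Medium/Low encoding of detectability, the tie-breaking rule, and the example risks are all assumptions made for illustration; none of them is taken from the premortem skill's actual specification.

```python
from dataclasses import dataclass
from typing import Optional

LEVEL = {"Low": 1, "Medium": 2, "High": 3}


@dataclass
class Risk:
    cause: str
    likelihood: str                      # "High" / "Medium" / "Low"
    impact: str                          # "High" / "Medium" / "Low"
    detectability: Optional[str] = None  # optional column; None = not assessed
    mitigation: str = ""
    tripwire: str = ""


def priority_key(risk: Risk):
    # Order by impact, then likelihood; break ties with the less detectable
    # risk first. Ordinal levels are compared, never multiplied.
    # Assumption: an unassessed detectability is treated as "Medium".
    detect = LEVEL[risk.detectability or "Medium"]
    return (-LEVEL[risk.impact], -LEVEL[risk.likelihood], detect)


def needs_tripwire(risk: Risk) -> bool:
    # A low-detectability risk with no tripwire is a blind spot to close.
    return risk.detectability == "Low" and not risk.tripwire


# Hypothetical register entries.
register = [
    Risk("vendor API rate limit hit at launch", "Medium", "High",
         detectability="High", mitigation="request quota increase",
         tripwire="alert at 80% of quota"),
    Risk("data export silently truncates rows", "Medium", "High",
         detectability="Low", mitigation="add row-count check"),
    Risk("onboarding copy confuses users", "High", "Low"),
]

ranked = sorted(register, key=priority_key)
for r in ranked:
    flag = "  <- install a tripwire" if needs_tripwire(r) else ""
    print(f"{r.impact:6} {r.likelihood:6} {r.detectability or '-':6} {r.cause}{flag}")
```

The lexicographic sort is a deliberate choice in this sketch: it lets detectability break ties without reintroducing a multiplied score.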

Why premortem and not the other tested neighbors:

vs issue-tree: issue-tree is a top-down MECE decomposition of a question before any answer exists; it is the wrong target even though FMEA’s enumeration could borrow a tree. FMEA is not primarily a decomposition method - its deliverable is a prioritized, response-bearing risk register, which is premortem’s deliverable, not issue-tree’s. (Issue-tree is the correct fold target for the cause-decomposition tools - fishbone folded there - but FMEA’s center of gravity is risk triage, so it folds to the risk skill.)
vs the risk family generally: reference-class-forecasting is the outside-view base-rate move (different mechanism), woop is goal-obstacle-plan for execution (different mechanism), and kill-criteria-tripwires already folded into premortem. The mechanical match for “list how it could fail and triage with responses” is premortem.

The learning value of this decision: a famous, standards-grade engineering method can still be a fold. FMEA’s identity is its packaging - the three-axis RPN sheet, the DFMEA/PFMEA taxonomy, the aerospace pedigree - not a distinct cognitive move the library lacks. The move is “enumerate and triage failure modes with responses,” which ships as premortem; the one differentiator, the Detection axis, is a column on that same register and is the part the evidence trusts least. Folding it keeps the catalog honest and lets premortem absorb the one transferable asset (the detectability lens) as an option. This mirrors the fishbone-into-issue-tree fold exactly: distinctive packaging, no distinct move.

Lineage and who to read

FMEA originates in mid-century US military reliability engineering: the procedure was described in MIL-P-1629 (1949), later reissued as MIL-STD-1629 / 1629A (1974, 1980; cancelled 1998 but still in use). NASA carried FMEA and its criticality-weighted variant FMECA (FMEA plus a Criticality Analysis) into the Apollo program in the early 1960s and onward through Viking, Voyager and Galileo. Ford brought FMEA to the automotive industry in the mid-1970s after the Pinto affair and split it into design (DFMEA), process (PFMEA) and concept variants; the automotive lineage culminates in the AIAG-VDA FMEA handbook (2019), which replaced the RPN multiplication with the Action Priority (High/Medium/Low) lookup table. “FMEA,” “FMECA,” and “RPN” are generic descriptive engineering terms in common use - no trademark, no attribution required beyond crediting the military/aerospace origin - which is why this entry is documented descriptively and is not flagged as branded. For the honest limits, read Shebl, Franklin and Barber (2012) on RPN validity and the AIAG-VDA handbook itself on why RPN was retired; for the practitioner case, read the ASQ FMEA reference and the healthcare systematic reviews below.

Named sources

US Department of Defense, MIL-P-1629, Procedures for Performing a Failure Mode, Effects and Criticality Analysis (1949; later MIL-STD-1629A, 1980). The founding codification of FMEA/FMECA in military reliability engineering. Foundational.
AIAG & VDA, FMEA Handbook (1st ed., 2019). The current automotive-industry standard; replaced the RPN multiply-and-threshold rule with the Action Priority (H/M/L) lookup table because a high-severity mode could hide behind a low RPN. Standard-setting practitioner reference. (P)
Nada Atef Shebl, Bryony Dean Franklin, Nick Barber, “Failure mode and effects analysis outputs: are they valid?”, BMC Health Services Research 12:150 (2012). Found weak criterion validity (severity rs = 0.42), teams missed the most-reported real failures (7 of 9), and showed the RPN multiplication of ordinal scales is mathematically flawed. The rigorous nameable evidence, and it bounds over-reliance on the score. (Critical literature)
H. Asgari Dastjerdi, E. Khorasani, M.H. Yarmohammadian, M.S. Ahmadzade, “Evaluating the application of failure mode and effects analysis technique in hospital wards: a systematic review,” Journal of Injury and Violence Research 9(1):51-60 (2017). Most reviewed studies recommended FMEA as useful for reducing risks; a minority showed no effect; implementation completeness varied (only 4 of 22 covered all stages). Positive-but-soft adoption evidence. (P, review)
J.A. Lakshika Anjalee, Victoria Rutter, Nithushi R. Samaranayake, “Application of Failure Mode and Effect Analysis (FMEA) to improve medication safety: a systematic review,” Postgraduate Medical Journal 97(1145):168-174 (2021). Across 33 studies, participants consistently rated FMEA an effective but time-consuming and subjective group activity for surfacing errors; evidence rests on perception, not controlled outcomes. (P, review)
ASQ, “Failure Mode and Effects Analysis (FMEA).” Practitioner reference documenting the S/O/D scales and standard procedure. (P)
Failure Mode and Effects Analysis - Wikipedia (origin dates, MIL-P-1629, NASA Apollo, Ford, FMEA vs FMECA, RPN definition). Reference for the lineage facts. (Reference)

Thinking Framework Skills v0.8.0 · 56 frameworks