Decision Log
A real-time record of context, options considered, criteria used, and reasoning - capturing how a decision was reached, not justifying it after the fact.
A decision log is written at the moment of deciding, not after the decision has proven itself. This timing is what gives it value. A document written after the fact is a justification dressed as a record - it knows the outcome and selects the evidence that supports it. A decision log written in the moment of deciding captures the actual options that were on the table, the actual criteria that mattered, and the actual reasons the chosen path seemed best given what was known at the time. Future readers can assess whether the reasoning was sound without being misled by hindsight selection.
The context section is the most undervalued part of a decision log. Decisions made six months ago often look inexplicable without it. “Why did we deploy on a Friday?” makes no sense unless the reader knows that the client had a board presentation Monday and the demo environment was broken. Context is not a formality - it is the load-bearing section that makes everything that follows legible to a future reader who was not in the room.
An ADR (architecture decision record) is a decision-log specialized for software architecture choices. Decision-log is the general form. The same structure applies to product decisions, process changes, hiring decisions, vendor selections, and any other choice where the reasoning will matter later. The specialization of ADR for architecture adds conventions about drivers and consequences; the general decision-log form is deliberately more open.
Structural conventions
- Context section captures the situation and constraints that existed at decision time
- Options section lists the alternatives actually considered, not a post-hoc menu
- Criteria section names the values or constraints that governed the evaluation
- Decision section states the chosen option and the reasoning - the “because,” not just the “what”
- Written at decision time, not reconstructed afterward
- Does not require the decision to have been correct - a good decision log records good reasoning, not good outcomes
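The conventions above can be sketched as a small data structure. This is a minimal illustration, not a prescribed schema: the class name `DecisionLog`, its field names, and the `missing_sections` helper are all hypothetical, chosen only to mirror the sections listed above.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DecisionLog:
    """Hypothetical container mirroring the sections listed above."""
    title: str
    decided_on: date  # the date the decision was made, not a later write-up date
    context: str = ""
    options: list[str] = field(default_factory=list)
    criteria: list[str] = field(default_factory=list)
    decision: str = ""
    reasoning: str = ""  # the "because", not just the "what"

    def missing_sections(self) -> list[str]:
        """Return the names of sections that are still empty."""
        required = {
            "context": self.context,
            "options": self.options,
            "criteria": self.criteria,
            "decision": self.decision,
            "reasoning": self.reasoning,
        }
        return [name for name, value in required.items() if not value]


draft = DecisionLog(
    title="2026-Q2 Standup Format",
    decided_on=date(2026, 4, 8),
    context="11 engineers across four timezones; standup at 9am Pacific.",
    options=["Keep sync standup", "Async-first with weekly sync session"],
    decision="Adopt async-first for a 30-day trial.",
)
print(draft.missing_sections())  # ['criteria', 'reasoning']
```

A check like this only confirms that each section has been filled in; it says nothing about whether the content was written at decision time, which remains a matter of practice.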
When to use
Any significant decision where the reasoning will matter to future team members: vendor selections, technology choices, product pivots, process changes, and hiring decisions. Especially valuable in onboarding contexts where new team members need to understand why things are the way they are, and in governance contexts where the auditability of reasoning is required.
When not to use
Routine operational decisions that will not affect future readers. Avoid when the audience needs the decision communicated rather than the reasoning behind it, in real-time situations where structured logging overhead is not warranted, and when the outcome is immediately visible and the reasoning is self-evident.
Pairs well with
pragmatic-architect, direct-communicator, adr, matter-of-fact
Often confused with
adr: An ADR (architecture decision record) is a decision-log specialized for software architecture choices, with conventions specific to that domain - drivers, status, consequences in a technical sense. Decision-log is the general form that applies to any significant organizational choice. Every ADR follows the decision-log pattern; not every decision-log is an ADR. The distinction is scope and specialization, not structure.
Instruction
Write as a decision log. Organize around four sections: context (what was true when this decision was made), options (what was actually considered, not a post-hoc list), criteria (what values or constraints governed the evaluation), and decision (what was chosen and why - the because, not just the what). Write as if the reader was not in the room. Do not justify the decision in hindsight - record the reasoning as it actually existed at decision time. A good decision log records good reasoning; it does not require the decision to have been correct in hindsight. Do not include pleasantries or framing prose; go directly to the structured record.
Related
Pairs well with
Pragmatic Architect, Direct Communicator, Architecture Decision Record, Matter of Fact
Avoid with
Devotional Reflection, Playful, Warm
Examples
Decision: 2026-Q2 Standup Format
Status: Decided
Date: 2026-04-08
Owner: Lina Acosta (Platform Eng Manager)
Stakeholders: Platform team (11 engineers); Head of Engineering (informed, not approving)
Context
The Platform team is 11 engineers across four timezones: US Pacific (3), US Eastern (3), UK (2), India (3). The current synchronous daily standup runs at 9am Pacific, which is 9:30pm IST for the three India-based engineers.
Two recurring problems are now well-evidenced rather than anecdotal:
- Attendance inequity. Q1 attendance data: India engineers averaged 3.2 of 5 weekly standups; US-based engineers averaged 4.6. The gap correlates with the meeting time, not with the engineers.
- Information loss. Standups run 14 minutes on average. Roughly 4 minutes drive concrete action. The remainder is verbal status that does not persist. We have multiple recent instances of engineers re-diagnosing problems that teammates solved earlier the same day. The most concrete: a 401 auth fix discussed in standup on March 3, then re-diagnosed by a different engineer on March 4 because they were not on the call.
The status quo is not free. It is paid disproportionately by the India engineers and intermittently by anyone who misses a meeting.
Options Considered
Option A: Keep the current sync standup
Cost concentrated on India team. Information loss continues. No change required, no risk introduced.
Option B: Rotate the standup time weekly across timezones
Spreads the inequity rather than removing it. Calendar churn for everyone. Still does not solve the information persistence problem.
Option C: Two sync standups (Americas + Europe/India)
Splits the team into two information silos. Doubles the meeting load on anyone bridging both. Cross-region context gets worse, not better.
Option D: Async-first standup with weekly sync working session
Eliminates the timezone tax. Creates a searchable record. Preserves a real-time slot for discussion that benefits from it. Requires behavior change.
Option E: No standup at all
Lowest meeting cost. Highest information cost. Rejected without serious consideration; the team is not co-located enough to absorb the loss of any structured status mechanism.
Criteria
The decision is being made against these criteria, in priority order:
- Equity across timezones. No single timezone should bear a disproportionate share of off-hours meeting cost.
- Information persistence. Daily status should be searchable, not ephemeral.
- Blocker resolution speed. Blocked items should route to the right person quickly.
- Real-time bandwidth preserved. The team still needs occasional real-time exchange for hard problems.
- Reversibility. Whatever we choose should be revertible within a sprint if it does not work.
Decision
Adopt Option D for a 30-day trial starting 2026-04-13.
Specifics:
- Daily async post in #team-standup, three fields: Shipped, In progress, Blocked or at risk. Posted by 10am local time. Blockers @mention the unblocker.
- 9am Pacific sync slot becomes a 60-minute Thursday working session.
- Day 15 informal check-in. Day 30 formal evaluation against success criteria.
Success criteria at day 30
The trial extends if at least two of three are positive:
- Median blocker resolution time during overlap windows under 2 hours.
- At least 9 of 11 engineers posting at least 4 of 5 weekdays.
- Team survey shows equal or better context on teammates’ work versus the sync model.
If two of three fail, revert to a rotating-time sync standup (not the current fixed time, which is the worst option).
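The two-of-three rule above can be written down as a small check so the day-30 evaluation is mechanical rather than negotiated. This is a sketch only: the function name `evaluate_trial` and its parameter names are hypothetical, and it assumes the three measurements have already been collected elsewhere.

```python
def evaluate_trial(
    median_blocker_hours: float,
    engineers_posting_4_of_5: int,
    survey_equal_or_better: bool,
) -> tuple[str, dict[str, bool]]:
    """Apply the day-30 success criteria and return (outcome, per-criterion results)."""
    checks = {
        # Median blocker resolution time during overlap windows under 2 hours.
        "blocker_resolution": median_blocker_hours < 2.0,
        # At least 9 of 11 engineers posting at least 4 of 5 weekdays.
        "posting_rate": engineers_posting_4_of_5 >= 9,
        # Survey shows equal or better context versus the sync model.
        "survey": survey_equal_or_better,
    }
    passed = sum(checks.values())
    # With three criteria, "at least two positive" and "two or more fail"
    # are complementary, so a single comparison covers both branches.
    outcome = "extend" if passed >= 2 else "revert to rotating-time sync"
    return outcome, checks


outcome, detail = evaluate_trial(1.5, 9, False)
print(outcome)  # extend
```

Writing the thresholds down before the trial starts keeps the evaluation tied to the criteria as they stood at decision time.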
Reasoning
Option D scores best on criteria 1, 2, and 3, neutral on 4, and equivalent to most other options on 5. Option A scores poorly on 1 and 2. Options B and C reduce one cost while introducing a new one. Option E is dominated by D on every criterion except meeting count, which is not in our top five.
The deciding factor was criterion 1. The current attendance gap is not a behavior problem; it is a schedule problem. Once we accepted that, the options that preserved the schedule were no longer viable.
Open Questions
- Does the Thursday working session need a standing agenda template, or is a free-form shared doc sufficient? Defer to day 15 check-in.
- Should “Blocked or at risk” be split into two fields? Some engineers may underreport “at risk” items. Defer to day 30 review.
- If we revert, what is the rotating-time schedule? Owner: Lina. Due before day 30 in case revert is needed.
Decision: 2026 Morning Routine Format
Date. 2026-05-14
Decider. Self
Status. Decided. Effective 2026-05-15.
Context
Current morning pattern: wake at 6:15, phone in hand within 60 seconds, first deliberate action of the day around 7:30. Pattern has held for approximately eighteen months. Self-reported mornings range from “fine” to “already behind by 7:00.” Two prior attempts at a structured morning routine (in 2024 and early 2025) failed within two weeks. Both prior attempts were ambitious in scope (5:00 wake, full workout, journaling block). Both collapsed on the first travel week.
Trigger for revisiting: three consecutive weeks of arriving at the 8:30 standup already irritated, traceable in retrospect to specific phone content seen between 6:15 and 6:30.
Options considered
Option A. Keep the current pattern. No change. Phone-first morning continues. Accept the cost as priced in.
Option B. Aggressive overhaul. 5:00 wake, one hour of structured activities (workout, journal, plan), phone after 7:00. Similar to the 2024 attempt.
Option C. Minimal pre-phone routine. Phone sleeps in kitchen. Wake at 6:15. Water, light, one quiet activity (15 minutes total) before phone enters. No change to wake time.
Option D. Phone software lock until 7:30. Keep phone bedside. Use a screen-time lock to make it unavailable until 7:30. Wake time and other behavior unchanged.
Criteria
- Survives a bad day (sick child, travel, deadline) without collapsing.
- Total time cost under thirty minutes on a normal day.
- Does not require new equipment, new apps, or a new wake time.
- Failure mode is recoverable (skipping one day does not end the routine).
- Demonstrably different from the 2024 and 2025 attempts in shape, not just intensity.
Evaluation against criteria
| Criterion | A | B | C | D |
|---|---|---|---|---|
| Survives bad days | N/A (no routine) | No (prior evidence) | Yes | Partial (lock can be overridden) |
| Under 30 min total | Yes | No | Yes | Yes |
| No new equipment / wake time | Yes | No | Yes | Partial (requires app) |
| Recoverable failure | N/A | No | Yes | Yes |
| Different from prior attempts | N/A | No | Yes | Yes |
Decision
Option C. Phone sleeps in the kitchen starting tonight. Morning sequence: water (1 min), window or step outside (2 min), one chosen quiet activity (10 to 12 min, alternating weekly between reading and planning on paper), then phone. Total target time: 15 minutes. Wake time unchanged at 6:15.
Reasoning
Option A is rejected because the cost is now visible and traceable to a specific behavior. Continuing it is no longer a default; it is a choice with a price tag.
Option B is rejected on prior evidence. Two previous attempts at this shape failed within two weeks. The failure was not motivation; it was design. Routines that require ideal conditions do not survive non-ideal ones, and non-ideal conditions are most days.
Option D is rejected because it depends on a software constraint that can be disabled in three seconds. The phone-in-another-room rule is harder to circumvent because circumventing it requires walking to another room, which is enough friction to break the reflex.
Option C is chosen because it changes the structural variable (phone location) rather than the willpower variable (resisting the phone). It is recoverable because the only hard rule is the phone location, and the rest of the routine can compress to two minutes on a hard day without breaking.
What we are explicitly not deciding
- Wake time. Stays at 6:15. May revisit in ninety days.
- Workout placement. Stays in current evening slot. Not a morning concern.
- Weekend variation. Will be evaluated at the thirty-day checkpoint, not now.
Review
Reassess at thirty days (2026-06-14) and ninety days (2026-08-13). At each checkpoint, evaluate: completion rate, subjective day-quality on routine vs non-routine days, and whether any element should be removed (not added).
Decision Log on: Choosing between Postgres and DynamoDB
Decision: Notification System Primary Storage
Status: Decided
Date: 2026-05-15
Owner: Ana Velasquez (Tech Lead, Notifications)
Stakeholders: Marcus Chen (Senior Eng, prototype owner); Priya Singh (PM); 4-person on-call rotation
Forum: Architecture meeting, Wednesday 2026-05-13 2pm Pacific
Context
Lattice Notify is a 50-person Series B with 8 backend engineers. The product currently runs as a monolith on Postgres. We are building a real-time notification system that needs new persistent storage. Launch volume is projected at 500K events/day. The CRO assigned 60% confidence on Friday 2026-05-08 to a Slack-partnership deal closing in Q3 2026; if it lands, volume reaches roughly 5M events/day in twelve months.
Priya set a Friday 2026-05-15 deadline so the team can plan the next sprint. The on-call rotation is four engineers, none of whom have operated DynamoDB in production. Ana has scaled Postgres to this range before in a prior role; Marcus has prototyped on DynamoDB but not operated it.
The cost of being wrong: roughly 3-6 weeks of rework if a migration is forced later. The cost of being right: roughly a year of stability on the chosen storage path.
Options Considered
Option A: Postgres with a queue
Reuse the existing Postgres cluster, add a notifications schema, put a queue (likely SQS) in front to absorb write spikes. Known operational profile. Sharding required in the 10x scenario, but reversible.
Option B: DynamoDB
Stand up a new DynamoDB table for notifications. Natural fit for the write-heavy, key-lookup, time-ordered access pattern. Scales transparently in the 10x scenario. Adds a second storage system the team has never operated. Cross-database queries to existing Postgres tables move into application code.
Option C: Postgres now, plan a Dynamo migration if and when the Slack deal closes
Hybrid. Defers the new-system learning curve until the volume actually arrives. Accepts a 3-6 week migration project in the future as an explicit, scheduled cost.
Criteria
In priority order:
- Operational safety for the on-call rotation. A four-person rotation cannot be on-call for two storage systems they cannot debug.
- Cost of being wrong. Recoverable errors are preferred over diffuse, hard-to-roll-back errors.
- Fit to the launch-day access pattern at 500K events/day.
- Fit to the 10x scenario, weighted by the probability of that scenario.
- Cross-system query cost with billing, analytics, and product reads that live in Postgres.
Decision
Adopt Option C: ship on Postgres with a queue. Pre-commit to a Dynamo migration project if and when the Slack deal closes.
Specifics:
- Notifications schema lands in the existing Postgres cluster behind an SQS queue. Marcus owns the schema design and queue integration. Target: in production by 2026-06-15.
- A trigger condition is defined now, not later: if Slack-partnership volume puts us above 2M events/day on a 30-day rolling average, the Dynamo migration project begins the following sprint. Owner: Ana.
- The Dynamo design Marcus prototyped this week is preserved in the design-docs repo with a status of “deferred design, ready to revive.”
- The 4-person on-call rotation does not take on a second storage system at launch.
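The trigger condition in the specifics above is concrete enough to express as a check. The sketch below is illustrative only: the function name `migration_triggered` and the constants are hypothetical, the daily counts are assumed to come from whatever metrics source the team already has, and requiring a full 30 days of data before the trigger can fire is an assumption the log does not state.

```python
THRESHOLD_EVENTS_PER_DAY = 2_000_000  # from the trigger condition in the log
WINDOW_DAYS = 30                      # 30-day rolling average


def migration_triggered(daily_events: list[int]) -> bool:
    """Return True if the latest 30-day rolling average is above the threshold.

    daily_events is ordered oldest to newest, one count per day.
    Assumption: with fewer than 30 days of data the trigger cannot fire.
    """
    if len(daily_events) < WINDOW_DAYS:
        return False
    window = daily_events[-WINDOW_DAYS:]
    rolling_average = sum(window) / WINDOW_DAYS
    # "Above 2M" is read as strictly greater than the threshold.
    return rolling_average > THRESHOLD_EVENTS_PER_DAY


launch_volume = [500_000] * 30
print(migration_triggered(launch_volume))  # False
```

Averaging over the window means a single spike day does not start the migration project; volume has to stay elevated for the trigger to fire.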
Reasoning
Option C scores best on criteria 1, 2, and 3. It scores worst on criterion 4 if the Slack deal lands, but the pre-committed trigger condition converts that worst case into a scheduled project rather than an emergency. Option B scores best on criterion 4 but worst on criterion 1, and the team agreed that operational safety for the on-call rotation was the right load-bearing constraint. Option A and Option C are structurally similar; the addition of the explicit trigger condition is what makes Option C honest rather than wishful.
The deciding factor was the recognition (Ana, in the Wednesday meeting) that the decision was really about how much we believe the Slack deal will close. At 60% confidence, the expected operational cost of running on Dynamo from day one is higher than the expected migration cost of moving to Dynamo later.
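The comparison in the paragraph above can be made explicit with the figures already in this log. This is a rough sketch, assuming the 3-6 week rework estimate from the context section is the migration cost and that the migration is only paid if the deal closes; the symbols $p$, $C_{\text{migrate}}$, and $C_{\text{ops}}$ are introduced here for illustration, and the log gives no number for $C_{\text{ops}}$.

```latex
% Requires amsmath for \text.
\[
  p \cdot C_{\text{migrate}} \;=\; 0.6 \times (3 \text{ to } 6 \text{ weeks})
  \;=\; 1.8 \text{ to } 3.6 \text{ weeks}
\]
\[
  \text{Option C is preferred when } C_{\text{ops}} \;>\; p \cdot C_{\text{migrate}}
\]
```

Here $C_{\text{ops}}$ stands for the cost of operating Dynamo from day one with a rotation that has never run it in production. The team judged that cost to exceed the expected migration cost; the inequality only shows what that judgment amounts to.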
Open Questions
- Exact trigger threshold for the Dynamo migration: 2M events/day on a 30-day rolling average is the starting number; revisit at the 2026-Q3 architecture review with real volume data.
- Should the SQS queue be replaced with the existing internal event bus? Defer to Marcus’s schema design review.
- Who owns the Dynamo runbook drafting if the trigger fires? Owner unassigned; revisit when the Slack deal closes.