Skip to content

Quick facts

Phase: Define | Version: 2.0.0 | Category: ideation | License: Apache-2.0

Try it: /hypothesis "Your context here"

Hypothesis

A hypothesis is a testable prediction about how a change will affect user behavior or business outcomes. It transforms assumptions into explicit statements that can be validated or invalidated through experimentation. Well-formed hypotheses prevent teams from building features based on untested beliefs and create shared understanding of what success looks like.

When to Use

  • After problem framing, before committing to a solution
  • When designing experiments or A/B tests
  • When team members have differing assumptions about user behavior
  • Before investing significant engineering resources in a feature
  • When pivoting direction and need to validate the new approach

How to Use

Use the /hypothesis slash command:

/hypothesis "Your context here"

Or reference the skill file directly: skills/define-hypothesis/SKILL.md

Instructions

When asked to create a hypothesis, follow these steps:

  1. State the Belief Articulate what you believe will happen. Use the structured format: "We believe that [action/change] for [target user] will [expected outcome]." Be specific about the intervention — vague hypotheses can't be tested.

  2. Identify the Target User Define who this hypothesis applies to. A hypothesis about "users" is too broad. Specify the segment: new users in their first week, power users with 10+ sessions, churned users returning, etc.

  3. Define the Expected Outcome What behavior change or result do you expect? Frame it in terms of user actions (complete onboarding, make a purchase, return within 7 days) rather than internal metrics when possible.

  4. Set Success Metrics Choose a primary metric that directly measures the expected outcome. Include secondary metrics that provide context and guardrail metrics that ensure you're not causing harm elsewhere.

  5. Describe Validation Approach How will you test this hypothesis? A/B test, user interviews, prototype testing, cohort analysis? Be specific about sample size, duration, and statistical requirements.

  6. Document Risks and Assumptions What could invalidate this hypothesis beyond the test results? What are you assuming to be true that you haven't validated?

Output Template

Hypothesis: [Brief Title]

Hypothesis Statement

We believe that [specific action or change]

for [target user segment]

will [expected outcome/behavior change]

as measured by [primary success metric]

Background & Rationale

Problem Context

[Problem context]

Supporting Evidence

[Evidence that supports this belief]

Alternative Hypotheses Considered

[Alternative approaches]

Target User Segment

Definition

[User segment definition]

Segment Size

[Estimated count or percentage]

Current Behavior

[Current state]

Success Metrics

Primary Metric

Metric Current Baseline Target Minimum Detectable Effect
[Metric name] [Current value] [Target value] [MDE %]

Secondary Metrics

Metric Current Baseline Expected Direction
[Metric 1] [Value] [Increase/Decrease/No change]
[Metric 2] [Value] [Increase/Decrease/No change]

Guardrail Metrics

Metric Current Value Acceptable Range
[Metric 1] [Value] [Range]

Validation Approach

Method

[Validation method]

Sample Size & Duration

  • Sample size: [Number per variant]
  • Duration: [Time period]
  • Traffic allocation: [Percentage]

Pass/Fail Criteria

  • Validated if: [Specific criteria]
  • Invalidated if: [Specific criteria]
  • Inconclusive if: [Specific criteria]

Risks & Assumptions

Key Assumptions

  • [Assumption 1]
  • [Assumption 2]

Risks

  • [Risk 1]
  • [Risk 2]

Timeline

Phase Dates Duration
Setup & instrumentation [Dates] [Duration]
Test running [Dates] [Duration]
Analysis [Dates] [Duration]
Decision [Date]

Example Output

Hypothesis: Simplified Onboarding Flow

Hypothesis: Simplified Onboarding Flow

Hypothesis Statement

We believe that reducing the onboarding flow from 7 steps to 3 essential steps

for new users signing up for a free trial

will increase onboarding completion rate

as measured by percentage of users who complete all onboarding steps within their first session

Background & Rationale

Problem Context

Our SaaS product has a 34% onboarding completion rate — meaning 66% of new signups never finish setup and experience the core value proposition. User research indicates the current 7-step onboarding feels overwhelming, with significant drop-off occurring at steps 4 and 5 (team invitation and integration setup). Users who don't complete onboarding are 4x more likely to churn within 14 days.

Supporting Evidence

  • Session recordings show users hesitating and abandoning at the team invitation step
  • Support tickets frequently ask "Can I skip some of these steps?"
  • Competitor analysis shows market leaders use 3-4 step onboarding flows
  • Exit survey data: 42% of churned users cite "too complicated to get started"
  • Hotjar heatmaps show users scrolling to find a "skip" button that doesn't exist

Alternative Hypotheses Considered

  • Progress indicators: Adding a progress bar might reduce anxiety without changing steps — rejected because underlying issue is step count, not visibility
  • Tooltips/guidance: More help content might reduce confusion — rejected because it adds more cognitive load
  • Optional steps: Making steps skippable might work — considered as fallback if simplification fails

Target User Segment

Definition

New users who: - Sign up for a free trial (not paid conversion from trial) - Are the first user from their organization (not invited team members) - Access the product via web (not mobile app)

Segment Size

  • 12,400 new trial signups per month meeting these criteria
  • 8,200 (66%) currently fail to complete onboarding

Current Behavior

  • Average time to complete current onboarding: 18 minutes
  • Step 1-3 completion: 78%
  • Step 4 (team invitation) completion: 52%
  • Step 5 (integration) completion: 41%
  • Full completion (all 7 steps): 34%
  • Users who complete onboarding activate core feature within 24h: 89%

Success Metrics

Primary Metric

Metric Current Baseline Target Minimum Detectable Effect
Onboarding completion rate 34% 50% 10% relative lift

Secondary Metrics

Metric Current Baseline Expected Direction
Time to complete onboarding 18 min Decrease to <8 min
Day-1 core feature activation 30% Increase
Support tickets (first 24h) 8.2% of users Decrease
User satisfaction (post-onboarding) 3.⅖ Increase

Guardrail Metrics

Metric Current Value Acceptable Range
14-day trial-to-paid conversion 12% No decrease >5% relative
Team invitation rate (within 7 days) 23% No decrease >10% relative
Integration connection rate (within 7 days) 31% No decrease >10% relative

Validation Approach

Method

A/B test with 50/50 traffic split between: - Control: Current 7-step onboarding flow - Treatment: New 3-step onboarding (account basics, workspace setup, first task creation)

Deferred steps (team invitation, integrations) will be prompted via in-app messaging after initial activation.

Sample Size & Duration

  • Sample size: 3,000 users per variant (6,000 total)
  • Duration: 14 days of enrollment + 7 days observation window
  • Traffic allocation: 50% control / 50% treatment
  • Statistical significance: 95% confidence level
  • Statistical power: 80%

Pass/Fail Criteria

  • Validated if: Onboarding completion increases by ≥10% relative (34% → 37.4%+) with 95% confidence AND guardrail metrics stay within acceptable range
  • Invalidated if: Onboarding completion shows no significant change or decreases, OR guardrail metrics breach acceptable range
  • Inconclusive if: Results don't reach statistical significance within test window — extend test or increase sample

Risks & Assumptions

Key Assumptions

  • Users who complete a shorter onboarding will still discover team/integration features later
  • The 3 essential steps are sufficient to demonstrate core product value
  • In-app prompts can effectively drive deferred actions
  • Onboarding completion is a leading indicator of retention (not just correlated)

Risks

  • Feature discovery risk: Users might never set up teams/integrations if not prompted during onboarding
  • Segment spillover: Results might not generalize to invited users or mobile signups
  • Novelty effect: Initial lift might fade as users become accustomed to flow
  • Selection bias: Users who would have completed 7-step flow might be different from marginal completers

Timeline

Phase Dates Duration
Setup & instrumentation Jan 15-17, 2026 3 days
Test running Jan 18-31, 2026 14 days
Observation window Feb 1-7, 2026 7 days
Analysis Feb 8-10, 2026 3 days
Decision Feb 11, 2026

Real-World Examples

See this skill applied to three different product contexts:

Storevine (B2B): Storevine B2B ecommerce platform — Campaigns v1 first-campaign guided flow hypothesis

Prompt:

/hypothesis

Project: Campaigns — native email marketing for Storevine merchants
Stage: Post-discovery, pre-PRD finalization

Hypothesis I want to define:
- Non-adopter merchants (no active external email tool, <250 customers)
  are ~38% of our active base [fictional] and represented 3 of 8 merchant
  interview participants (P3, P6, and P8)
- Core belief: setup complexity is the barrier — not awareness or price
- Specific hypothesis: a guided first-campaign flow with product-seeded
  templates will drive first-send rate from ~12% [fictional] to ≥30%
  [fictional] within 60 days of GA

Prior work to reference:
- Merchant interview synthesis (Jan 12–28, 2026): P3, P6, and P8 described
  email as "too overwhelming to start" or perennially "on the list"
- Competitive analysis (Feb 2026): Shopify Email's template-first + free
  tier activation is their primary new-merchant onboarding lever
- Problem statement: email-related churn estimated at 4.8 pp [fictional]
  of overall 22% [fictional] annual merchant churn rate

Need: full hypothesis document with success metrics, validation approach,
pass/fail criteria, and risks. Will attach to PRD as primary testable belief.

Output:

Hypothesis: Pre-Populated Templates Drive First Campaign Sends for Non-Adopter Merchants

Brainshelf (Consumer): Brainshelf consumer PKM app — Resurface morning email digest hypothesis

Prompt:

/hypothesis

trying to figure out if a morning digest email will actually get people to re-read
their saved stuff. context: brainshelf pkm app, 22k MAU [fictional]. users save
~47 items/month but only go back to read ~9% within 30 days [fictional]. classic
guilt pile problem from interviews.

want to run an A/B test on a morning email that surfaces 3-5 items from their
library based on what they've been reading lately. need a hypothesis doc to
align the team before we commit to building it.

primary metric: resurface item click rate. secondary: actual read completion.
guardrail: don't tank unsubscribe rate.

Output:

Hypothesis: Morning Resurface Email Increases Re-Read Rate

Workbench (Enterprise): Workbench enterprise collaboration platform: required-section enforcement hypothesis

Prompt:

/hypothesis

Product: Workbench Blueprints (enterprise doc templates with required sections and approval gates)
Stage: Define phase, post-discovery interviews and problem statement

Hypothesis: Requiring all Blueprint sections to be completed before an author can submit for approval will reduce median time to first approved Blueprint.

Context:
- 38% of Blueprints in closed beta reach approval with ≥1 empty required section [fictional]
- Median time to first approval: 4.0 days [fictional]
- Most rejections are for missing content, not quality [fictional]
- Approvers (dept heads, compliance leads) are the bottleneck -- they reject and wait, or approve with risk
- Target: reduce median approval time to ≤1 day [fictional] (aspirational)
- MDE for experiment: 1.0 day reduction (to ≤3.0 days) [fictional]

Target users: Project leads and document authors at enterprise Workbench accounts
Validation: A/B test in closed beta (80 accounts, ~300 Blueprints/week [fictional])
Primary metric: median time-to-first-approval (days)
Guardrails: author abandonment, author NPS

Stakeholders: Sandra C. (Head of Product), Karen L. (Eng Lead), Leo M. (Data Analyst)

Output:

Hypothesis: Required Blueprint Sections Reduce Time-to-Approval

Quality Checklist

Before finalizing, verify:

  • Hypothesis is falsifiable (possible to prove wrong)
  • Success metric has a specific numeric target
  • Target user segment is clearly defined
  • Validation approach is practical and time-bound
  • Pass/fail criteria are unambiguous
  • Hypothesis doesn't assume the solution works

Output Format

Use the template in references/TEMPLATE.md to structure the output.