# Hypothesis

**Quick facts:** Phase: Define | Version: 2.0.0 | Category: ideation | License: Apache-2.0
A hypothesis is a testable prediction about how a change will affect user behavior or business outcomes. It transforms assumptions into explicit statements that can be validated or invalidated through experimentation. Well-formed hypotheses prevent teams from building features based on untested beliefs and create shared understanding of what success looks like.
## When to Use
- After problem framing, before committing to a solution
- When designing experiments or A/B tests
- When team members have differing assumptions about user behavior
- Before investing significant engineering resources in a feature
- When pivoting direction and need to validate the new approach
## How to Use

Use the `/hypothesis` slash command, or reference the skill file directly: `skills/define-hypothesis/SKILL.md`
## Instructions

When asked to create a hypothesis, follow these steps:

1. **State the Belief.** Articulate what you believe will happen. Use the structured format: "We believe that [action/change] for [target user] will [expected outcome]." Be specific about the intervention — vague hypotheses can't be tested.
2. **Identify the Target User.** Define who this hypothesis applies to. A hypothesis about "users" is too broad. Specify the segment: new users in their first week, power users with 10+ sessions, churned users returning, etc.
3. **Define the Expected Outcome.** What behavior change or result do you expect? Frame it in terms of user actions (complete onboarding, make a purchase, return within 7 days) rather than internal metrics when possible.
4. **Set Success Metrics.** Choose a primary metric that directly measures the expected outcome. Include secondary metrics that provide context and guardrail metrics that ensure you're not causing harm elsewhere.
5. **Describe the Validation Approach.** How will you test this hypothesis? A/B test, user interviews, prototype testing, cohort analysis? Be specific about sample size, duration, and statistical requirements.
6. **Document Risks and Assumptions.** What could invalidate this hypothesis beyond the test results? What are you assuming to be true that you haven't validated?
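As an illustration, the structured statement from step 1 can be captured in code. This is a minimal sketch with hypothetical field names, not part of the skill itself:

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    """Sketch of the structured hypothesis format (field names are illustrative)."""
    change: str   # the specific action or change
    segment: str  # the target user segment
    outcome: str  # the expected behavior change
    metric: str   # the primary success metric

    def statement(self) -> str:
        # Renders the canonical "We believe that ..." sentence from the skill
        return (f"We believe that {self.change} for {self.segment} "
                f"will {self.outcome}, as measured by {self.metric}.")

h = Hypothesis(
    change="reducing onboarding from 7 steps to 3",
    segment="new free-trial users",
    outcome="increase onboarding completion",
    metric="first-session completion rate",
)
print(h.statement())
```

Forcing each field to be filled in separately makes it obvious when a hypothesis is missing a testable piece, such as a named segment or a measurable metric.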
## Output Template

### Hypothesis: [Brief Title]

#### Hypothesis Statement

We believe that [specific action or change]
for [target user segment]
will [expected outcome/behavior change]
as measured by [primary success metric]

#### Background & Rationale

**Problem Context**

[Problem context]

**Supporting Evidence**

[Evidence that supports this belief]

**Alternative Hypotheses Considered**

[Alternative approaches]

#### Target User Segment

**Definition**

[User segment definition]

**Segment Size**

[Estimated count or percentage]

**Current Behavior**

[Current state]

#### Success Metrics

**Primary Metric**

| Metric | Current Baseline | Target | Minimum Detectable Effect |
|---|---|---|---|
| [Metric name] | [Current value] | [Target value] | [MDE %] |

**Secondary Metrics**

| Metric | Current Baseline | Expected Direction |
|---|---|---|
| [Metric 1] | [Value] | [Increase/Decrease/No change] |
| [Metric 2] | [Value] | [Increase/Decrease/No change] |

**Guardrail Metrics**

| Metric | Current Value | Acceptable Range |
|---|---|---|
| [Metric 1] | [Value] | [Range] |

#### Validation Approach

**Method**

[Validation method]

**Sample Size & Duration**

- Sample size: [Number per variant]
- Duration: [Time period]
- Traffic allocation: [Percentage]

**Pass/Fail Criteria**

- Validated if: [Specific criteria]
- Invalidated if: [Specific criteria]
- Inconclusive if: [Specific criteria]

#### Risks & Assumptions

**Key Assumptions**

- [Assumption 1]
- [Assumption 2]

**Risks**

- [Risk 1]
- [Risk 2]

#### Timeline

| Phase | Dates | Duration |
|---|---|---|
| Setup & instrumentation | [Dates] | [Duration] |
| Test running | [Dates] | [Duration] |
| Analysis | [Dates] | [Duration] |
| Decision | [Date] | — |
## Example Output

### Hypothesis: Simplified Onboarding Flow
#### Hypothesis Statement
We believe that reducing the onboarding flow from 7 steps to 3 essential steps
for new users signing up for a free trial
will increase onboarding completion rate
as measured by percentage of users who complete all onboarding steps within their first session
#### Background & Rationale

**Problem Context**
Our SaaS product has a 34% onboarding completion rate — meaning 66% of new signups never finish setup and experience the core value proposition. User research indicates the current 7-step onboarding feels overwhelming, with significant drop-off occurring at steps 4 and 5 (team invitation and integration setup). Users who don't complete onboarding are 4x more likely to churn within 14 days.
**Supporting Evidence**
- Session recordings show users hesitating and abandoning at the team invitation step
- Support tickets frequently ask "Can I skip some of these steps?"
- Competitor analysis shows market leaders use 3-4 step onboarding flows
- Exit survey data: 42% of churned users cite "too complicated to get started"
- Hotjar heatmaps show users scrolling to find a "skip" button that doesn't exist
**Alternative Hypotheses Considered**
- Progress indicators: Adding a progress bar might reduce anxiety without changing steps — rejected because underlying issue is step count, not visibility
- Tooltips/guidance: More help content might reduce confusion — rejected because it adds more cognitive load
- Optional steps: Making steps skippable might work — considered as fallback if simplification fails
#### Target User Segment

**Definition**

New users who:
- Sign up for a free trial (not paid conversion from trial)
- Are the first user from their organization (not invited team members)
- Access the product via web (not mobile app)
**Segment Size**
- 12,400 new trial signups per month meeting these criteria
- 8,200 (66%) currently fail to complete onboarding
**Current Behavior**
- Average time to complete current onboarding: 18 minutes
- Step 1-3 completion: 78%
- Step 4 (team invitation) completion: 52%
- Step 5 (integration) completion: 41%
- Full completion (all 7 steps): 34%
- Users who complete onboarding activate core feature within 24h: 89%
#### Success Metrics

**Primary Metric**
| Metric | Current Baseline | Target | Minimum Detectable Effect |
|---|---|---|---|
| Onboarding completion rate | 34% | 50% | 10% relative lift |
**Secondary Metrics**
| Metric | Current Baseline | Expected Direction |
|---|---|---|
| Time to complete onboarding | 18 min | Decrease to <8 min |
| Day-1 core feature activation | 30% | Increase |
| Support tickets (first 24h) | 8.2% of users | Decrease |
| User satisfaction (post-onboarding) | 3.2/5 | Increase |
**Guardrail Metrics**
| Metric | Current Value | Acceptable Range |
|---|---|---|
| 14-day trial-to-paid conversion | 12% | No decrease >5% relative |
| Team invitation rate (within 7 days) | 23% | No decrease >10% relative |
| Integration connection rate (within 7 days) | 31% | No decrease >10% relative |
#### Validation Approach

**Method**

A/B test with a 50/50 traffic split between:
- Control: Current 7-step onboarding flow
- Treatment: New 3-step onboarding (account basics, workspace setup, first task creation)

Deferred steps (team invitation, integrations) will be prompted via in-app messaging after initial activation.
**Sample Size & Duration**
- Sample size: 3,000 users per variant (6,000 total)
- Duration: 14 days of enrollment + 7 days observation window
- Traffic allocation: 50% control / 50% treatment
- Statistical significance: 95% confidence level
- Statistical power: 80%
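As a sanity check on these figures, the required per-variant sample size can be estimated with the standard normal-approximation formula for comparing two proportions. The choice of a two-sided two-proportion test is an assumption here; the document states only the confidence level and power:

```python
from math import ceil, sqrt
from statistics import NormalDist

def two_proportion_sample_size(p1: float, p2: float,
                               alpha: float = 0.05,
                               power: float = 0.80) -> int:
    """Per-variant n for a two-sided two-proportion z-test (normal approximation)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)   # critical value for alpha
    z_b = NormalDist().inv_cdf(power)           # critical value for power
    p_bar = (p1 + p2) / 2
    numerator = (z_a * sqrt(2 * p_bar * (1 - p_bar))
                 + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p2 - p1) ** 2)

# Baseline 34%; a 10% relative lift is the MDE, i.e. 37.4%
n = two_proportion_sample_size(0.34, 0.374)
print(n)  # slightly above 3,000 per variant under these assumptions
```

Under these assumptions the formula lands close to (slightly above) the stated 3,000 per variant, so treat that figure as a ballpark rather than an exact requirement.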
**Pass/Fail Criteria**
- Validated if: Onboarding completion increases by ≥10% relative (34% → 37.4%+) with 95% confidence AND guardrail metrics stay within acceptable range
- Invalidated if: Onboarding completion shows no significant change or decreases, OR guardrail metrics breach acceptable range
- Inconclusive if: Results don't reach statistical significance within test window — extend test or increase sample
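To make the "validated if" check concrete, here is a minimal sketch of evaluating the primary metric with a pooled two-proportion z-test. The test choice and the counts below are illustrative assumptions, not results from the document:

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_p_value(success_a: int, n_a: int,
                           success_b: int, n_b: int) -> float:
    """Two-sided p-value for a difference in proportions (pooled z-test)."""
    p_a, p_b = success_a / n_a, success_b / n_b
    p_pool = (success_a + success_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical observed counts: control 1020/3000 (34.0%),
# treatment 1180/3000 (39.3%)
p = two_proportion_p_value(1020, 3000, 1180, 3000)

# Significance alone is not enough to declare "validated":
# the guardrail metrics must also stay within their acceptable ranges.
print(p < 0.05)
```

A relative lift of at least 10% plus a p-value below 0.05 would satisfy the statistical half of the criteria; the guardrail check is a separate, non-statistical gate.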
#### Risks & Assumptions

**Key Assumptions**
- Users who complete a shorter onboarding will still discover team/integration features later
- The 3 essential steps are sufficient to demonstrate core product value
- In-app prompts can effectively drive deferred actions
- Onboarding completion is a leading indicator of retention (not just correlated)
**Risks**
- Feature discovery risk: Users might never set up teams/integrations if not prompted during onboarding
- Segment spillover: Results might not generalize to invited users or mobile signups
- Novelty effect: Initial lift might fade as users become accustomed to flow
- Selection bias: Users who would have completed 7-step flow might be different from marginal completers
#### Timeline
| Phase | Dates | Duration |
|---|---|---|
| Setup & instrumentation | Jan 15-17, 2026 | 3 days |
| Test running | Jan 18-31, 2026 | 14 days |
| Observation window | Feb 1-7, 2026 | 7 days |
| Analysis | Feb 8-10, 2026 | 3 days |
| Decision | Feb 11, 2026 | — |
## Real-World Examples

See this skill applied to three different product contexts:

### Storevine (B2B)

Storevine B2B ecommerce platform — Campaigns v1 first-campaign guided flow hypothesis
Prompt:
/hypothesis
Project: Campaigns — native email marketing for Storevine merchants
Stage: Post-discovery, pre-PRD finalization
Hypothesis I want to define:
- Non-adopter merchants (no active external email tool, <250 customers)
are ~38% of our active base [fictional] and represented 3 of 8 merchant
interview participants (P3, P6, and P8)
- Core belief: setup complexity is the barrier — not awareness or price
- Specific hypothesis: a guided first-campaign flow with product-seeded
templates will drive first-send rate from ~12% [fictional] to ≥30%
[fictional] within 60 days of GA
Prior work to reference:
- Merchant interview synthesis (Jan 12–28, 2026): P3, P6, and P8 described
email as "too overwhelming to start" or perennially "on the list"
- Competitive analysis (Feb 2026): Shopify Email's template-first + free
tier activation is their primary new-merchant onboarding lever
- Problem statement: email-related churn estimated at 4.8 pp [fictional]
of overall 22% [fictional] annual merchant churn rate
Need: full hypothesis document with success metrics, validation approach,
pass/fail criteria, and risks. Will attach to PRD as primary testable belief.
Output:
**Hypothesis: Pre-Populated Templates Drive First Campaign Sends for Non-Adopter Merchants**
### Brainshelf (Consumer)

Brainshelf consumer PKM app — Resurface morning email digest hypothesis
Prompt:
/hypothesis
trying to figure out if a morning digest email will actually get people to re-read
their saved stuff. context: brainshelf pkm app, 22k MAU [fictional]. users save
~47 items/month but only go back to read ~9% within 30 days [fictional]. classic
guilt pile problem from interviews.
want to run an A/B test on a morning email that surfaces 3-5 items from their
library based on what they've been reading lately. need a hypothesis doc to
align the team before we commit to building it.
primary metric: resurface item click rate. secondary: actual read completion.
guardrail: don't tank unsubscribe rate.
Output:
**Hypothesis: Morning Resurface Email Increases Re-Read Rate**
### Workbench (Enterprise)

Workbench enterprise collaboration platform — required-section enforcement hypothesis
Prompt:
/hypothesis
Product: Workbench Blueprints (enterprise doc templates with required sections and approval gates)
Stage: Define phase, post-discovery interviews and problem statement
Hypothesis: Requiring all Blueprint sections to be completed before an author can submit for approval will reduce median time to first approved Blueprint.
Context:
- 38% of Blueprints in closed beta reach approval with ≥1 empty required section [fictional]
- Median time to first approval: 4.0 days [fictional]
- Most rejections are for missing content, not quality [fictional]
- Approvers (dept heads, compliance leads) are the bottleneck -- they reject and wait, or approve with risk
- Target: reduce median approval time to ≤1 day [fictional] (aspirational)
- MDE for experiment: 1.0 day reduction (to ≤3.0 days) [fictional]
Target users: Project leads and document authors at enterprise Workbench accounts
Validation: A/B test in closed beta (80 accounts, ~300 Blueprints/week [fictional])
Primary metric: median time-to-first-approval (days)
Guardrails: author abandonment, author NPS
Stakeholders: Sandra C. (Head of Product), Karen L. (Eng Lead), Leo M. (Data Analyst)
Output:
**Hypothesis: Required Blueprint Sections Reduce Time-to-Approval**
## Quality Checklist
Before finalizing, verify:
- Hypothesis is falsifiable (possible to prove wrong)
- Success metric has a specific numeric target
- Target user segment is clearly defined
- Validation approach is practical and time-bound
- Pass/fail criteria are unambiguous
- Hypothesis doesn't assume the solution works
## Output Format
Use the template in references/TEMPLATE.md to structure the output.