Measure Experiment Design: Storevine Campaigns
Scenario
Four weeks before the Campaigns GA date, Storevine’s growth PM formalized the A/B experiment design that will test the primary product hypothesis: that the guided first-campaign flow with pre-populated product templates drives a materially higher first-send rate among non-adopter merchants than the standard blank-canvas campaign creation experience. The experiment runs for 60 days from GA and is the primary validation mechanism for the hypothesis defined during the Define phase.
Source Notes:
- Evan Miller, “How Not To Run An A/B Test” (evanmiller.org/how-not-to-run-an-ab-test.html): establishes the importance of pre-defined sample sizes and fixed test durations to avoid peeking bias and inflated false positive rates, the protocol followed in this experiment design.
- Ronald Fisher’s convention of p < 0.05 as the standard significance threshold for scientific experiments, established in “The Design of Experiments” (1935): the alpha level used in this experiment, consistent with industry practice for product A/B tests.
- Nielsen Norman Group research on first-use experiences (nngroup.com): establishes that guided first-session paths producing tangible value are the primary driver of tool retention for new users, providing the behavioral research basis for the hypothesis being tested.
Prompt
/experiment-design
Project: Campaigns (guided first-campaign flow)
Experiment: Does the guided first-campaign flow increase first-send rate for non-adopter merchants?
Hypothesis (from Define phase doc):
- We believe pre-populated templates for non-adopter merchants (<250 customers [fictional], no external email tool) will drive first-send rate from 12% [fictional] to ≥30% [fictional] within 60 days of GA
Variants:
- Control: Standard Campaigns creation flow (blank template editor, named segment library, no pre-population)
- Treatment: Guided first-campaign flow (product-seeded template, audience defaulted to "Customers who purchased in the last 90 days")
Sample: ~6,800 eligible non-adopter merchants [fictional]; need enough per variant to detect an 8 pp improvement with 80% power
Run period: April 28 - June 27, 2026 (60 days from GA)
Need: full experiment design with sample size calculation, success criteria, risks, and implementation notes.
Output
Experiment Design: Guided First-Campaign Flow for Non-Adopter Merchants
Overview
| Field | Value |
|---|---|
| Experiment Name | Campaigns: Guided First-Campaign Flow A/B Test |
| Owner | Growth PM, Storevine |
| Start Date | April 28, 2026 (Campaigns GA day) |
| End Date | June 27, 2026 (60 days) |
| Status | Ready |
Hypothesis
We believe providing non-adopter merchants with a guided first-campaign flow that pre-populates a campaign template with their top products and defaults the audience to recent purchasers
for Storevine merchants with no active external email tool and fewer than 250 customers [fictional], representing the non-adopter segment (~38% of active merchants [fictional])
will increase the rate at which these merchants complete and send their first Campaigns email campaign within 60 days of Campaigns GA
as measured by first campaign send completion rate (the percentage of eligible non-adopter merchants who send at least one campaign within 60 days of account activation)
Background
Merchant interviews (January 2026) identified three participants with no email program who cited setup complexity as the barrier, rather than price or awareness. Interview Insight 2 categorized this non-adopter segment as the highest-ROI activation target for Campaigns v1 because there is no migration cost, no existing workflow to replicate, and no seasonal disruption risk. The hypothesis was formalized in the Define phase with a target of 30% [fictional] first-send rate against a current baseline of 12% [fictional]. The experiment operationalizes that hypothesis: it directly tests whether removing the blank-canvas complexity with pre-populated content is the activation mechanism, or whether some other factor drives the difference.
Variants
Control (A)
Description: The standard Campaigns campaign creation flow that all merchants see: a blank template editor with a library of 8 generic pre-built templates, and the named segment library as the audience selection step. No content is pre-filled; the merchant builds their campaign from scratch.
Details:
- Entry point: Campaigns section of Storevine admin (same for both variants)
- Template: Merchant selects from generic library or starts blank
- Audience: Named segment library with no pre-selected segment
- No guided completion prompts or progress indicators
Screenshot/Mockup: Figma: Campaigns creation flow v4 / Control variant
Treatment (B)
Description: The guided first-campaign flow, shown only to merchants who have no prior sent campaigns (detected at entry). The flow pre-populates a campaign template with the merchant’s top 3 products by sales volume (last 90 days [fictional]) and defaults the audience to “Customers who purchased in the last 90 days.”
Details:
- Entry point: Campaigns section of Storevine admin (same for both variants)
- Template: Pre-populated with merchant’s top 3 products (name, image, price from product catalog)
- Audience: Defaulted to “Customers who purchased in the last 90 days” segment; merchant may change this selection
- Guided completion prompt: Subject line field is pre-focused; progress indicator shows 3 steps (Template → Subject Line → Send)
Screenshot/Mockup: Figma: Campaigns creation flow v4 / Treatment variant
Metrics
Primary Metric
| Metric | Definition | Current Baseline | Minimum Detectable Effect |
|---|---|---|---|
| First campaign send rate | % of eligible non-adopter merchants who complete and send at least one campaign within 60 days of account activation | 12% [fictional] | 8 percentage points (absolute) |
Secondary Metrics
| Metric | Definition | Purpose |
|---|---|---|
| Median time to first send | Median time (minutes) from first Campaigns flow entry to first campaign sent | Track whether treatment reduces the time cost, not just the completion rate |
| Second campaign send rate | % of non-adopter merchants who send a second campaign within 90 days | Assess whether treatment effect persists beyond the first send (retention signal) |
| Campaigns activation rate | % of eligible merchants who open the Campaigns flow at least once | Confirm both variants have comparable entry rates; if significantly different, investigate whether feature flag rollout created selection bias |
Guardrail Metrics
| Metric | Definition | Threshold |
|---|---|---|
| Email unsubscribe rate | % of recipients who unsubscribe from any campaign sent during the test period | Must not exceed 2.0%; above this level, inbox providers may begin filtering Storevine domain sends |
| Spam complaint rate | % of recipients who report a campaign as spam | Must not exceed 0.1%; above this level, SendGrid domain reputation degrades |
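The two guardrail thresholds can be expressed as a small check. This is a minimal sketch: the function name, argument names, and the choice to pass raw counts are assumptions for illustration and do not describe the Campaigns monitoring dashboard.

```python
# Guardrail limits taken from the table above, expressed as fractions.
GUARDRAIL_LIMITS = {
    "unsubscribe_rate": 0.020,     # must not exceed 2.0%
    "spam_complaint_rate": 0.001,  # must not exceed 0.1%
}


def breached_guardrails(recipients, unsubscribes, spam_complaints):
    """Return the names of guardrail metrics whose observed rate exceeds its limit.

    All three arguments are hypothetical aggregate counts for one variant.
    """
    if recipients <= 0:
        return []  # nothing sent yet, so nothing can be breached
    observed = {
        "unsubscribe_rate": unsubscribes / recipients,
        "spam_complaint_rate": spam_complaints / recipients,
    }
    return [name for name, limit in GUARDRAIL_LIMITS.items() if observed[name] > limit]


# Example with made-up counts: 2.5% unsubscribes breaches the first guardrail only.
print(breached_guardrails(recipients=10_000, unsubscribes=250, spam_complaints=5))
```

Running the check per variant, rather than on pooled sends, keeps a breach in one arm from being diluted by the other.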
Sample Size & Duration
Sample Size Calculation
| Parameter | Value |
|---|---|
| Baseline conversion rate | 12% [fictional] |
| Minimum detectable effect (MDE) | 8 percentage points absolute (12% → 20%) |
| Statistical significance (alpha) | 0.05 (two-tailed) |
| Statistical power (1-beta) | 0.80 |
| Users per variant | 1,180 merchants (rounded to 1,200 [fictional]) |
| Total users needed | 2,400 merchants [fictional] |
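As a cross-check on the table, the per-variant sample size can be approximated with the standard normal-approximation formula for comparing two proportions. The sketch below assumes a two-sided z-test with unpooled variances; the function name is illustrative, and the method used to produce the table's figure is not stated in this document.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(p_control, p_treatment, alpha=0.05, power=0.80):
    """Approximate per-variant n for a two-sided two-proportion z-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for power = 0.80
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    delta = p_treatment - p_control
    return ceil((z_alpha + z_beta) ** 2 * variance / delta ** 2)


# Baseline 12%, MDE 8 percentage points (12% -> 20%), alpha 0.05, power 0.80.
n_per_variant = sample_size_per_variant(0.12, 0.20)
print(n_per_variant)  # 326
```

Under this approximation the stated parameters require roughly 326 merchants per variant, well below the 1,200 [fictional] planned above. The larger planned figure may reflect a more conservative method or added margin; either way, the planned sample comfortably exceeds what this formula requires for an 8-point effect.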
Duration Estimate
| Parameter | Value |
|---|---|
| Daily eligible traffic | ~113 newly eligible non-adopter merchants/day [fictional] (6,800 total / 60 days) |
| Traffic allocation | 100% of eligible segment to experiment (no holdout) |
| Users per day in experiment | ~113 merchants/day [fictional] across both variants |
| Minimum duration | ~22 days to reach 2,400 merchants [fictional] |
| Recommended duration | 60 days; covers the full hypothesis measurement window and two distinct seasonal sub-periods (May promotions + early June) |
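The duration arithmetic in the table can be reproduced in a few lines. The sketch assumes, as the table does, that eligible merchants enter the experiment at an even daily rate across the 60-day window; the variable names are illustrative.

```python
from math import ceil

eligible_merchants = 6_800  # eligible non-adopter merchants [fictional]
run_days = 60               # April 28 - June 27, 2026
total_needed = 2_400        # both variants combined [fictional]

# Even-arrival assumption: 6,800 / 60 is about 113 merchants per day.
daily_eligible = round(eligible_merchants / run_days)

# Days until the combined sample is reached, rounded up to whole days.
min_days = ceil(total_needed / daily_eligible)

print(daily_eligible, min_days)  # 113 22
```

If merchants actually arrive unevenly, for example with a spike on GA day, the required sample would be reached earlier or later than this estimate.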
Audience Targeting
Inclusion Criteria
- Active Storevine merchant account (account status: active, not suspended)
- No prior sent campaign in Campaigns (first_send_completed event has never fired for this merchant_id)
- No active external email tool API integration (Klaviyo, Mailchimp, or other detected integrations are absent)
- Fewer than 250 customers in the Storevine account [fictional]
- At least one published product in the Storevine storefront (required for pre-population in treatment variant)
Exclusion Criteria
- Storevine internal employee accounts and QA test accounts
- Merchants who have accessed the Campaigns beta during the pre-GA beta period
- Merchants currently enrolled in any other active Campaigns experiment
Traffic Allocation
| Variant | Allocation |
|---|---|
| Control (A) | 50% |
| Treatment (B) | 50% |
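One common way to get a stable 50/50 split at the merchant_id level is to hash the identifier together with the flag name. The sketch below is an assumption about how assignment could work, not a description of Storevine's feature flag system; the function name and the use of SHA-256 are illustrative.

```python
import hashlib

FLAG_NAME = "campaigns_guided_first_campaign_flow"


def assign_variant(merchant_id: str, flag_name: str = FLAG_NAME) -> str:
    """Deterministically map a merchant to 'control' or 'treatment' (50/50)."""
    # Salting with the flag name keeps buckets independent across experiments.
    digest = hashlib.sha256(f"{flag_name}:{merchant_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # integer in 0..99
    return "treatment" if bucket < 50 else "control"


# The same merchant always receives the same variant.
print(assign_variant("merchant_123") == assign_variant("merchant_123"))  # True
```

Because the assignment depends only on the merchant ID and flag name, it needs no stored state and cannot drift between sessions.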
Success Criteria
Win (Ship Treatment)
Treatment variant’s first campaign send rate is ≥30% [fictional] at 60 days with p < 0.05 (two-tailed), and no guardrail metric degrades beyond its defined threshold. In this case, the guided first-campaign flow becomes the default experience for all new non-adopter merchants.
Loss (Keep Control)
Treatment variant’s first campaign send rate is lower than control or does not reach statistical significance, AND the point estimate is below 20% [fictional]. In this case, the guided flow design is revisited, specifically whether the pre-populated template content quality is driving abandonment rather than the blank-canvas problem.
Inconclusive (More Data Needed)
Treatment variant’s first campaign send rate is 20 - 29% [fictional] at 60 days but does not reach p < 0.05. In this case, run qualitative follow-up interviews with 10 non-completing treatment merchants to identify whether design, content, or audience-selection friction is blocking the final completion step before deciding whether to extend or iterate.
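The three outcomes above can be written as a decision function over the final counts. This is a sketch under stated assumptions: it uses a pooled two-proportion z-test for the p-value (the document does not name a test), requires the treatment rate to exceed control for a win, and returns "unclassified" for results the criteria do not explicitly cover, such as a significant result with a treatment rate between 20% and 29%. All function and argument names are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(x_c, n_c, x_t, n_t):
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (x_c + x_t) / (n_c + n_t)
    se = sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_t))
    if se == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (x_t / n_t - x_c / n_c) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def classify_outcome(x_c, n_c, x_t, n_t, guardrails_ok, alpha=0.05):
    """Map final send counts to win / loss / inconclusive / unclassified."""
    p_c, p_t = x_c / n_c, x_t / n_t
    significant = two_proportion_p_value(x_c, n_c, x_t, n_t) < alpha
    # Assumption: a win also requires treatment to beat control.
    if p_t >= 0.30 and significant and p_t > p_c and guardrails_ok:
        return "win"
    if (p_t < p_c or not significant) and p_t < 0.20:
        return "loss"
    if 0.20 <= p_t < 0.30 and not significant:
        return "inconclusive"
    return "unclassified"  # outside the three written criteria


# Made-up counts: 12% control vs 31% treatment with 1,200 merchants each.
print(classify_outcome(144, 1200, 372, 1200, guardrails_ok=True))  # win
```

Writing the rules this way makes the uncovered cases visible before the test ends, so the team can decide in advance how to treat them.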
Risks & Mitigations
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Template quality drives abandonment rather than reducing it (auto-populated products look low-quality or irrelevant) | Medium | High | Design QA of template rendering across five representative merchant account types before GA; include an explicit merchant preview step before final send in the treatment variant |
| GA-day novelty effect inflates early treatment adoption (merchants curious about a new product are more likely to send regardless of the flow) | Medium | Medium | Use a 60-day window rather than a 7- or 14-day snapshot; novelty effects typically normalize within 2 weeks |
| Activation rate differs materially between variants, indicating feature flag selection bias | Low | High | Monitor campaigns_activation_rate in both variants during the first week; flag Engineering Lead if >5% [fictional] difference is observed |
| Merchant quality differs between control and treatment due to non-random assignment | Low | High | Confirm randomization is at the merchant_id level and is deterministic (same merchant always sees the same variant) |
Monitoring Plan
- Check guardrail metrics (unsubscribe rate, spam complaint rate) daily during the first week of the test via the send failure rate alert in the Campaigns monitoring dashboard
- Do not check primary metric significance until 30 days have elapsed to avoid peeking bias
- Rollback criteria: if the spam complaint rate exceeds 0.1% [fictional] in either variant within the first 7 days, pause the experiment and review template content and audience defaulting logic before continuing
Implementation Notes
- Feature flag name: `campaigns_guided_first_campaign_flow`; variant assignment at merchant_id level; 50/50 split
- First-campaign detection query: `SELECT COUNT(*) FROM campaigns WHERE merchant_id = ? AND status = 'sent'`; treatment flow is shown if count = 0
- Instrumentation: `campaign_sent` event must include `experiment_variant` property (`control` or `treatment`) for all sends during the test window; `first_send_completed` event must also include `experiment_variant`
- `experiment_variant` property must be set at the time of flow entry, not at send time, to correctly attribute sends where the merchant returns later to complete their campaign
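The first-campaign detection query can be exercised against an in-memory SQLite database. The table schema, sample rows, and function name below are assumptions for illustration; only the query text comes from the notes above.

```python
import sqlite3


def is_first_campaign(conn, merchant_id):
    """Return True when the merchant has no campaign with status 'sent'."""
    row = conn.execute(
        "SELECT COUNT(*) FROM campaigns WHERE merchant_id = ? AND status = 'sent'",
        (merchant_id,),  # bound parameter, never string-formatted into the SQL
    ).fetchone()
    return row[0] == 0


# Hypothetical minimal schema and data for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE campaigns (id INTEGER PRIMARY KEY, merchant_id TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO campaigns (merchant_id, status) VALUES (?, ?)",
    [("m_1", "draft"), ("m_2", "sent")],
)

print(is_first_campaign(conn, "m_1"))  # True: only a draft exists
print(is_first_campaign(conn, "m_2"))  # False: a campaign was already sent
```

A merchant with drafts but no sent campaign still qualifies for the guided flow, which matches the inclusion criterion that first_send_completed has never fired.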
References
- Hypothesis document: “Pre-Populated Templates Drive First Campaign Sends for Non-Adopter Merchants” (Define phase, Feb 20, 2026)
- Figma: Campaigns creation flow v4, control and treatment variants (internal)
- Instrumentation spec: Campaigns Guided First-Campaign Flow Events (Measure phase)