Design & Analyze A/B Tests

Design rigorous A/B tests, run them correctly, and analyze results for clear decisions.

2-6 weeks per test
8 steps

Overview

A complete A/B testing playbook from hypothesis to decision. Covers test design, sample size, implementation, monitoring, statistical analysis, and decision-making. Following it helps keep your experiments valid and your results actionable.

Prerequisites

  • Clear hypothesis to test
  • A/B testing infrastructure in place
  • Sufficient traffic for statistical power
  • Metrics tracking set up

Steps

1

Form Your Hypothesis

1-2 hours

Create a clear, testable hypothesis with expected outcome.

Deliverables:

  • Hypothesis statement
  • Expected outcome and direction
  • Primary metric to move
  • Rationale for the change

Tips:

  • Use format: "If we [change], then [metric] will [direction] because [reason]"
  • Be specific about expected magnitude
  • One hypothesis per test
  • Base hypothesis on research or data, not just opinion
2

Design the Test

2-3 hours

Define control, treatment, metrics, and test parameters.

Deliverables:

  • Control and treatment definitions
  • Primary and secondary metrics
  • Guardrail metrics
  • Target population
  • Exclusion criteria

Tips:

  • Change one variable at a time
  • Define primary metric upfront (don't change mid-test)
  • Include guardrail metrics to catch negative effects
  • Document what exactly differs between variants
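One way to make the design concrete is to write it down as an immutable record before launch. The sketch below is only an illustration: the class name `ExperimentPlan`, its fields, and the example values are all hypothetical, chosen to mirror the deliverables listed above. Freezing the record means the primary metric cannot be quietly changed mid-test.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class ExperimentPlan:
    """Hypothetical record of a test design, written before launch."""
    name: str
    control: str                      # what the control group sees
    treatment: str                    # the single change being tested
    primary_metric: str               # fixed upfront
    secondary_metrics: Tuple[str, ...] = ()
    guardrail_metrics: Tuple[str, ...] = ()
    target_population: str = "all users"
    exclusions: Tuple[str, ...] = ()


# Example values are made up for illustration.
plan = ExperimentPlan(
    name="checkout_button_copy",
    control="Button reads 'Buy now'",
    treatment="Button reads 'Complete purchase'",
    primary_metric="checkout_conversion_rate",
    secondary_metrics=("average_order_value",),
    guardrail_metrics=("page_load_time_ms", "refund_rate"),
    target_population="logged-in web users",
    exclusions=("internal staff", "bot traffic"),
)
```

Attempting `plan.primary_metric = "something_else"` raises an error, which is the point: a change of primary metric should require a new, documented plan.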
3

Calculate Sample Size

1-2 hours

Determine how many users and how long to run the test.

Deliverables:

  • Required sample size
  • Expected test duration
  • Statistical power (typically 80%)
  • Significance level (typically α = 0.05, i.e. 95% confidence)
  • Minimum detectable effect

Tips:

  • Use a sample size calculator (Evan Miller, Optimizely)
  • Plan for at least 1 full week to capture weekly patterns
  • Don't stop early just because an interim look shows significance; repeated peeking inflates the false-positive rate
  • Account for your baseline conversion rate
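For a conversion-rate metric, the calculation behind most sample size calculators can be sketched as follows. This is a minimal sketch using the standard normal-approximation formula for comparing two proportions with a two-sided test and equal group sizes; the function names, the absolute (not relative) minimum detectable effect, and the example traffic figure are assumptions for illustration. Online calculators may use slightly different formulas and give slightly different numbers.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variant(baseline, mde_abs, alpha=0.05, power=0.80):
    """Users needed in EACH variant (two-sided test, equal split).

    baseline: current conversion rate, e.g. 0.10
    mde_abs:  minimum detectable effect as an absolute difference, e.g. 0.02
    """
    p1 = baseline
    p2 = baseline + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


def duration_days(n_per_variant, daily_eligible_users, variants=2):
    """Days needed to reach the sample, rounded up to whole weeks."""
    days = ceil(n_per_variant * variants / daily_eligible_users)
    return ceil(days / 7) * 7


n = sample_size_per_variant(baseline=0.10, mde_abs=0.02)
days = duration_days(n, daily_eligible_users=1000)  # traffic figure is made up
```

Rounding the duration up to whole weeks keeps every weekday equally represented in both variants. Note how sensitive the result is to the minimum detectable effect: halving it roughly quadruples the required sample.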
4

Implement the Test

1-3 days

Build variants and set up the experiment infrastructure.

Deliverables:

  • Variants implemented
  • Tracking verified
  • Randomization working
  • QA completed

Tips:

  • Verify tracking fires correctly for both variants
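  • Check that assignment is random and consistent: each user sees the same variant on every visit, and the split matches the planned ratio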
  • QA both variants thoroughly
  • Test on multiple devices and browsers
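If your testing infrastructure does not already handle assignment, a common approach is deterministic hashing of the user ID together with an experiment name. The sketch below assumes string user IDs and a two-variant test; `assign_variant` and its parameters are hypothetical names, not the API of any particular testing tool.

```python
import hashlib


def assign_variant(user_id: str, experiment: str, treatment_share: float = 0.5) -> str:
    """Deterministically map a user to 'control' or 'treatment'.

    Hashing experiment + user_id means the same user always gets the same
    variant within an experiment, while assignments stay independent
    across experiments.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Take the first 8 bytes as an integer and reduce to a bucket in 0..9999.
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return "treatment" if bucket < treatment_share * 10_000 else "control"


# Quick QA check: the same user is stable, and the split is near the target.
first = assign_variant("user-42", "checkout_button_copy")
second = assign_variant("user-42", "checkout_button_copy")
share = sum(
    assign_variant(f"user-{i}", "checkout_button_copy") == "treatment"
    for i in range(10_000)
) / 10_000
```

Because assignment depends only on the inputs, it can be recomputed during analysis to verify that logged exposures match the intended variant.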
5

Launch & Monitor

1-4 weeks

Start the test and monitor for issues.

Deliverables:

  • Test launched
  • Daily monitoring in place
  • Sample ratio check
  • No major issues detected

Tips:

  • Check for sample ratio mismatch: the observed split should match the planned allocation (close to 50/50 for an equal split)
  • Monitor guardrail metrics for red flags
  • Don't act on interim primary metric results before the planned sample size is reached
  • Have a plan to stop if something breaks
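"Close to 50/50" can be checked formally with a chi-square goodness-of-fit test on the user counts per variant. The sketch below uses only the standard library; for two groups the test has one degree of freedom, so the p-value can be computed with `math.erfc`. The function name and the example counts are hypothetical, and the alert threshold is a convention to agree on with your team (a strict value such as 0.001 is often used, since a small imbalance is expected by chance).

```python
from math import erfc, sqrt


def srm_p_value(n_control, n_treatment, expected_treatment_share=0.5):
    """P-value of a chi-square test that the split matches the plan.

    A very small p-value suggests a sample ratio mismatch, which usually
    points to a bug in assignment or tracking rather than a real effect.
    """
    total = n_control + n_treatment
    expected_treatment = total * expected_treatment_share
    expected_control = total - expected_treatment
    chi2 = (
        (n_control - expected_control) ** 2 / expected_control
        + (n_treatment - expected_treatment) ** 2 / expected_treatment
    )
    # Survival function of chi-square with 1 degree of freedom.
    return erfc(sqrt(chi2 / 2))


p_ok = srm_p_value(5000, 5050)   # small imbalance, plausibly chance
p_bad = srm_p_value(5000, 5300)  # larger imbalance, worth investigating
```

If the check fails, treat the test results as unreliable until the cause is found; analyzing a test with a broken split can give misleading effect estimates.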
6

Analyze Results

2-4 hours

Run the statistical analysis once the test reaches its planned sample size.

Deliverables:

  • Statistical significance assessment
  • Effect size and confidence interval
  • Segment analysis
  • Guardrail metric results

Tips:

  • Wait for full sample size before analyzing
  • Report confidence intervals, not just p-values
  • Check for novelty effects (early vs late results)
  • Segment results to understand who was affected
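For a binary conversion metric, the significance assessment and confidence interval can be sketched with a two-proportion z-test. This is a minimal illustration under a normal approximation, which assumes reasonably large samples; the function name `analyze` and the example counts are hypothetical. Continuous metrics such as revenue per user call for a different test.

```python
from math import sqrt
from statistics import NormalDist


def analyze(conv_c, n_c, conv_t, n_t, alpha=0.05):
    """Two-proportion z-test plus a confidence interval for the difference.

    conv_c / n_c: conversions and users in control
    conv_t / n_t: conversions and users in treatment
    """
    p_c = conv_c / n_c
    p_t = conv_t / n_t
    diff = p_t - p_c  # absolute effect size

    # Pooled standard error for the hypothesis test (assumes no difference).
    pooled = (conv_c + conv_t) / (n_c + n_t)
    se_pooled = sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_t))
    z = diff / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Unpooled standard error for the confidence interval.
    se = sqrt(p_c * (1 - p_c) / n_c + p_t * (1 - p_t) / n_t)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return {
        "effect": diff,
        "relative_lift": diff / p_c,
        "p_value": p_value,
        "ci": (diff - z_crit * se, diff + z_crit * se),
    }


# Example counts are made up for illustration.
result = analyze(conv_c=400, n_c=4000, conv_t=480, n_t=4000)
```

The same function can be run separately on early and late portions of the test to look for novelty effects, or per segment. Keep in mind that each extra segment is another comparison, so isolated significant segments deserve skepticism.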
7

Make a Decision

1-2 hours

Decide whether to ship, iterate, or abandon based on results.

Deliverables:

  • Ship / iterate / abandon decision
  • Rationale documented
  • Learning captured
  • Next steps defined

Tips:

  • Statistically significant ≠ practically significant
  • Consider effect size, not just significance
  • Check guardrail metrics before shipping
  • Document decision rationale for future reference
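The tips above can be turned into an explicit decision rule that is agreed on before the results come in. The sketch below is one possible rule, not a standard: the function `decide`, its thresholds, and the mapping to ship / iterate / abandon are assumptions to adapt to your own context. It compares the confidence interval of the effect against the smallest effect that would be worth shipping.

```python
def decide(ci_low, ci_high, min_practical_effect, guardrails_ok):
    """Return (decision, reason) from a confidence interval on the effect.

    ci_low, ci_high:       confidence interval for the absolute effect
    min_practical_effect:  smallest effect worth the cost of shipping
    guardrails_ok:         False if any guardrail metric regressed
    """
    if not guardrails_ok:
        return ("iterate", "a guardrail metric regressed; fix before shipping")
    if ci_low >= min_practical_effect:
        return ("ship", "the whole interval clears the practical threshold")
    if ci_high < min_practical_effect:
        # Covers results that are statistically significant but too small.
        return ("abandon", "even the optimistic estimate is below the threshold")
    return ("iterate", "the interval straddles the threshold; inconclusive")


# Example: significant (interval excludes 0) but practically too small.
decision, reason = decide(
    ci_low=0.001, ci_high=0.004, min_practical_effect=0.005, guardrails_ok=True
)
```

Recording the returned reason alongside the decision gives you the documented rationale for free.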
8

Document Learnings

1-2 hours

Archive results and share learnings with the team.

Deliverables:

  • Test documentation
  • Results summary
  • Learnings for future tests
  • Shared with team

Tips:

  • Document regardless of outcome (negative results are valuable)
  • Include what you'd do differently
  • Share learnings broadly
  • Build an organizational knowledge base
