A/B Testing Basics: Run Experiments You Can Trust
Design, run, and read A/B tests correctly so you stop shipping changes based on noise.
Data & Analytics · PDF · 12 pages · v1.0
Rating: 4.3

Most A/B tests are run wrong: stopped early on a lucky day, sized by guesswork, judged on a metric that does not matter, and declared a winner when the result is pure noise. This guide teaches you to run experiments that actually tell you something, without a statistics degree. It is written for marketers, product managers, founders, and analysts who run tests on websites, emails, pricing, or product features.

You will understand the concepts that matter in plain language: hypothesis, control vs. variant, sample size, statistical significance, confidence, and statistical power. You will learn the full lifecycle: write a falsifiable hypothesis, pick one primary metric and guardrail metrics, calculate the sample size and duration before you start (the step everyone skips), randomize correctly, avoid peeking, and interpret the result honestly, including what a non-significant result really means.

The guide is blunt about the traps that produce fake wins: stopping when you are ahead, running too many variants and metrics, sample ratio mismatch, and novelty effects. It also covers the difference between statistical and practical significance, and it includes a simple worked example with real numbers.

The outcome: the judgment to design a sound test, the discipline not to fool yourself, and the literacy to call out a bad test when you see one.
No statistics background is needed. The guide explains every concept in plain language and tells you which free sample-size and significance calculators to use. You will understand what the numbers mean, which is what matters, without doing the math by hand.
You can test anything with two versions and a measurable outcome: landing pages, email subject lines, pricing, checkout flows, button copy, product features. The principles are the same across all of them.
Calculating the sample size and duration before you start tells you whether your test can finish in a reasonable time, and it stops you from peeking and stopping early. Running a test without a planned sample size and duration is the single biggest source of false wins.
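To make the planning step concrete, here is a minimal sketch of the kind of calculation a free sample-size calculator performs. It is not taken from the guide. It assumes a two-sided two-proportion z-test with the common normal approximation, and the function names, the 5% baseline, the 20% relative lift, and the 1,000 visitors per day are all hypothetical values chosen for illustration.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline_rate, relative_lift, alpha=0.05, power=0.80):
    """Approximate visitors needed in EACH arm (control and variant).

    Assumes a two-sided two-proportion z-test (normal approximation).
    baseline_rate: current conversion rate, e.g. 0.05 for 5%.
    relative_lift: smallest lift worth detecting, e.g. 0.20 for +20%.
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_beta = NormalDist().inv_cdf(power)           # statistical power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2
    return math.ceil(n)


def duration_days(n_per_arm, daily_visitors, arms=2):
    """Days needed if all daily traffic is split evenly across the arms."""
    return math.ceil(n_per_arm * arms / daily_visitors)


# Hypothetical inputs: 5% baseline, +20% relative lift, 1,000 visitors/day.
n = sample_size_per_arm(baseline_rate=0.05, relative_lift=0.20)
days = duration_days(n, daily_visitors=1000)
print(f"Visitors per arm: {n}, planned duration: {days} days")
```

Smaller lifts push the required sample up quickly: halving the detectable lift roughly quadruples the visitors needed. That is why the calculation has to happen before the test starts rather than after.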
A non-significant result means you did not detect a difference, not that there is no difference. The guide explains the distinction and what to do next, which is often the most useful thing it teaches.
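The gap between "no difference detected" and "no difference" is easier to see with a confidence interval. The sketch below is an illustration, not the guide's own worked example. It assumes a two-proportion z-test with a normal-approximation interval, and the function name and the visitor and conversion counts are made up.

```python
import math
from statistics import NormalDist


def compare_conversion_rates(conv_a, n_a, conv_b, n_b, alpha=0.05):
    """Two-sided two-proportion z-test plus a confidence interval
    for the difference in conversion rate (variant minus control)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    diff = p_b - p_a

    # Pooled standard error for the significance test.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se_pool = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    p_value = 2 * (1 - NormalDist().cdf(abs(diff / se_pool)))

    # Unpooled standard error for the confidence interval.
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return {
        "diff": diff,
        "p_value": p_value,
        "ci": (diff - z_crit * se, diff + z_crit * se),
    }


# Hypothetical counts: control 100/2000 (5.0%), variant 115/2000 (5.75%).
result = compare_conversion_rates(100, 2000, 115, 2000)
low, high = result["ci"]
print(f"p-value: {result['p_value']:.3f}")
print(f"95% CI for the difference: {low:+.4f} to {high:+.4f}")
```

With these made-up counts the p-value is well above 0.05, yet the interval runs from a small loss to a gain of about two percentage points. The test was too small to tell those outcomes apart, which is different from showing the variant has no effect.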