A/B Test Significance Calculator
Enter the visitors and conversions for your control and variant. The calculator computes the p-value, statistical confidence, and whether your result is significant — so you know if it's safe to call a winner.
Formula
z = (p₂ − p₁) / √(p̄(1−p̄)(1/n₁ + 1/n₂))
p₁ and p₂ are the control and variant conversion rates. p̄ is the pooled conversion rate across both groups. n₁ and n₂ are the visitor counts. The z-score is converted to a confidence percentage using the standard normal CDF.
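As a rough illustration, the formula above can be sketched in Python. The function name `ab_significance` and its arguments are hypothetical, and the sketch assumes a two-sided test, which the formula itself does not specify; a one-sided test would give a different confidence figure.

```python
from math import sqrt
from statistics import NormalDist


def ab_significance(n1, c1, n2, c2):
    """Pooled two-proportion z-test.

    n1, c1: control visitors and conversions
    n2, c2: variant visitors and conversions
    Returns (z, p_value, confidence_percent).
    """
    p1 = c1 / n1                      # control conversion rate
    p2 = c2 / n2                      # variant conversion rate
    pooled = (c1 + c2) / (n1 + n2)    # pooled conversion rate across both groups
    # Standard error under the "no difference" assumption.
    # Note: this is zero (and the division fails) if there are no conversions at all.
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    # Assumption: two-sided p-value from the standard normal CDF.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    confidence = (1 - p_value) * 100
    return z, p_value, confidence


z, p_value, confidence = ab_significance(1000, 50, 1000, 50)
print(f"z = {z:.2f}, p = {p_value:.3f}, confidence = {confidence:.1f}%")
```

With identical conversion rates in both groups, as in the call above, the z-score is 0 and the confidence is 0%, which is a quick sanity check that the sketch behaves as expected.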
How to use the A/B Test Significance Calculator
1. Enter your control visitors
2. Enter your control conversions
3. Enter your variant visitors
4. Enter your variant conversions
5. Read your results instantly. Results update in real time as you type.
What statistical significance actually means
A 95% confidence result does not mean there's a 95% chance the variant is better. It means: if there were truly no difference between control and variant, you'd see a result this extreme or more only 5% of the time by random chance.
This is an important nuance. Significance describes how surprising your data would be if there were no real difference, not the probability that your variant is truly better. A result can be statistically significant but practically meaningless if the effect size is tiny.
P-value vs. confidence: what to report
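The p-value is the probability of observing results at least as extreme as yours, assuming no true difference. A p-value of 0.03 means that, if control and variant truly performed the same, a result this extreme would show up about 3% of the time. It does not mean there's a 3% chance the result is random noise, and it does not mean there's a 97% chance the variant wins.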
Most stakeholders find confidence percentages easier to communicate than p-values. '97% confidence' reads more intuitively than 'p = 0.03'. Both convey the same information — confidence = (1 − p-value) × 100.
Tips & Insights
Significance ≠ practical significance
A 0.1% lift at 99% confidence is statistically significant but probably not worth shipping. Always pair significance with effect size and business impact.
Don't cherry-pick metrics
If you test 10 independent metrics with no real effect, there is roughly a 40% chance that at least one appears significant at 95% confidence purely by chance. Pre-specify your primary metric before the test starts.
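The arithmetic behind this tip can be sketched as follows. The function name `chance_of_false_positive` is hypothetical, and the calculation assumes the metrics are independent and that none of them has a real effect; correlated metrics would give a different figure.

```python
def chance_of_false_positive(num_metrics, alpha=0.05):
    """Probability that at least one of num_metrics shows a false positive.

    Assumes independent metrics and no true effect on any of them.
    alpha is the per-metric significance threshold (0.05 = 95% confidence).
    """
    # Chance that every metric stays non-significant is (1 - alpha) ** num_metrics.
    return 1 - (1 - alpha) ** num_metrics


for k in (1, 5, 10, 20):
    print(f"{k} metrics: {chance_of_false_positive(k):.0%}")
```

For 10 metrics this works out to about 40%, which is why checking many metrics and reporting whichever one crosses the threshold is unreliable.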
Worked Examples
Checkout button color test
The variant converts at 3.84% versus 3.2% for control, a 20% relative lift at roughly 95% confidence. That is borderline significant, so it's worth running the test longer to confirm.
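To make the example concrete, here is a sketch with assumed inputs. The visitor and conversion counts below are hypothetical, chosen only because they reproduce the 3.2% and 3.84% rates, and the p-value is assumed to be two-sided.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical inputs (assumed for illustration): 6,250 visitors per group.
control_visitors, control_conversions = 6250, 200   # 200 / 6250 = 3.20%
variant_visitors, variant_conversions = 6250, 240   # 240 / 6250 = 3.84%

p1 = control_conversions / control_visitors
p2 = variant_conversions / variant_visitors
relative_lift = (p2 - p1) / p1

# Pooled rate and z-score, following the formula in the Formula section.
pooled = (control_conversions + variant_conversions) / (
    control_visitors + variant_visitors
)
se = sqrt(pooled * (1 - pooled) * (1 / control_visitors + 1 / variant_visitors))
z = (p2 - p1) / se

# Assumption: two-sided p-value.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
confidence = (1 - p_value) * 100

print(f"relative lift: {relative_lift:.0%}")
print(f"z = {z:.2f}, p = {p_value:.3f}, confidence = {confidence:.1f}%")
```

With these assumed counts the confidence lands just under 95%, which matches the borderline reading; different sample sizes at the same rates would give a different confidence.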
Frequently Asked Questions
What confidence level should I use to call a winner?
95% is the standard minimum for most business decisions. For high-stakes or irreversible changes (major redesigns, pricing changes), use 99%. For low-risk, easily-reversible changes, some teams accept 90%.
My variant is winning but confidence is only 80% — should I ship it?
That depends on risk tolerance. At 80% confidence, a result this extreme would still show up about 20% of the time even if there were no true difference. If rolling out the variant is cheap and reversible, and the potential upside is large, some teams accept this. If the change is costly to implement or hard to reverse, wait for higher confidence.
What if my sample sizes are very unequal?
The formula handles unequal sample sizes correctly. However, very unequal splits (e.g. 90/10) reduce statistical power: you'd need more total traffic to reach significance than you would with a balanced 50/50 split.
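A rough way to see the cost of an unequal split is to look at the (1/n₁ + 1/n₂) term in the formula. The sketch below uses a hypothetical helper, `traffic_multiplier`, and assumes the pooled conversion rate is the same in both scenarios, so it is an approximation rather than a full power calculation.

```python
def traffic_multiplier(variant_share):
    """Total traffic needed, relative to a 50/50 split, for the same standard error.

    variant_share: fraction of traffic sent to the variant (between 0 and 1).
    With total traffic N, 1/n1 + 1/n2 = 1 / (N * share * (1 - share)),
    which is smallest at share = 0.5.
    """
    return 0.25 / (variant_share * (1 - variant_share))


for share in (0.5, 0.3, 0.1):
    print(f"{share:.0%} to variant: {traffic_multiplier(share):.2f}x traffic")
```

Under these assumptions a 90/10 split needs roughly 2.8 times the total traffic of a 50/50 split to measure the difference with the same precision.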