Use this calculator to determine the minimum sample size required for each variation (control and experiment) in your A/B test to detect your chosen effect at the significance level and power you specify.
Your current conversion rate for the control group (e.g., 10 for 10%).
The smallest relative improvement you want to be able to detect (e.g., 20 for a 20% relative lift). If your baseline is 10%, a 20% MDE means you want to detect a new rate of 12%.
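The probability of a Type I error (false positive), entered as a percentage (e.g., 5 for 5%). This calculator accepts 1, 5, or 10.
The probability of correctly detecting an effect if one exists, entered as a percentage (e.g., 80 for 80%). This calculator accepts 80, 90, or 95.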
Understanding A/B Test Sample Size
A/B testing is a powerful method for comparing two versions of a webpage, app feature, or marketing campaign to determine which one performs better. However, to ensure your test results are reliable and actionable, it's crucial to run the test with an adequate sample size.
Why is Sample Size Important?
Running an A/B test with too small a sample size can lead to inconclusive or misleading results. You are more likely to miss a real improvement (Type II error), and any significant result you do observe tends to overstate the true effect. The false positive rate (Type I error) is set by your chosen alpha rather than by sample size, but stopping a test early as soon as it looks significant inflates it. A sufficiently large sample size increases the statistical power of your test, making it more likely to detect a true effect if one exists.
Key Inputs Explained:
Baseline Conversion Rate: This is the current performance metric of your control group (e.g., the existing version of your webpage). It's usually expressed as a percentage. For example, if 100 out of 1000 visitors convert, your baseline conversion rate is 10%.
Minimum Detectable Effect (MDE): This is the smallest relative improvement in your conversion rate that you consider practically significant and want to be able to detect. For instance, if your baseline is 10% and you set an MDE of 20%, you're looking to detect a new conversion rate of at least 12% (10% * 1.20). A smaller MDE will require a larger sample size.
Statistical Significance (Alpha): Also known as the p-value threshold, this represents the probability of making a Type I error – concluding there's a difference when there isn't one (a false positive). A common standard is 5% (or 0.05), meaning you're willing to accept a 5% chance of a false positive. Lower significance levels (e.g., 1%) require larger sample sizes.
Statistical Power (1 – Beta): This is the probability of correctly detecting an effect if one truly exists (avoiding a Type II error, or a false negative). A common standard is 80%, meaning you want an 80% chance of detecting your MDE if it's real. Higher power levels (e.g., 90% or 95%) require larger sample sizes.
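To make the relationship between these inputs concrete, here is a minimal sketch of the two-proportion formula that the calculator's script uses. The function name `sampleSizePerVariation` and its parameter names are illustrative, and the z-scores are hard-coded for a 5% two-sided significance level (1.96) and 80% power (0.84), so treat it as a worked example under those assumptions, not a general implementation.

```javascript
// Sketch: per-variation sample size for comparing two proportions.
// Assumes alpha = 5% (two-sided, z = 1.96) and power = 80% (z = 0.84).
function sampleSizePerVariation(baselinePct, relativeMdePct) {
  var zAlpha = 1.96;
  var zBeta = 0.84;
  var p1 = baselinePct / 100;               // baseline rate as a decimal
  var p2 = p1 * (1 + relativeMdePct / 100); // rate after the relative lift
  var variance = p1 * (1 - p1) + p2 * (1 - p2);
  var delta = p2 - p1;                      // absolute difference to detect
  return Math.ceil(Math.pow(zAlpha + zBeta, 2) * variance / (delta * delta));
}

console.log(sampleSizePerVariation(10, 20)); // 3834
console.log(sampleSizePerVariation(10, 10)); // 14732
```

Halving the MDE from 20% to 10% roughly quadruples the required sample, because the absolute difference appears squared in the denominator.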
How to Interpret the Results:
The calculator will provide the "Sample Size Per Variation." This number represents the minimum number of unique visitors or users you need to expose to *each* version (control and experiment) of your A/B test. So, if the calculator suggests 1,000, you'll need 1,000 visitors for your control group and 1,000 visitors for your experiment group, totaling 2,000 visitors for the entire test.
Important Considerations:
Test Duration: Once you have your required sample size, consider your typical daily traffic to estimate how long it will take to reach that sample size. Avoid ending tests prematurely.
Practical Significance: While statistical significance is important, always consider if the detected effect is also practically significant for your business goals.
Multiple Variations: If you're running an A/B/C/D test, the sample size calculation becomes more complex, often requiring larger overall samples.
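The test duration estimate mentioned above can be sketched as a simple division. The helper `estimateTestDurationDays` is hypothetical and not part of the calculator. It assumes every daily visitor enters the test and that traffic is split evenly across variations.

```javascript
// Sketch: days needed to reach the required sample size.
// Assumes all daily visitors are enrolled and split evenly between variations.
function estimateTestDurationDays(sampleSizePerVariation, variationCount, dailyVisitors) {
  if (!(dailyVisitors > 0)) {
    throw new RangeError("dailyVisitors must be greater than 0");
  }
  var totalSampleSize = sampleSizePerVariation * variationCount;
  return Math.ceil(totalSampleSize / dailyVisitors);
}

// 3,834 visitors per variation, 2 variations, 500 visitors per day
console.log(estimateTestDurationDays(3834, 2, 500)); // 16
```

In practice you may want to round the result up to whole weeks so that each day of the week is represented equally.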
.calculator-container {
font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
background-color: #f9f9f9;
padding: 20px;
border-radius: 8px;
box-shadow: 0 2px 10px rgba(0,0,0,0.1);
max-width: 700px;
margin: 20px auto;
color: #333;
}
.calculator-container h2 {
color: #0056b3;
text-align: center;
margin-bottom: 20px;
}
.calculator-container h3 {
color: #0056b3;
margin-top: 30px;
}
.calculator-container h4 {
color: #0056b3;
margin-top: 20px;
}
.form-group {
margin-bottom: 15px;
}
.form-group label {
display: block;
margin-bottom: 5px;
font-weight: bold;
color: #555;
}
.form-group input[type="number"] {
width: calc(100% - 22px);
padding: 10px;
border: 1px solid #ccc;
border-radius: 4px;
box-sizing: border-box;
font-size: 16px;
}
.form-group .description {
font-size: 0.85em;
color: #666;
margin-top: 5px;
}
button {
background-color: #007bff;
color: white;
padding: 12px 20px;
border: none;
border-radius: 4px;
cursor: pointer;
font-size: 18px;
width: 100%;
margin-top: 20px;
transition: background-color 0.3s ease;
}
button:hover {
background-color: #0056b3;
}
.calculator-result {
margin-top: 25px;
padding: 15px;
border: 1px solid #d4edda;
background-color: #e2f0e4;
border-radius: 4px;
font-size: 1.1em;
color: #155724;
text-align: center;
font-weight: bold;
}
.calculator-result.error {
border-color: #f5c6cb;
background-color: #f8d7da;
color: #721c24;
}
.calculator-article {
margin-top: 30px;
line-height: 1.6;
color: #444;
}
.calculator-article ul {
list-style-type: disc;
margin-left: 20px;
padding-left: 0;
}
.calculator-article li {
margin-bottom: 8px;
}
function calculateSampleSize() {
var baselineConversionRate = parseFloat(document.getElementById("baselineConversionRate").value);
var minimumDetectableEffect = parseFloat(document.getElementById("minimumDetectableEffect").value);
var statisticalSignificance = parseFloat(document.getElementById("statisticalSignificance").value);
var statisticalPower = parseFloat(document.getElementById("statisticalPower").value);
var resultDiv = document.getElementById("result");
resultDiv.innerHTML = "";
resultDiv.classList.remove("error");
// Input validation
if (isNaN(baselineConversionRate) || isNaN(minimumDetectableEffect) || isNaN(statisticalSignificance) || isNaN(statisticalPower)) {
resultDiv.innerHTML = "Please enter valid numbers for all fields.";
resultDiv.classList.add("error");
return;
}
if (baselineConversionRate <= 0 || baselineConversionRate >= 100) {
resultDiv.innerHTML = "Baseline Conversion Rate must be between 0.01% and 99.99%.";
resultDiv.classList.add("error");
return;
}
if (minimumDetectableEffect <= 0) {
resultDiv.innerHTML = "Minimum Detectable Effect must be greater than 0%.";
resultDiv.classList.add("error");
return;
}
if (statisticalSignificance <= 0 || statisticalSignificance >= 100) {
resultDiv.innerHTML = "Statistical Significance (Alpha) must be between 0.01% and 99.99%.";
resultDiv.classList.add("error");
return;
}
if (statisticalPower <= 0 || statisticalPower >= 100) {
resultDiv.innerHTML = "Statistical Power (1 – Beta) must be between 0.01% and 99.99%.";
resultDiv.classList.add("error");
return;
}
// Convert percentages to decimals
var p1 = baselineConversionRate / 100;
var mde = minimumDetectableEffect / 100;
var alpha = statisticalSignificance / 100;
var power = statisticalPower / 100;
// Calculate expected conversion rate (p2)
var p2 = p1 * (1 + mde);
if (p2 >= 1) { // Ensure p2 doesn't exceed 100%
resultDiv.innerHTML = "The expected conversion rate (Baseline × (1 + MDE)) reaches or exceeds 100%. Please adjust MDE or Baseline.";
resultDiv.classList.add("error");
return;
}
if (p2 <= 0) { // Ensure p2 is positive
resultDiv.innerHTML = "The expected conversion rate (Baseline × (1 + MDE)) is too low. Please adjust MDE or Baseline.";
resultDiv.classList.add("error");
return;
}
// Z-scores for common alpha and power values
// Z_alpha is for alpha/2 (two-tailed test)
var Z_alpha;
if (alpha === 0.05) Z_alpha = 1.96;
else if (alpha === 0.10) Z_alpha = 1.645;
else if (alpha === 0.01) Z_alpha = 2.576;
else {
resultDiv.innerHTML = "For precise results, please use common significance levels like 1%, 5%, or 10%.";
resultDiv.classList.add("error");
return;
}
// Z_beta is for power (one-tailed)
var Z_beta;
if (power === 0.80) Z_beta = 0.84;
else if (power === 0.90) Z_beta = 1.28;
else if (power === 0.95) Z_beta = 1.645;
else {
resultDiv.innerHTML = "For precise results, please use common power levels like 80%, 90%, or 95%.";
resultDiv.classList.add("error");
return;
}
// Sample size per group using the formula for comparing two proportions
// n = (Z_alpha/2 + Z_beta)^2 * (p1*(1-p1) + p2*(1-p2)) / (p2-p1)^2
var numerator = Math.pow(Z_alpha + Z_beta, 2) * (p1 * (1 - p1) + p2 * (1 - p2));
var denominator = Math.pow(p2 - p1, 2);
if (denominator === 0) {
resultDiv.innerHTML = "The Minimum Detectable Effect is too small or zero, resulting in no difference between baseline and expected conversion rates. Please increase MDE.";
resultDiv.classList.add("error");
return;
}
var sampleSizePerVariation = Math.ceil(numerator / denominator);
resultDiv.innerHTML = "Sample Size Per Variation: " + sampleSizePerVariation.toLocaleString() + " visitors";
resultDiv.innerHTML += "<br>Total Sample Size (Control + Experiment): " + (sampleSizePerVariation * 2).toLocaleString() + " visitors";
}
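The lookup tables in `calculateSampleSize` accept only three significance levels and three power levels. One possible way to support other values is to compute the z-scores numerically. The sketch below is not part of the original script. The helpers `normalCdf` and `zScore` are hypothetical names, the CDF uses the Abramowitz and Stegun approximation of the error function (formula 7.1.26), and the inverse is found by bisection. Its results differ slightly from the rounded table values (for example, 0.8416 instead of 0.84).

```javascript
// Standard normal CDF using the Abramowitz-Stegun erf approximation (7.1.26).
// Absolute error of the erf approximation is about 1.5e-7.
function normalCdf(z) {
  var x = Math.abs(z) / Math.SQRT2;
  var t = 1 / (1 + 0.3275911 * x);
  var poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741 +
             t * (-1.453152027 + t * 1.061405429))));
  var erf = 1 - poly * Math.exp(-x * x);
  return z >= 0 ? 0.5 * (1 + erf) : 0.5 * (1 - erf);
}

// Inverse CDF by bisection: finds z such that normalCdf(z) is close to probability.
function zScore(probability) {
  if (!(probability > 0 && probability < 1)) {
    throw new RangeError("probability must be strictly between 0 and 1");
  }
  var lo = -10;
  var hi = 10;
  for (var i = 0; i < 100; i++) {
    var mid = (lo + hi) / 2;
    if (normalCdf(mid) < probability) {
      lo = mid;
    } else {
      hi = mid;
    }
  }
  return (lo + hi) / 2;
}

// Two-sided alpha of 5% and power of 80%
var zAlpha = zScore(1 - 0.05 / 2); // approximately 1.96
var zBeta = zScore(0.80);          // approximately 0.84
console.log(zAlpha.toFixed(2), zBeta.toFixed(2));
```

With these helpers, `Z_alpha` could be set to `zScore(1 - alpha / 2)` and `Z_beta` to `zScore(power)` in place of the two lookup chains.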