Statistical Significance Calculator

Group A (Control)

Group B (Variant)

Results:

Conversion Rate A: –

Conversion Rate B: –

Difference in Conversion Rates: –

Z-score: –

P-value: –

Statistical Significance: –

function erf(x) {
  // Constants for the Abramowitz and Stegun approximation (formula 7.1.26)
  var p = 0.3275911;
  var a1 = 0.254829592;
  var a2 = -0.284496736;
  var a3 = 1.421413741;
  var a4 = -1.453152027;
  var a5 = 1.061405429;

  // Save the sign of x; the approximation is defined for x >= 0
  var sign = 1;
  if (x < 0) {
    sign = -1;
    x = -x;
  }

  // A&S formula 7.1.26
  var t = 1.0 / (1.0 + p * x);
  var y = 1.0 - (((((a5 * t + a4) * t) + a3) * t + a2) * t + a1) * t * Math.exp(-x * x);
  return sign * y;
}

function calculateSignificance() {
  var sampleSizeA = parseFloat(document.getElementById('sampleSizeA').value);
  var conversionsA = parseFloat(document.getElementById('conversionsA').value);
  var sampleSizeB = parseFloat(document.getElementById('sampleSizeB').value);
  var conversionsB = parseFloat(document.getElementById('conversionsB').value);

  var resultOutput = document.getElementById('resultOutput');
  var crA_display = document.getElementById('crA_display');
  var crB_display = document.getElementById('crB_display');
  var crDiff_display = document.getElementById('crDiff_display');
  var zScore_display = document.getElementById('zScore_display');
  var pValue_display = document.getElementById('pValue_display');
  var significance_display = document.getElementById('significance_display');

  // Input validation
  if (isNaN(sampleSizeA) || isNaN(conversionsA) || isNaN(sampleSizeB) || isNaN(conversionsB) ||
      sampleSizeA <= 0 || sampleSizeB <= 0 || conversionsA < 0 || conversionsB < 0) {
    resultOutput.innerHTML = 'Error: Please enter valid positive numbers for sample sizes and non-negative numbers for conversions.';
    return;
  }
  if (conversionsA > sampleSizeA || conversionsB > sampleSizeB) {
    resultOutput.innerHTML = 'Error: Conversions cannot exceed sample size.';
    return;
  }
  resultOutput.innerHTML = ''; // clear any previous error message

  // Conversion rates
  var crA = conversionsA / sampleSizeA;
  var crB = conversionsB / sampleSizeB;
  var crDiff = crB - crA;

  // Pooled proportion: the overall success rate across both groups,
  // assuming the null hypothesis is true
  var p_pooled = (conversionsA + conversionsB) / (sampleSizeA + sampleSizeB);

  // Standard error of the difference between the two sample proportions
  var SE = Math.sqrt(p_pooled * (1 - p_pooled) * (1 / sampleSizeA + 1 / sampleSizeB));

  // Z-score
  var Z = (crB - crA) / SE;

  // Two-tailed p-value via erf: p = 2 * (1 - Phi(|Z|)) = 1 - erf(|Z| / sqrt(2))
  var p_value = 1 - erf(Math.abs(Z) / Math.sqrt(2));

  // Determine significance
  var significanceText = '';
  var significanceColor = '#007bff'; // default color
  if (p_value < 0.001) {
    significanceText = 'Highly significant (p < 0.001)';
    significanceColor = '#28a745'; // green for strong significance
  } else if (p_value < 0.01) {
    significanceText = 'Very significant (p < 0.01)';
    significanceColor = '#28a745';
  } else if (p_value < 0.05) {
    significanceText = 'Statistically significant (p < 0.05)';
    significanceColor = '#28a745';
  } else if (p_value < 0.10) {
    significanceText = 'Marginally significant (p < 0.10)';
    significanceColor = '#ffc107'; // yellow for marginal
  } else {
    significanceText = 'Not statistically significant (p ≥ 0.10)';
    significanceColor = '#dc3545'; // red for not significant
  }

  // Display results, coloring the verdict with the significance color
  crA_display.innerHTML = 'Conversion Rate A: ' + (crA * 100).toFixed(2) + '%';
  crB_display.innerHTML = 'Conversion Rate B: ' + (crB * 100).toFixed(2) + '%';
  crDiff_display.innerHTML = 'Difference in Conversion Rates: ' + (crDiff * 100).toFixed(2) + '%';
  zScore_display.innerHTML = 'Z-score: ' + Z.toFixed(4);
  pValue_display.innerHTML = 'P-value: ' + p_value.toFixed(6);
  significance_display.innerHTML = 'Statistical Significance: ' +
    '<span style="color: ' + significanceColor + '">' + significanceText + '</span>';
}

Understanding Statistical Significance

Statistical significance is a fundamental concept in research, A/B testing, and data analysis. It helps us determine whether an observed difference between two groups or conditions is likely due to a real effect or simply due to random chance.

What Does It Mean?

When we say a result is "statistically significant," it means that the probability of observing such a difference (or an even more extreme one) if there were truly no difference between the groups is very low. In simpler terms, it suggests that the observed effect is probably not just a fluke.
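To make "probably not just a fluke" concrete, consider a quick simulation: generate two groups that share the same true conversion rate, and count how often chance alone produces a difference at least as large as one you observed. This is a minimal sketch, separate from the calculator above; the function name and parameters are ours, for illustration:

// Simulate the null hypothesis: both groups share one true conversion rate,
// so any difference between them is pure sampling noise.
function simulateNullDifferences(trueRate, sampleSize, observedDiff, runs) {
  var atLeastAsExtreme = 0;
  for (var i = 0; i < runs; i++) {
    var convA = 0, convB = 0;
    for (var j = 0; j < sampleSize; j++) {
      if (Math.random() < trueRate) convA++;
      if (Math.random() < trueRate) convB++;
    }
    if (Math.abs(convB - convA) / sampleSize >= observedDiff) atLeastAsExtreme++;
  }
  return atLeastAsExtreme / runs; // empirical two-sided "p-value"
}

// How often does chance alone produce a 0.2-point gap when both groups
// truly convert at 1.1%? (Roughly 17-18% of the time; see the example later.)
console.log(simulateNullDifferences(0.011, 10000, 0.002, 2000));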

Key Concepts:

  • Null Hypothesis (H0): This is the default assumption that there is no difference or no effect between the groups being compared. For example, in an A/B test, the null hypothesis would be that the conversion rate of Variant B is the same as Control A.
  • Alternative Hypothesis (H1): This is the hypothesis that there *is* a difference or an effect. For example, Variant B has a different conversion rate than Control A.
  • P-value: The p-value is the probability of obtaining test results at least as extreme as the observed results, assuming that the null hypothesis is true. A small p-value (typically less than 0.05) suggests that the observed data is inconsistent with the null hypothesis, leading us to reject the null hypothesis in favor of the alternative hypothesis.
  • Significance Level (Alpha, α): This is a threshold chosen by the researcher (commonly 0.05 or 5%). If the p-value is less than or equal to the significance level, the result is considered statistically significant (the short snippet after this list shows this decision in code).
  • Confidence Level: This is often expressed as 1 − α. For a 0.05 significance level, the confidence level is 95%. It describes the long-run reliability of the procedure: if the null hypothesis were true and you repeated the experiment many times, about 95% of those experiments would correctly avoid a false rejection.
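The p-value, α, and the decision rule boil down to a couple of lines of code. Here is a minimal sketch that reuses the erf() approximation from the calculator above (the helper name twoTailedPValue is ours):

// Two-tailed p-value from a Z-score, using the erf() defined earlier:
// p = 2 * (1 - Phi(|Z|)) = 1 - erf(|Z| / sqrt(2))
function twoTailedPValue(z) {
  return 1 - erf(Math.abs(z) / Math.SQRT2);
}

var alpha = 0.05;               // chosen significance level
var p = twoTailedPValue(1.96);  // ≈ 0.05: 1.96 is the classic two-tailed cutoff
var significant = p <= alpha;   // the entire decision rule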

How the Calculator Works (Z-test for Two Proportions):

This calculator uses a Z-test for two population proportions, which is suitable for comparing the conversion rates (or success rates) of two independent groups, as in an A/B test. Here's a simplified breakdown of the steps; a standalone sketch of the same math in code follows the list:

  1. Conversion Rates (CR): It first calculates the conversion rate for each group (Conversions / Sample Size).
  2. Pooled Proportion: It then calculates a pooled proportion, which is an overall success rate across both groups, assuming the null hypothesis is true.
  3. Standard Error: This measures the variability of the difference between the two sample proportions.
  4. Z-score: The Z-score quantifies how many standard errors the observed difference in conversion rates is away from zero (the expected difference under the null hypothesis). A larger absolute Z-score indicates a greater difference.
  5. P-value: Finally, the Z-score is used to calculate the p-value. This tells us the probability of seeing a difference as large or larger than what was observed, purely by chance, if there was no real difference between the groups.
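Putting the five steps together, here is a minimal standalone version of the math, stripped of the page's display logic (the function name zTestTwoProportions and the returned field names are ours):

// Z-test for two proportions, following the five steps above.
// Relies on the erf() approximation defined in the calculator code.
function zTestTwoProportions(sampleSizeA, conversionsA, sampleSizeB, conversionsB) {
  // 1. Conversion rates
  var crA = conversionsA / sampleSizeA;
  var crB = conversionsB / sampleSizeB;

  // 2. Pooled proportion under the null hypothesis
  var pPooled = (conversionsA + conversionsB) / (sampleSizeA + sampleSizeB);

  // 3. Standard error of the difference between the two proportions
  var se = Math.sqrt(pPooled * (1 - pPooled) * (1 / sampleSizeA + 1 / sampleSizeB));

  // 4. Z-score: the observed difference measured in standard errors
  var z = (crB - crA) / se;

  // 5. Two-tailed p-value
  var pValue = 1 - erf(Math.abs(z) / Math.SQRT2);

  return { crA: crA, crB: crB, diff: crB - crA, z: z, pValue: pValue };
}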

Interpreting the Results:

  • P-value < 0.05 (e.g., 0.03): The result is statistically significant at the 95% confidence level. If there were truly no difference between the groups, a difference this large would arise by chance less than 5% of the time. You would typically reject the null hypothesis and conclude that there is a real difference between the groups.
  • P-value < 0.01 (e.g., 0.008): The result is very significant at the 99% confidence level, providing even stronger evidence against the null hypothesis.
  • P-value ≥ 0.05 (e.g., 0.15): The result is not statistically significant at the 95% confidence level. A difference this large would arise by chance more than 5% of the time even if the groups were identical. You would fail to reject the null hypothesis, meaning you don't have enough evidence to conclude there is a real difference.

Example Scenario:

Imagine you are running an A/B test for a new website design. Group A (Control) sees the old design, and Group B (Variant) sees the new design. You want to see if the new design increases sign-ups.

  • Group A: 10,000 visitors, 100 sign-ups (1.00% conversion rate)
  • Group B: 10,000 visitors, 120 sign-ups (1.20% conversion rate)

Using the calculator with these inputs:

  • Conversion Rate A: 1.00%
  • Conversion Rate B: 1.20%
  • Difference: 0.20%
  • Z-score: Approximately 1.3559
  • P-value: Approximately 0.1751

In this example, since the p-value (0.1751) is greater than 0.05, the result is not statistically significant at the 95% confidence level. While Group B had a higher conversion rate, the calculator suggests that the observed difference could reasonably be due to random chance, and you don't have strong enough evidence to conclude that the new design is definitively better.

If, however, Group B had 130 sign-ups (1.30% conversion rate), the p-value would be approximately 0.047, which is less than 0.05. In that case, you would conclude that the new design is statistically significantly better.
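Both scenarios are easy to check with the zTestTwoProportions sketch from earlier:

// Original scenario: 100 vs. 120 sign-ups out of 10,000 visitors each
var r1 = zTestTwoProportions(10000, 100, 10000, 120);
console.log(r1.z.toFixed(4), r1.pValue.toFixed(4)); // ≈ 1.3559, 0.1751 (not significant)

// Modified scenario: 100 vs. 130 sign-ups
var r2 = zTestTwoProportions(10000, 100, 10000, 130);
console.log(r2.z.toFixed(4), r2.pValue.toFixed(4)); // ≈ 1.9896, 0.0466 (significant at 0.05)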

Important Considerations:

  • Practical vs. Statistical Significance: A result can be statistically significant but not practically significant. A tiny difference in conversion rates might be statistically significant with a very large sample size, but might not be meaningful enough to warrant a change.
  • Sample Size: Larger sample sizes generally lead to more precise estimates and a higher chance of detecting a true difference if one exists. Ensure your sample size is adequate for your desired effect size.
  • One-tailed vs. Two-tailed Tests: This calculator performs a two-tailed test, meaning it looks for a difference in either direction (Group B is better OR worse than Group A). If you only care whether Group B is *better* (and not worse), a one-tailed test would be used; for this Z-test it yields half the two-tailed p-value, provided the observed difference points in the hypothesized direction (see the sketch below).
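For completeness, here is a minimal one-tailed sketch built on the same erf() approximation (the function name oneTailedPValueGreater is ours):

// One-tailed p-value for the hypothesis "B converts better than A":
// P(Z >= z) under the null, i.e. 1 - Phi(z). When z > 0 (the difference
// points in the hypothesized direction) this is half the two-tailed p-value.
function oneTailedPValueGreater(z) {
  var phi = 0.5 * (1 + erf(z / Math.SQRT2)); // standard normal CDF via erf
  return 1 - phi;
}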
