Pearson Correlation Coefficient Calculator

Enter the sample size and summary statistics (sums of x, y, xy, x², and y²) to compute Pearson's correlation coefficient r, which measures the strength and direction of the linear relationship between two continuous variables.

Formula

r = (nΣxy − ΣxΣy) / √[(nΣx² − (Σx)²)(nΣy² − (Σy)²)]

This is the computational formula for Pearson's r. It uses the cross-product sum (Σxy), individual variable sums (Σx, Σy), sums of squares (Σx², Σy²), and sample size n. Multiply n by Σxy, then subtract the product of Σx and Σy. Divide by the square root of the product of the two spread terms, nΣx² − (Σx)² for x and nΣy² − (Σy)² for y. The result is always between −1 and +1, and it is undefined when either variable is constant, because the denominator is then zero.
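The formula can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code; the function name `pearson_r_from_sums` and its parameter names are assumptions made for this example.

```python
import math


def pearson_r_from_sums(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Pearson's r from n, Σx, Σy, Σxy, Σx², Σy² (hypothetical helper)."""
    # Numerator: nΣxy − ΣxΣy
    numerator = n * sum_xy - sum_x * sum_y
    # Spread terms: nΣx² − (Σx)² and nΣy² − (Σy)²
    spread_x = n * sum_x2 - sum_x ** 2
    spread_y = n * sum_y2 - sum_y ** 2
    if spread_x <= 0 or spread_y <= 0:
        # A constant variable makes the denominator zero, so r is undefined.
        raise ValueError("r is undefined when x or y has no variation")
    return numerator / math.sqrt(spread_x * spread_y)


print(pearson_r_from_sums(5, 15, 25, 83, 55, 135))  # 0.8
```

With n=5, Σx=15, Σy=25, Σxy=83, Σx²=55, Σy²=135 the numerator is 40 and the denominator is √(50 × 50) = 50, so r = 0.8.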

How to use the Pearson Correlation Coefficient Calculator

  1. Enter your sample size (n)

  2. Enter your Σx (sum of x)

  3. Enter your Σy (sum of y)

  4. Enter your Σxy (sum of x×y)

  5. Enter your Σx² (sum of x²)

  6. Enter your Σy² (sum of y²)

  7. Read your results. They update in real time as you type.

Interpreting Pearson's r

Pearson's r ranges from −1 to +1. An r of +1 indicates a perfect positive linear relationship — as x increases, y increases proportionally. An r of −1 indicates a perfect negative linear relationship — as x increases, y decreases proportionally. An r of 0 indicates no linear relationship.

Conventional benchmarks: |r| < 0.3 is weak, 0.3–0.7 is moderate, |r| > 0.7 is strong. But these are rough guidelines — a weak r may still be practically meaningful in large datasets or certain fields.
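As a rough sketch, the benchmarks above can be written as a small lookup. The function name `describe_strength` is an assumption for this example, and placing the boundary value 0.7 in the "moderate" band is a choice made here, since the text gives the bands only approximately.

```python
def describe_strength(r):
    """Label |r| using the rough benchmarks; a guideline, not a universal rule."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)  # strength ignores the sign; the sign gives the direction
    if size < 0.3:
        return "weak"
    if size <= 0.7:
        return "moderate"
    return "strong"


print(describe_strength(0.8))   # strong
print(describe_strength(-0.5))  # moderate
```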

How to compute the summary statistics

To use this calculator, you need five summary statistics from your data. For n paired (x, y) observations: Σx is the sum of all x values; Σy is the sum of all y values; Σxy is the sum of each x multiplied by its paired y; Σx² is the sum of each x squared; Σy² is the sum of each y squared.

For example, with pairs (1,3), (2,4), (3,6), (4,7), (5,5): n=5, Σx=15, Σy=25, Σxy=1×3+2×4+3×6+4×7+5×5=82, Σx²=1+4+9+16+25=55, Σy²=9+16+36+49+25=135. The default values in this calculator match these sums, except that the default Σxy is 83.
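If your data is in raw paired form, the five sums can be computed in one pass. The sketch below assumes the data is a list of (x, y) tuples; the helper name `summary_sums` is hypothetical.

```python
def summary_sums(pairs):
    """Return (n, Σx, Σy, Σxy, Σx², Σy²) for a list of (x, y) pairs."""
    n = len(pairs)
    sum_x = sum(x for x, _ in pairs)
    sum_y = sum(y for _, y in pairs)
    sum_xy = sum(x * y for x, y in pairs)   # each x times its paired y
    sum_x2 = sum(x * x for x, _ in pairs)   # each x squared
    sum_y2 = sum(y * y for _, y in pairs)   # each y squared
    return n, sum_x, sum_y, sum_xy, sum_x2, sum_y2


pairs = [(1, 3), (2, 4), (3, 6), (4, 7), (5, 5)]
print(summary_sums(pairs))  # (5, 15, 25, 82, 55, 135)
```

The six returned values are entered into the calculator in the same order as its input fields.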

Correlation vs. causation

A strong correlation between x and y does not mean x causes y. Both may be driven by a third variable (confounding), the relationship may be coincidental, or causality may run in the opposite direction. Always think critically about the mechanism before inferring causality from correlation.

The correlation coefficient only measures linear relationships. Two variables with a perfect U-shaped relationship (like y = x² with x values spread symmetrically around 0) can have r equal or close to 0 even though they are strongly related. Always visualize your data in a scatter plot before relying on r alone.
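A short sketch makes the U-shape point concrete. The data here is made up for illustration, and `pearson_r` is a hypothetical helper that works from raw values rather than from the summary sums.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from raw paired values, using deviations from the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# y = x² with x symmetric around 0: the two arms of the U cancel out.
xs_symmetric = [-3, -2, -1, 0, 1, 2, 3]
r_symmetric = pearson_r(xs_symmetric, [x ** 2 for x in xs_symmetric])

# The same curve seen only on one side of 0 looks close to linear.
xs_one_sided = [0, 1, 2, 3, 4, 5, 6]
r_one_sided = pearson_r(xs_one_sided, [x ** 2 for x in xs_one_sided])

print(r_symmetric)             # 0.0
print(round(r_one_sided, 2))   # 0.96
```

The relationship is exactly y = x² in both cases; only the range of x differs, which is why a scatter plot is more informative than r alone.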

Tips & Insights

r² tells you explained variance

The square of Pearson's r (called R², or the coefficient of determination) is the proportion of variance in y explained by a straight-line fit on x. If r=0.8, then r²=0.64, so x explains 64% of the variation in y.
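One way to see this is to fit a least-squares line and compare r² with 1 − (residual sum of squares / total sum of squares). The sketch below uses a small made-up dataset; the variable names are assumptions for this example.

```python
xs = [1, 2, 3, 4, 5]
ys = [3, 4, 6, 7, 5]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)          # total variation in y
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

# Least-squares line y = intercept + slope * x
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Variation in y left unexplained by the line
ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

r_squared_from_r = sxy ** 2 / (sxx * syy)   # square of Pearson's r
r_squared_from_fit = 1 - ss_res / syy       # share of variance explained

print(round(r_squared_from_r, 4), round(r_squared_from_fit, 4))  # 0.49 0.49
```

Both routes give 0.49 for this data (r = 0.7), so the line accounts for 49% of the variation in y.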

r is sensitive to outliers

A single extreme outlier can dramatically change r. If your data contains outliers, consider Spearman's rank correlation (ρ), which is more robust to outliers and captures monotonic (not just linear) relationships.
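The effect of one outlier can be sketched as follows. The data is made up for illustration: five points on a perfectly decreasing line plus one extreme point. The helpers `pearson_r`, `average_ranks`, and `spearman_rho` are hypothetical names, and Spearman's ρ is computed here as Pearson's r on the ranks, with tied values given their average rank.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from raw paired values."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rho as Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


xs = [1, 2, 3, 4, 5, 20]
ys = [5, 4, 3, 2, 1, 20]   # last pair is the outlier

r_without = pearson_r(xs[:-1], ys[:-1])   # -1.0
r_with = pearson_r(xs, ys)                # about 0.92
rho_with = spearman_rho(xs, ys)           # about -0.14
print(round(r_without, 2), round(r_with, 2), round(rho_with, 2))
```

The single outlier flips Pearson's r from −1 to about +0.92. Spearman's ρ also moves, but it stays negative, reflecting the direction of the other five points.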

Test significance before interpreting r

A small r can be statistically significant with a large n. Conversely, a large r may not be significant with a small n. Always test whether r differs significantly from zero using a t-test with n−2 degrees of freedom.
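The test statistic is t = r√(n−2) / √(1−r²), compared against a t distribution with n−2 degrees of freedom. The sketch below computes t and compares it with two-tailed 5% critical values taken from a standard t table (3.182 for 3 degrees of freedom, roughly 1.97 for about 200). The function name `t_statistic` and the example inputs are assumptions for illustration.

```python
import math


def t_statistic(r, n):
    """t statistic for testing whether r differs from zero (df = n - 2)."""
    if n < 3:
        raise ValueError("need at least 3 pairs for n - 2 degrees of freedom")
    if abs(r) >= 1:
        raise ValueError("t is undefined for |r| = 1")
    return r * math.sqrt((n - 2) / (1 - r * r))


# Large r, small n: r = 0.8 with n = 5 (df = 3)
t_small_n = t_statistic(0.8, 5)
significant_small_n = abs(t_small_n) > 3.182   # table value, df = 3

# Small r, large n: r = 0.2 with n = 200 (df = 198)
t_large_n = t_statistic(0.2, 200)
significant_large_n = abs(t_large_n) > 1.97    # approximate table value

print(round(t_small_n, 2), significant_small_n)  # 2.31 False
print(round(t_large_n, 2), significant_large_n)  # 2.87 True
```

Here r = 0.8 from five pairs does not reach significance at the 5% level, while r = 0.2 from 200 pairs does, which matches the point made above.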

Worked Examples

Study hours vs. exam scores

n: 5, Σx: 15, Σy: 25, Σxy: 83, Σx²: 55, Σy²: 135

r = 0.8. Strong positive correlation: students who study more tend to score higher, with about 64% of the variation in scores explained by study hours.

Temperature vs. ice cream sales

n: 6, Σx: 126, Σy: 312, Σxy: 6890, Σx²: 2800, Σy²: 17376

r ≈ 0.802. Strong positive correlation: hotter days correlate with higher ice cream sales. Note this is correlation, not direct causation.

Frequently Asked Questions

What does Pearson's r measure?

Pearson's r measures the strength and direction of the linear relationship between two continuous variables. It ranges from −1 (perfect negative) to +1 (perfect positive), with 0 indicating no linear relationship.

What is a strong correlation coefficient?

Common benchmarks: |r| < 0.3 is weak, 0.3–0.7 is moderate, |r| > 0.7 is strong. What counts as strong depends on the field: in tightly controlled physical measurements, r above 0.99 is common, while in the social sciences r > 0.5 is often considered strong.

Does correlation imply causation?

No. Correlation measures association, not causation. Two variables can be strongly correlated because of a common cause, coincidence, or reverse causality. Establishing causation requires careful experimental design.

What is the difference between Pearson and Spearman correlation?

Pearson's r measures linear relationships, and its standard significance test assumes the data are approximately bivariate normal. Spearman's ρ ranks the data first and measures monotonic relationships. It is non-parametric and more robust to outliers and non-normality.

How do I calculate Σxy from raw data?

For each paired observation (xᵢ, yᵢ), multiply xᵢ by yᵢ, then sum all products. For example, for pairs (2,4) and (3,6): Σxy = (2×4) + (3×6) = 8 + 18 = 26.
