In the world of data and analytics, understanding how two variables move together is fundamental.
For example:
- Do higher temperatures increase ice cream sales?
- Does more advertising lead to higher revenue?
- Do fertilizer inputs improve crop yield?
These relationships are captured by a powerful statistical concept called Correlation.
What is Correlation?
Correlation measures the strength and direction of a linear relationship between two variables.
In simple terms:
Correlation tells us how changes in one variable are associated with changes in another.
For instance:
- As temperature rises, ice cream sales also rise: positive correlation.
- As fuel price increases, car usage decreases: negative correlation.
- The number of pens a person owns and their height: no correlation.
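The Correlation Coefficient (r)
The degree of correlation is measured using Pearson's correlation coefficient, denoted by r:

$$ r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}} $$
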
Where:
- X and Y = variables
- n = number of observations
Interpretation of r
| Value of r | Relationship | Strength |
|---|---|---|
| +1 | Perfect positive | Very strong |
| 0.7 to 0.9 | Strong positive | Strong |
| 0.3 to 0.7 | Moderate positive | Moderate |
| 0 | No correlation | None |
| -0.3 to -0.7 | Moderate negative | Moderate |
| -0.7 to -0.9 | Strong negative | Strong |
| -1 | Perfect negative | Very strong |
Example 1: Positive Correlation
Let's look at a simple dataset:
| Hours Studied (X) | Marks Scored (Y) |
|---|---|
| 2 | 40 |
| 4 | 50 |
| 6 | 60 |
| 8 | 70 |
| 10 | 80 |
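We can calculate Pearson's r using the computational (raw-sums) formula.
Step 1: Compute intermediate values
With n = 5: ΣX = 30, ΣY = 300, ΣXY = 2000, ΣX² = 220, ΣY² = 19000.

Step 2: Apply the formula

$$ r = \frac{5(2000) - (30)(300)}{\sqrt{\left[5(220) - 30^2\right]\left[5(19000) - 300^2\right]}} = \frac{1000}{\sqrt{(200)(5000)}} = \frac{1000}{1000} = 1 $$

Result: r = +1, indicating a perfect positive correlation.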
As study hours increase, marks also increase in a perfectly linear way.
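The calculation above can be sketched in Python. This is a minimal illustration using only the standard library; the function name `pearson_r` is a hypothetical helper, not a library call, and it uses the raw-sums form of the formula.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r via the raw-sums formula (hypothetical helper)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        # r is undefined when either variable never changes
        raise ValueError("r is undefined when a variable is constant")
    return numerator / denominator

# Data from Example 1
hours = [2, 4, 6, 8, 10]
marks = [40, 50, 60, 70, 80]
print(round(pearson_r(hours, marks), 4))  # 1.0
```

The same function can be applied to the datasets in the later examples to check their coefficients.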
Example 2: Negative Correlation
| Temperature (°C) | Hot Chocolate Sales |
|---|---|
| 10 | 90 |
| 15 | 80 |
| 20 | 60 |
| 25 | 40 |
| 30 | 30 |
If you compute r, you'll find r ≈ -0.99.
This is a strong negative correlation: as temperature rises, sales fall.
Example 3: No Correlation
| Shoe Size | Intelligence Score |
|---|---|
| 5 | 110 |
| 6 | 120 |
| 7 | 115 |
| 8 | 118 |
| 9 | 116 |
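For these five observations, r works out to about 0.42 rather than 0: with so few data points, chance alone can produce a sizeable coefficient. In a large sample we would expect r to be close to 0, implying no relationship.
The size of shoes does not determine intelligence!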
Scatter Diagram (Graphical Representation)
A scatter plot is the easiest way to visualize correlation.
- Positive correlation: Points rise from bottom left to top right.
- Negative correlation: Points fall from top left to bottom right.
- No correlation: Points are scattered randomly.

Types of Correlation
| Type | Description | Example |
|---|---|---|
| Positive Correlation | Both variables move in the same direction | Height & Weight |
| Negative Correlation | One variable increases, the other decreases | Price & Demand |
| Zero Correlation | No relationship | Shoe size & IQ |
| Linear Correlation | Data forms a straight-line relationship | Study time & Marks |
| Non-linear Correlation | Relationship curves (not straight) | Stress & Productivity |
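The linear versus non-linear distinction matters because Pearson's r only detects straight-line relationships. The sketch below uses made-up data (y = x² on a symmetric range, an assumption chosen purely for illustration) where y is completely determined by x, yet r comes out as zero. The helper `pearson_r` is hypothetical and uses the deviation-from-mean form.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r via deviations from the mean (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Illustrative data only: a perfect U-shaped (non-linear) relationship
x = [-2, -1, 0, 1, 2]
y_curve = [v ** 2 for v in x]      # 4, 1, 0, 1, 4
y_line = [3 * v + 1 for v in x]    # a perfect straight line for contrast

r_curve = pearson_r(x, y_curve)
r_line = pearson_r(x, y_line)
print(r_curve)  # zero: the linear measure misses the curve entirely
print(r_line)   # 1 (up to floating-point rounding)
```

A value of r near 0 therefore means "no linear relationship", not necessarily "no relationship".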
Other Correlation Measures
| Measure | When Used | Notes |
|---|---|---|
| Pearson's r | Both variables are continuous & normally distributed | Most common |
| Spearman's rank (ρ) | Data is ordinal or not normally distributed | Based on ranks |
| Kendall's tau (τ) | Small samples or tied ranks | Non-parametric |
Example 4: Spearman's Rank Correlation (ρ)
| Student | Math Rank | Science Rank |
|---|---|---|
| A | 1 | 2 |
| B | 2 | 1 |
| C | 3 | 3 |
| D | 4 | 4 |
| E | 5 | 5 |
Step 1: Compute difference in ranks (d)
| Student | Math Rank | Science Rank | d | d² |
|---|---|---|---|---|
| A | 1 | 2 | -1 | 1 |
| B | 2 | 1 | 1 | 1 |
| C | 3 | 3 | 0 | 0 |
| D | 4 | 4 | 0 | 0 |
| E | 5 | 5 | 0 | 0 |

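Step 2: Apply the formula, with Σd² = 2 and n = 5:

$$ \rho = 1 - \frac{6\sum d^2}{n(n^2 - 1)} = 1 - \frac{6(2)}{5(5^2 - 1)} = 1 - \frac{12}{120} = 0.9 $$

Result: Strong positive correlation (ρ = 0.9)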
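The rank-difference calculation in this example can be sketched as a few lines of Python. The function name `spearman_rho` is a hypothetical helper, and the shortcut formula it uses assumes there are no tied ranks.

```python
def spearman_rho(rank_x, rank_y):
    """Spearman's rho from two rank lists, assuming no tied ranks."""
    if len(rank_x) != len(rank_y) or len(rank_x) < 2:
        raise ValueError("need two equal-length rank lists with at least 2 items")
    n = len(rank_x)
    # Sum of squared rank differences (the d^2 column in the table)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))

# Ranks from Example 4 (students A to E)
math_rank = [1, 2, 3, 4, 5]
science_rank = [2, 1, 3, 4, 5]
print(round(spearman_rho(math_rank, science_rank), 2))  # 0.9
```

When ranks are tied, this shortcut is no longer exact; a common alternative is to compute Pearson's r on the (averaged) ranks instead.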
Key Points to Remember
- Correlation does not imply causation. (E.g., ice cream sales and drowning incidents are correlated because both rise in hot weather, not because one causes the other.)
- Correlation measures association, not influence.
- Outliers can significantly distort the correlation coefficient.
- Always visualize with a scatter plot before interpreting results.
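To see how much a single outlier can matter, the sketch below takes the hours-studied data from Example 1 and appends one invented observation (a sixth student who studied 12 hours but scored 10). That extra point is purely hypothetical, as is the helper `pearson_r`.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r via deviations from the mean (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Example 1 data: perfectly linear
hours = [2, 4, 6, 8, 10]
marks = [40, 50, 60, 70, 80]
r_clean = pearson_r(hours, marks)

# Same data plus one invented outlier (12 hours, 10 marks)
r_outlier = pearson_r(hours + [12], marks + [10])

print(round(r_clean, 2))    # 1.0
print(round(r_outlier, 2))  # about -0.11
```

One point is enough to move the coefficient from a perfect positive correlation to a weak negative one, which is why plotting the data first is worthwhile.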
Real-World Applications
- Business: Sales vs. marketing spend
- Agriculture: Rainfall vs. crop yield
- Economics: GDP vs. employment rate
- Health: Exercise vs. body mass index (BMI)
- Education: Study time vs. exam performance
Further Reading
- Field, A. (2022). Discovering Statistics Using SPSS. Sage Publications.
- Gujarati, D. N. (2020). Basic Econometrics. McGraw Hill.
- Jim Frost, Statistics by Jim: Correlation Explained Simply
- NIST e-Handbook: https://www.itl.nist.gov/div898/handbook/eda/section3/eda35.htm








