Ceteris Lab Interactive Econometrics

Lesson 6

Covariance and correlation

Big question

How do we describe whether two variables move together?

Learning objectives

  • Explain covariance and correlation in plain language.
  • Use covariance correctly in an interpretation.
  • Connect the lesson idea to a formula, graph, Python result, or real example.

Simple explanation

Covariance and correlation summarize co-movement. Correlation is easier to read because it is scaled between -1 and 1. A positive correlation means two variables tend to move in the same direction.
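To see why the scaling makes correlation easier to read, the sketch below changes the units of one variable and compares both measures. It uses the wage and education numbers from this lesson's pandas example; treating wage as dollars and converting it to cents is an assumption made only for illustration, and the helper names `sample_cov` and `corr` are hypothetical.

```python
from statistics import mean, stdev

def sample_cov(x, y):
    """Sample covariance: sum of deviation products divided by n - 1."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def corr(x, y):
    """Correlation: covariance divided by the product of the standard deviations."""
    return sample_cov(x, y) / (stdev(x) * stdev(y))

education = [12, 14, 16, 18]
wage_dollars = [18, 22, 30, 35]               # assumed to be in dollars
wage_cents = [w * 100 for w in wage_dollars]  # same wages, different unit

cov_dollars = sample_cov(wage_dollars, education)
cov_cents = sample_cov(wage_cents, education)
corr_dollars = corr(wage_dollars, education)
corr_cents = corr(wage_cents, education)

print("covariance: ", cov_dollars, cov_cents)    # differs by a factor of 100
print("correlation:", corr_dollars, corr_cents)  # the same in both units
```

The covariance grows by a factor of 100 when the unit changes, so its size is hard to interpret on its own. The correlation does not move, which is what makes it comparable across variables.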

Key terms

Covariance
A measure of whether two variables move above or below their averages together.
Correlation
A standardized measure of linear association between -1 and 1.
Positive association
Higher values of one variable tend to come with higher values of another.
Negative association
Higher values of one variable tend to come with lower values of another.
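The sign of the covariance is what separates positive from negative association. The sketch below computes the sample covariance by hand for the lesson's wage and education numbers, and for a second pair in which `leisure_hours` is a made-up variable added only to show a negative case.

```python
def sample_cov(x, y):
    """Average product of deviations from the means, using n - 1 in the denominator."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Each product is positive when both values sit on the same side of their means.
    products = [(a - mean_x) * (b - mean_y) for a, b in zip(x, y)]
    return sum(products) / (n - 1)

education = [12, 14, 16, 18]
wage = [18, 22, 30, 35]
leisure_hours = [9, 7, 6, 3]  # hypothetical values, not from the lesson

cov_pos = sample_cov(education, wage)           # positive: they rise together
cov_neg = sample_cov(education, leisure_hours)  # negative: one rises as the other falls

print(cov_pos, cov_neg)
```

A positive result means above-average values of one variable tend to pair with above-average values of the other. A negative result means they tend to sit on opposite sides of their averages.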

Correlation

$$\operatorname{corr}(x, y) = \frac{\operatorname{cov}(x, y)}{\operatorname{sd}(x)\,\operatorname{sd}(y)}$$
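As a minimal sketch of the formula, the steps below compute each piece separately for the lesson's wage and education numbers. It assumes the sample versions of covariance and standard deviation (dividing by n - 1), which is also the default in pandas.

```python
from statistics import mean, stdev

wage = [18, 22, 30, 35]
education = [12, 14, 16, 18]

# Numerator: sample covariance of wage and education
mean_w, mean_e = mean(wage), mean(education)
cov_we = sum((w - mean_w) * (e - mean_e) for w, e in zip(wage, education)) / (len(wage) - 1)

# Denominator: product of the two sample standard deviations
sd_w = stdev(wage)
sd_e = stdev(education)

r = cov_we / (sd_w * sd_e)
print(round(r, 4))  # about 0.9923
```

Dividing by the standard deviations removes the units, which is why the result always lies between -1 and 1.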

Example

If people with more education often have higher wages, education and wage may have a positive correlation.

Correlation in pandas

import pandas as pd

# Example data: wage and education for four observations
df = pd.DataFrame({
    "wage": [18, 22, 30, 35],
    "education": [12, 14, 16, 18]
})

# Pearson correlation between the two columns
print(df["wage"].corr(df["education"]))
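The same numbers can also be summarized as full covariance and correlation matrices. The sketch below uses `DataFrame.cov()` and `DataFrame.corr()` and then rebuilds the correlation from its parts as a consistency check; the variable name `r_from_parts` is just illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "wage": [18, 22, 30, 35],
    "education": [12, 14, 16, 18]
})

cov_matrix = df.cov()    # sample covariances (n - 1 in the denominator)
corr_matrix = df.corr()  # Pearson correlations by default

# Rebuild the correlation from covariance and standard deviations
r_from_parts = cov_matrix.loc["wage", "education"] / (
    df["wage"].std() * df["education"].std()
)

print(cov_matrix)
print(corr_matrix)
print(r_from_parts)
```

The diagonal of the correlation matrix is 1 because every variable is perfectly correlated with itself, and the off-diagonal entry matches the value rebuilt from the formula.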

Checkpoint activity

Pause and explain this lesson's main idea in your own words before moving forward.

Try it yourself

Write one plain-English sentence explaining the main idea from this lesson.

Common mistakes

Check these before you move on.

Do not read a correlation or a regression coefficient as causal by default. Each describes a pattern unless the assumptions or research design support a causal interpretation.

Quick quiz

What is the largest possible correlation?

Key takeaway

Correlation is useful for describing patterns, but it does not by itself prove cause and effect.
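One way to see this is with simulated data in which two variables are strongly correlated even though neither affects the other. In the sketch below, `z` is a hypothetical common cause and `x` and `y` each depend only on `z` plus noise; all names and parameter values are assumptions chosen for illustration.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so the simulation is repeatable
n = 2000

# z is a shared driver; x and y never enter each other's equation
z = [random.gauss(0, 1) for _ in range(n)]
x = [zi + random.gauss(0, 0.5) for zi in z]
y = [zi + random.gauss(0, 0.5) for zi in z]

def corr(a, b):
    """Sample correlation: covariance over the product of standard deviations."""
    ma, mb = mean(a), mean(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)
    return cov / (stdev(a) * stdev(b))

r_xy = corr(x, y)
print(r_xy)  # clearly positive, even though x does not cause y
```

The correlation between `x` and `y` comes entirely from the shared driver `z`, so a high correlation alone cannot tell us which variable, if any, is doing the causing.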