Ceteris Lab: Interactive Econometrics

Lesson 12

Practice regression project

Big question

How can students complete a small regression analysis from question to interpretation?

Learning objectives

  • Explain what a practice regression project is in plain language.
  • Use the research question correctly when writing an interpretation.
  • Connect the lesson idea to a formula, graph, Python result, or real example.

Simple explanation

A practice project turns the module into a repeatable workflow: choose a question, define y and x, inspect the data, draw a scatter plot, estimate the simple regression, interpret the slope and R-squared, and write a careful conclusion.

Key terms

Research question
A focused question that can be connected to measured variables.
Reproducible workflow
A set of steps that another student can rerun and verify.
Regression report
A short explanation of the question, data, model, results, and caution.
Causal limitation
A reason the regression may describe association but not prove cause and effect.

Project model

\widehat{wage} = -10.10 + 2.33\,education

This is the fitted equation from the local classroom wage sample.
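
As a quick check on how to read the fitted equation, the sketch below plugs a few values of education into it. The coefficients are the ones reported for the classroom sample; the function name `predict_wage` and the chosen education levels (12 and 16 years) are made up for illustration.

```python
# Coefficients from the fitted equation in this lesson.
INTERCEPT = -10.10
SLOPE = 2.33


def predict_wage(education):
    """Predicted hourly wage for a given number of years of education."""
    return INTERCEPT + SLOPE * education


# Predicted wage at 12 and 16 years of education (illustrative values).
wage_12 = predict_wage(12)
wage_16 = predict_wage(16)
print(round(wage_12, 2))  # 17.86
print(round(wage_16, 2))  # 27.18

# Four extra years change the prediction by four times the slope.
print(round(wage_16 - wage_12, 2))  # 9.32
```

Note that `predict_wage(0)` returns the intercept, -10.10. A negative hourly wage is not meaningful, which is a reminder that the fitted line should not be read literally far outside the range of education values in the sample.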

Example

A student can use WAGE1, CEOSAL2, VOTE1, SLEEP75, BWGHT, 401K, MEAP93, or the local wage sample to practice a one-variable regression.

Interactive visual

Mini project checklist

Students complete the full path from question to chart, regression, and interpretation.

wage_sample.csv

Project steps

  1. Choose one outcome y and one explanatory variable x.
  2. Open the dataset card and inspect the variable definitions.
  3. Make a scatter plot before estimating the model.
  4. Estimate the simple regression in Python.
  5. Interpret the slope, R-squared, and one limitation.

Good starter datasets

wage_sample, WAGE1, CEOSAL2, VOTE1, SLEEP75, BWGHT, 401K, MEAP93

Reusable project starter

import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt

# Load the data and name the outcome (y) and the explanatory variable (x).
df = pd.read_csv("wage_sample.csv")
y_name = "wage"
x_name = "education"

# Look at the relationship before estimating anything.
plt.scatter(df[x_name], df[y_name])
plt.xlabel(x_name)
plt.ylabel(y_name)
plt.title(f"{y_name} and {x_name}")
plt.show()

# Estimate the simple regression; add_constant supplies the intercept column.
y = df[y_name]
X = sm.add_constant(df[x_name])
model = sm.OLS(y, X).fit()

# Print the full output, then a one-sentence reading of the slope.
print(model.summary())
print(f"Interpretation: one more unit of {x_name} is associated with",
      round(model.params[x_name], 2),
      f"more units of {y_name}, on average in this sample.")

Python walkthrough

  1. Students can change y_name and x_name to practice with another two-variable question.
  2. The scatter plot comes before the regression so students see the relationship before summarizing it numerically.
  3. The final print statement nudges students to turn the coefficient into a sentence.
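
To see what the regression summary is reporting, the slope, intercept, and R-squared of a simple regression can also be computed from their textbook formulas. The sketch below does this in plain Python on five made-up data points (not taken from any lesson dataset); the function name `simple_ols` is hypothetical.

```python
def simple_ols(x, y):
    """Return (intercept, slope, r_squared) for a one-variable OLS fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Slope = sum of cross-deviations / sum of squared x-deviations.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx

    # The fitted line passes through the point of means.
    intercept = y_bar - slope * x_bar

    # R-squared = 1 - residual sum of squares / total sum of squares.
    ssr = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return intercept, slope, 1 - ssr / sst


# Made-up illustrative data, not a real sample.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

intercept, slope, r_squared = simple_ols(x, y)
print(round(intercept, 2), round(slope, 2), round(r_squared, 2))  # 2.2 0.6 0.6
```

Running the same x and y through a regression library should give matching numbers, which makes this a handy cross-check when a summary table looks surprising.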

Live notebook

Run this lesson as a notebook

Open an editable notebook cell-by-cell, run Python in the browser, and download the `.ipynb` file for later.

Interactive activity

Dataset explorer

Explore wage_sample.csv

Choose y and x, inspect variables, and generate a simple regression summary.

Variable dictionary

wage
Hourly wage
education
Years of education
experience
Years of labor market experience

Regression output

Intercept: -10.10

Slope: 2.33

R-squared: 0.949
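
One possible way to turn these numbers into a careful sentence is sketched below. The helper name `interpret` and its exact wording are assumptions for illustration, not part of the lesson's tooling; the slope and R-squared passed in are the values reported above.

```python
def interpret(y_name, x_name, slope, r_squared):
    """Build a cautious one-sentence reading of a simple regression."""
    # Word the direction explicitly so negative slopes read naturally.
    direction = "higher" if slope >= 0 else "lower"
    return (
        f"In this sample, one more unit of {x_name} is associated with "
        f"{abs(slope):.2f} units {direction} {y_name} on average, and "
        f"{x_name} accounts for {r_squared:.1%} of the variation in {y_name}; "
        "this describes an association, not proof of cause and effect."
    )


print(interpret("wage", "education", slope=2.33, r_squared=0.949))
```

The sentence covers the three parts of a final interpretation: the size and direction of the slope, the share of variation captured by R-squared, and a limitation.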

[Figure: scatter plot "wage and education", with education on the horizontal axis and wage on the vertical axis. Each dot is one observation. The fitted line summarizes the relationship between education and wage.]

Try it yourself

Write one plain-English sentence explaining the main idea from this lesson.

Common mistakes

Check these before you move on.

Treating the slope as a causal effect. A regression coefficient describes an association in the data; it supports a causal interpretation only when the assumptions or research design justify one.
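
A small simulation can make this caution concrete. In the sketch below, all data are simulated, and the variable `ability` is a hypothetical unobserved factor that raises both education and wage. Education has no direct effect on wage by construction, yet the simple-regression slope still comes out clearly positive.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

n = 5000
education, wage = [], []
for _ in range(n):
    ability = random.gauss(0, 1)  # hypothetical unobserved factor
    educ = 12 + 2 * ability + random.gauss(0, 1)
    # By construction, education does NOT enter the wage equation.
    w = 20 + 3 * ability + random.gauss(0, 1)
    education.append(educ)
    wage.append(w)

# Simple-regression slope of wage on education.
x_bar = sum(education) / n
y_bar = sum(wage) / n
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(education, wage))
s_xx = sum((x - x_bar) ** 2 for x in education)
slope = s_xy / s_xx

print(round(slope, 2))
```

With these simulation settings the slope should land near 1.2 even though changing education would not change wage at all in this made-up world. The slope picks up the influence of the omitted factor, which is the kind of limitation a regression report should mention.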

Quick quiz

Which sequence best matches a careful simple regression project?

What should a student include in a strong final interpretation?

Key takeaway

A strong regression project is not just output; it is a question, a graph, a fitted model, a careful interpretation, and an honest limitation.