Ceteris Lab Interactive Econometrics

Lesson 17

Descriptive statistics in Python

Big question

How can Python quickly summarize a dataset?

Learning objectives

  • Explain descriptive statistics in Python in plain language.
  • Use the pandas `describe` method correctly when interpreting a dataset.
  • Connect the lesson idea to a formula, graph, Python result, or real example.

Simple explanation

Descriptive statistics help you check typical values, spread, and possible surprises. In Python, pandas can calculate these summaries for many variables at once.
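
As a minimal sketch of "many variables at once", the snippet below computes two measures of typical value (mean, median) and one measure of spread (standard deviation) for every column in a single call. The DataFrame is made up for illustration and is not the lesson's dataset.

```python
import pandas as pd

# Made-up illustrative data; not taken from the lesson's dataset.
df = pd.DataFrame({
    "wage": [12.0, 15.0, 18.0, 20.0, 35.0],
    "education": [10, 12, 12, 16, 18],
})

# Typical values (mean, median) and spread (std) for every column in one call.
summary = df.agg(["mean", "median", "std"])
print(summary)
```

In this toy sample the mean wage (20.0) sits above the median wage (18.0), which hints that one high value is pulling the average up.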

Key terms

Describe
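A pandas method, `DataFrame.describe()`, that reports common summary statistics: by default the count, mean, standard deviation, minimum, quartiles, and maximum of each numeric column.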
Minimum
The smallest value.
Maximum
The largest value.
Count
The number of non-missing observations.
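
The terms above can be seen together in a small sketch. The data is invented for illustration, with one wage deliberately left missing so that the count differs between columns.

```python
import numpy as np
import pandas as pd

# Made-up values with one missing wage, for illustration only.
df = pd.DataFrame({
    "wage": [10.0, 14.0, np.nan, 22.0],
    "education": [12, 12, 16, 18],
})

stats = df.describe()

# Pull individual rows out of the describe() table by label.
print(stats.loc[["count", "min", "max"]])
```

Here the count for wage is 3 while the count for education is 4, because the missing wage is not counted. A count lower than the number of rows is often the first sign of missing data.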

Example

Before studying wages, check whether wages and education have reasonable minimums, maximums, and averages.

Summarize wage data

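import pandas as pd

# Load the wage sample from a CSV file.
df = pd.read_csv("wage_sample.csv")
# Print summary statistics for the three variables of interest.
print(df[["wage", "education", "experience"]].describe())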
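
Once the minimums and maximums are on screen, one possible next step is to flag rows that fall outside a plausible range. The sketch below uses made-up rows rather than `wage_sample.csv`, and the bounds are hypothetical choices for this illustration, not values from the lesson.

```python
import pandas as pd

# Made-up rows; the -5.0 wage and the 40 years of education are planted problems.
df = pd.DataFrame({
    "wage": [12.5, -5.0, 18.0, 21.0],
    "education": [12, 16, 40, 14],
})

# Hypothetical plausibility bounds, chosen for this illustration only.
bounds = {"wage": (0, 500), "education": (0, 25)}

flagged = {}
for column, (low, high) in bounds.items():
    # between() is inclusive on both ends by default.
    outside = df[~df[column].between(low, high)]
    flagged[column] = list(outside.index)

print(flagged)
```

Flagged rows are candidates for a closer look, not automatic deletions: an extreme value can be a data-entry error or a genuine observation.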

Checkpoint activity

Pause and explain this lesson's main idea in your own words before moving forward.

Try it yourself

Write one plain-English sentence explaining the main idea from this lesson.

Common mistakes

Check these before you move on.

A regression coefficient describes a pattern unless the assumptions or research design support a causal interpretation.

Quick quiz

Why should we summarize data before modeling?

Key takeaway

Descriptive statistics are a first quality check and a first story about the data.