DS-100 · Data Speak Louder than Words
An introduction to reasoning with data — Python, statistics, and visualization used to make arguments about the real world that are harder to dismiss than words. No prior programming experience expected.
DS-100 is where a lot of people discover that they can do this. It’s an introductory data science course with no programming prerequisite, built around one conviction: data speak louder than words, and learning to listen to them — carefully, skeptically, honestly — is a skill worth having no matter what you study.
What the course is really about
We work with three kinds of thinking at once:
- Critical thinking — is this claim actually supported by the data behind it?
- Inferential thinking — what can (and can’t) we conclude from a sample?
- Computational thinking — how do we make a computer do the tedious parts?
You’ll learn Python from scratch, work in Jupyter notebooks from day one, and practice on real, messy, socially relevant data — public health, housing, transportation, education. By the end you’ll be making and testing hypotheses, building visualizations that tell the truth, and reaching conclusions you can defend out loud.
How it runs
The course uses a flipped, GenAI-integrated model. Each week you explore the upcoming concepts with an AI learning partner (a structured GenAI Exploration, or GAIE), and class time then goes to the hard parts: verification, application, and the questions the AI couldn’t answer well. Assessments are individual and mostly AI-free: the explorations are practice; the verification is yours alone.
If you’ve heard that this course “uses AI,” that’s true, but the point isn’t the AI. The point is that you leave able to do the work yourself and to say, precisely and honestly, where a machine helped.
Who it’s for
Anyone. DS-100 is a BU Hub course (Social Inquiry, Digital/Multimedia Expression, Research and Information Literacy) designed for students from any major. If you’ve never written a line of code, you’re exactly who it was written for.
Course materials
- DS-100 Syllabus · Syllabus
The full course contract — what DS-100 covers, how it's graded, and how to succeed.
- JupyterHub: Getting Started · Guide
How to log in to the course JupyterHub, clone your assignment repos, and make sure your work is backed up.
- Missed Something? The Makeup Policy · Policy
What to do when life happens — the excused-absence makeup procedure, step by step.
- Style Guide for Written Work · Reference
Formatting, citation, and AI-attribution requirements for all written submissions — part of the communication score on rubric-graded work.
Datasets we use
- Hollywood Actors — Box Office · Self-hosted
Top 50 actors by total US box-office gross, with per-film averages and their biggest movie.
- Maternal Smoking & Birth Weight · Self-hosted
Birth weight, gestation, and maternal health for 1,174 mother-baby pairs from the Child Health and Development Studies — the classic causality-vs-correlation dataset.
- World Billionaires (2026) · Self-hosted
A 2026 snapshot of the world's billionaires — name, net worth, industry, and citizenship. Great for rankings, group-bys, and skeptical questions about wealth data.
- Bluebikes Trips — Sample (Sept 2021) · Self-hosted
A workable sample of Boston Bluebikes bike-share trips from September 2021 — start/end stations, timestamps, and rider type.
- Bluebikes Stations (May 2026) · Self-hosted
Every Bluebikes station — name, coordinates, docks, and municipality. The join partner for the trips data.
- Boston 311 Service Requests (2025) · Self-hosted
A year of Bostonians asking the city for help — every 311 service request from 2025, with type, neighborhood, and resolution timestamps.
- Boston 311 Requests (live portal) · Analyze Boston
The live, continuously updated 311 dataset on Analyze Boston — for when the 2025 snapshot isn't enough.
- Boston Building Energy & Water Metrics (2025) · Self-hosted
Energy and water use reported by Boston's large buildings under BERDO — sustainability data with policy teeth.
- NBA Salaries (2015–16) · Self-hosted
Player name, team, position, and salary for the 2015–16 NBA season — histograms with a long right tail.
- Old Faithful Eruptions · Self-hosted
Eruption durations and waiting times for the Old Faithful geyser — the classic two-cluster scatter plot.
- US Presidential Birth Years · Self-hosted
Birth data for US presidents — a tiny table for early table operations and date arithmetic.
- RentSmart Boston · Self-hosted
Housing violations, complaints, and inspections for Boston rental properties — civic data with real housing-justice questions in it.
- State SAT Averages (2014) · Self-hosted
Average SAT scores and participation rates by US state — the textbook example of a lurking variable.
- San Francisco City Salaries (2015) · Self-hosted
Compensation for every San Francisco city employee in 2015 — job titles, salaries, overtime, and benefits.
- US Skyscrapers · Self-hosted
Name, city, height, and completion year for notable US skyscrapers — heights, eras, and city skylines in one table.
- Daily Temperatures · Self-hosted
Long-run daily temperature observations — seasonality, smoothing, and long-term trends.
- Top Grossing Movies (2017) · Self-hosted
Highest-grossing films with unadjusted and inflation-adjusted gross — a lesson about units hiding inside a fun dataset.
- United Flight Delays (Summer 2015) · Self-hosted
Departure delays for United flights out of SFO — thousands of rows for sampling and the law of averages.
- World Population by Year · Self-hosted
Annual world population estimates — the simplest possible time series for first plots and growth rates.
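The United Flight Delays entry above mentions sampling and the law of averages. As a rough sketch of that idea, the snippet below draws larger and larger random samples and checks how far each sample mean lands from the true mean. It uses made-up delay values rather than the course file, and the function name `sample_mean_errors`, the seeds, and the sample sizes are all illustrative assumptions, not part of the course materials.

```python
import random
import statistics


def sample_mean_errors(population, sizes, seed=0):
    """Return {sample size: |sample mean - population mean|} for random samples."""
    rng = random.Random(seed)
    true_mean = statistics.fmean(population)
    errors = {}
    for n in sizes:
        # Sample with replacement, so n may be as large as the population.
        sample = rng.choices(population, k=n)
        errors[n] = abs(statistics.fmean(sample) - true_mean)
    return errors


# Made-up, right-skewed "delay" values in minutes. NOT the course dataset.
data_rng = random.Random(100)
population = [data_rng.expovariate(1 / 15) for _ in range(10_000)]

errors = sample_mean_errors(population, sizes=[10, 100, 1_000, 10_000])
for n, err in errors.items():
    print(f"sample size {n:>6}: sample mean is off by {err:.2f} minutes")
```

Larger samples tend to land closer to the true mean, though any single small sample can get lucky. Rerunning with a different `seed` is a quick way to see that variation.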
Planned terms
- Summer 2026 Syllabus PDF
Past terms
- Spring 2026 Syllabus PDF
- Fall 2025 Syllabus PDF
- Spring 2025 Syllabus PDF
- Fall 2024
- Spring 2024
- Fall 2023 Syllabus PDF
- Spring 2023 Syllabus PDF
- Fall 2022 Syllabus PDF
- Spring 2022 Syllabus PDF
- Fall 2021 Syllabus PDF