Course Materials
Unit 1: Overview
Reading
- Yarkoni and Westfall (2017) paper
- James et al. (2023) Chapter 2, pp 15 - 42
Slide decks
Videos
Lecture 1: An Introductory Framework ~ 9 mins
Lecture 3: Key Terminology in Context ~ 11 mins
Lab Materials
- None this week
Application Assignment
- No assignment this week
Quiz
- Submit the unit quiz by 8 pm on Wednesday, January 22nd.
Unit 2: Exploratory Data Analysis
Reading
[NOTE: These are short chapters. You are reading to understand the framework of visualizing data in R. Don’t feel like you have to memorize the details. These are reference materials that you can turn back to when you need to write code!]
- Wickham, Çetinkaya-Rundel, and Grolemund (2023) Chapter 1, Data Visualization
- Wickham, Çetinkaya-Rundel, and Grolemund (2023) Chapter 9, Layers
- Wickham, Çetinkaya-Rundel, and Grolemund (2023) Chapter 10, Exploratory Data Analysis
Slide decks
Videos
Lecture 1: Stages of Data Analysis and Model Development ~ 10 mins
Lecture 2: Best Practices and Other Recommendations ~ 27 mins
Lecture 3: EDA for Data Cleaning ~ 41 mins
Lecture 4: EDA for Modeling - Univariate ~ 24 mins
Lecture 5: EDA for Modeling - Bivariate ~ 20 mins
Lecture 6: Working with Recipes ~ 13 mins
Lab Materials (Zihan - Jan 28th)
Application Assignment
cleaning EDA: qmd
modeling EDA: qmd
solutions: cleaning EDA; modeling EDA
Submit the application assignment by 8 pm on Wednesday, January 29th.
Quiz
- Submit the unit quiz by 8 pm on Wednesday, January 29th.
Unit 3: Introduction to Regression Models
Reading
- James et al. (2023) Chapter 3, pp 59 - 109
Slide decks
Videos
Lecture 1: Overview ~ 13 mins
Lecture 6: Extension to Interactions and Non-Linear Effects ~ 11 mins
Lecture 7: Introduction to KNN ~ 9 mins
Lecture 8: The hyperparameter k ~ 13 mins
Lecture 10: KNN with Ames ~ 12 mins
Lab Materials
Application Assignment
Submit the application assignment by 8 pm on Wednesday, February 5th.
Quiz
- Submit the unit quiz by 8 pm on Wednesday, February 5th.
Unit 4: Introduction to Classification Models
Reading
- James et al. (2023) Chapter 4, pp 129 - 164
Slide decks
Videos
Lecture 1: The Bayes Classifier ~ 9 mins
Lecture 2: Conceptual Overview of Logistic Regression ~ 19 mins
Lecture 3: EDA with the Cars Dataset ~ 12 mins
Lecture 5: KNN with Cars Dataset ~ 19 mins
Lecture 7: Comparisons among Classifiers ~ 11 mins
Lab Materials
Application Assignment
shells: cleaning EDA qmd; rda qmd; knn qmd
solution: modeling EDA; rda; knn
Submit the application assignment by 8 pm on Wednesday, February 12th.
Quiz
- Submit the unit quiz by 8 pm on Wednesday, February 12th.
Unit 5: Resampling Methods for Model Selection and Evaluation
Reading
- Kuhn and Johnson (2018) Chapter 4, pp 61 - 80
- Supplemental: James et al. (2023) Chapter 5, pp 197 - 208
Slide decks
Videos
Lecture 2: Introduction to Resampling ~ 11 mins
Lecture 7: Bootstrap Resampling ~ 11 mins
Lecture 8: Using Resampling to Select Best Model Configurations ~ 17 mins
Lecture 9: Resampling for Both Model Selection and Evaluation ~ 11 mins
Lecture 10: Nested Resampling ~ 14 mins
Lab Materials
Application Assignment
Submit the application assignment by 8 pm on Wednesday, February 19th.
Quiz
- Submit the unit quiz by 8 pm on Wednesday, February 19th.
Unit 6: Regularization and Penalized Models
Reading
- James et al. (2023) Chapter 6, pp 225 - 267
Slide decks
Videos
Lecture 1: An Introduction to Penalized/Regularized Algorithms ~ 15 mins
Lecture 2: Intuitions about Penalized Cost Functions and Regularization ~ 11 mins
Lecture 3: Ridge Regression ~ 9 mins
Lecture 4: LASSO ~ 8 mins
Lecture 5: The Elastic Net ~ 4 mins
Lecture 6: Empirical Example - Many good predictors ~ 23 mins
Lecture 7: Empirical Example - Good and zero predictors ~ 9 mins
Lecture 8: Empirical Example - LASSO for covariate selection ~ 8 mins
Lab Materials
Application Assignment
Submit the application assignment by 8 pm on Wednesday, February 26th.
Quiz
- Submit the unit quiz by 8 pm on Wednesday, February 26th.
Unit 8: Advanced Performance Metrics
Reading
- Kuhn and Johnson (2018) Chapter 11, pp 247 - 266
- Kuhn and Johnson (2018) Chapter 16, pp 419 - 435
- Wyant et al. (in press)
Slide decks
Videos
Lecture 1: Unit Introduction ~ 15 mins
Lecture 4: The Receiver Operating Characteristic (ROC) Curve ~ 25 mins
Lecture 5: Selecting Model Configurations with Other Metrics ~ 10 mins
Lecture 6: Addressing Class Imbalance ~ 24 mins
Lab Materials
Application Assignment
Quiz
- Submit the unit quiz by 8 pm on Wednesday, March 12th.