Course Materials

Unit 1: Overview

Reading

Slide decks

Videos

Lab Materials

  • None this week

Application Assignment

  • No assignment this week

Quiz

  • Submit the unit quiz by 8 pm on Wednesday, January 22nd

Unit 2: Exploratory Data Analysis

Reading

[NOTE: These are short chapters. You are reading to understand the framework of visualizing data in R. Don’t feel like you have to memorize the details. These are reference materials that you can turn back to when you need to write code!]

Slide decks

Videos

Lab Materials (Zihan - Jan 28th)

Application Assignment

Quiz

  • Submit the unit quiz by 8 pm on Wednesday, January 29th.

Unit 3: Introduction to Regression Models

Reading

Slide decks

Videos

Lab Materials

Application Assignment

Quiz

  • Submit the unit quiz by 8 pm on Wednesday, February 5th.

Unit 4: Introduction to Classification Models

Reading

Slide decks

Videos

Lab Materials

Application Assignment

Quiz

  • Submit the unit quiz by 8 pm on Wednesday, February 12th.

Unit 5: Resampling Methods for Model Selection and Evaluation

Reading

Slide decks

Videos

Lab Materials

Application Assignment

Quiz

  • Submit the unit quiz by 8 pm on Wednesday, February 19th.

Unit 6: Regularization and Penalized Models

Reading

Slide decks

Videos

Lab Materials

Application Assignment

Quiz

  • Submit the unit quiz by 8 pm on Wednesday, February 26th.

Unit 8: Advanced Performance Metrics

Reading

Slide decks

Videos

Lab Materials

Application Assignment

Quiz

  • Submit the unit quiz by 8 pm on Wednesday, March 12th.

Unit 9: Decision Trees, Bagging, and Random Forest

Reading

In addition, much of the content from this unit has been drawn from four chapters in a book called Hands-On Machine Learning with R. It is a great book, and I used it heavily (at times verbatim) because it covers these algorithms quite clearly. If you want more depth, you might read chapters 9-12 of this book as a supplement to this unit in our course.

Slide decks

  • Lecture
  • Discussion

Videos

Lab Materials

Application Assignment

Quiz

  • Submit the unit quiz by 8 pm on Wednesday, March 19th.

Unit 10: Neural Networks

Reading

Slide decks

Videos

Lab Materials

Application Assignment

Submit the application assignment here by noon on Friday, April 4th

Quiz

Complete the unit quiz by 8 pm on Wednesday, April 2nd

Unit 11: Explanatory Approaches

Reading

  • Benavoli et al. (2017) paper: Read pages 1-9 that describe the correlated t-test and its limitations.
  • Kruschke (2018) paper: Describes Bayesian estimation and the region of practical equivalence (ROPE) in general terms, not in the context of machine learning and model comparisons.

And these chapters in the book Interpretable Machine Learning. They are all short!

Slide decks

Videos

Note: this week's lab recording was mistakenly limited to the speaker view rather than the shared screen until about the 11-minute mark. However, the contents are all in the lab HTML. The Keras demo and the use of early stopping start at around 45 minutes.

Lab Materials

Application Assignment

Submit the application assignment here by noon on Friday, April 11th

Quiz

Complete the unit quiz by 8 pm on Wednesday, April 9th

Unit 12: NLP

Reading

NOTES: Please read the above chapters more with an eye toward concepts and issues rather than code. I will demonstrate a minimum set of functions to accomplish the NLP modeling tasks for this unit.

Also note that the entire Hvitfeldt and Silge (2022) book is really mandatory reading. I would also strongly recommend the entire Silge and Robinson (2017) book. At a minimum, both will be important references.

Slide decks

Videos

Lab Materials

Application Assignment

Submit the application assignment here by noon on Friday, April 18th

Quiz

Complete the unit quiz by 8 pm on Wednesday, April 16th

References

Benavoli, Alessio, Giorgio Corani, Janez Demšar, and Marco Zaffalon. 2017. “Time for a Change: A Tutorial for Comparing Multiple Classifiers Through Bayesian Analysis.” Journal of Machine Learning Research 18: 1–36.
Hvitfeldt, Emil, and Julia Silge. 2022. Supervised Machine Learning for Text Analysis in R. https://smltar.com/.
James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2023. An Introduction to Statistical Learning: With Applications in R. 2nd ed. Springer Texts in Statistics. New York: Springer-Verlag.
Kruschke, John K. 2018. “Rejecting or Accepting Parameter Values in Bayesian Estimation.” Advances in Methods and Practices in Psychological Science 1: 270–80.
Kuhn, Max, and Kjell Johnson. 2018. Applied Predictive Modeling. 1st ed. 2013, corr. 2nd printing 2018. New York: Springer.
Molnar, Christoph. 2023. Interpretable Machine Learning: A Guide for Making Black Box Models Explainable. 2nd ed. https://christophm.github.io/interpretable-ml-book/.
Silge, Julia, and David Robinson. 2017. Text Mining with R: A Tidy Approach. 1st ed. Beijing; Boston: O’Reilly Media.
Wickham, Hadley, Mine Çetinkaya-Rundel, and Garrett Grolemund. 2023. R for Data Science: Visualize, Model, Transform, and Import Data. 2nd ed. https://r4ds.hadley.nz/.
Yarkoni, Tal, and Jacob Westfall. 2017. “Choosing Prediction Over Explanation in Psychology: Lessons From Machine Learning.” Perspectives on Psychological Science 12 (6): 1100–1122.