14  Ethical Issues in Machine Learning Research and Applications

14.1 Overview of Unit

14.1.1 Learning Objectives

All semester, we have been learning how to develop and evaluate machine learning models. In this final unit, we consider the impact these models have had on society to date. These models offer many potential benefits, but they have also produced substantial harm. It is important that we recognize what contributes to that harm so that we can strive to avoid these problems in our own work.

14.2 Readings

  • The readings this week will come from O’Neil (2016): we will read the introduction; chapters 1, 3, and 5; and the conclusion and afterword. A PDF of the book will be shared directly with you.

  • We will also read this article on emerging methods and tools for assessing model fairness.

14.3 Lecture Videos, Application Assignment, and Quiz

There are no lecture videos or application assignments this week. The unit quiz consists of free-response questions on the readings to help you prepare your thoughts to share during our discussion.

The [unit quiz](https://canvas.wisc.edu/courses/395546/quizzes/514052) is due by 8 pm on Wednesday, May 1st. We will meet on Tuesday to discuss the assigned readings and on Thursday for a review session for the concepts final exam.

14.4 Discussion

14.4.1 Announcements

  • Quiz
  • Final concepts
  • Final applications
  • Review
  • Office hours and meetings

14.4.2 Discuss Model Scale

  • What is it?
  • Examples
  • Costs and Benefits
  • Role of homogeneity

14.4.3 Discuss Model Opacity

  • What is it?
  • Examples
  • Inputs, Outputs, Algorithm
  • Connection to high dimensionality
  • Connection to model performance (why is the college ranking system ostensibly opaque?)
  • Intersection with models in health care

14.4.4 Discuss Proxy outcomes

  • What do we mean when the outcome is a proxy?
  • Is this common in machine learning models? In social science?
  • Two examples from the book (recidivism, college rankings)
  • Examples?
  • What is the problem?

14.4.5 Discuss Self-Perpetuating (Feedback Loops) vs. Self-Correcting Models

  • What is the issue/problem?
  • Describe the problem in the context of book examples (college rankings, recidivism, credit scores)
  • Describe a novel example
  • How does this issue interact with opacity, errors, and the ability to appeal? (See the simulation sketch after this list.)
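To make the self-perpetuating dynamic concrete, here is a minimal, hypothetical simulation of a predictive-policing-style loop in the spirit of the book's PredPol discussion. The setup and numbers are our own assumptions, not from O’Neil: patrols are allocated in proportion to last period's recorded crime, and recorded crime in turn scales with patrol presence.

```python
# Hypothetical sketch: two neighborhoods with IDENTICAL true crime rates.
# Patrols follow recorded crime, and recorded crime follows patrols -- so
# the data, not reality, drives the allocation.

true_rate = [10.0, 10.0]       # identical underlying crime per period
recorded = [12.0, 8.0]         # neighborhood A starts slightly over-policed
detect_per_patrol = 0.04       # fraction of true crime recorded per patrol unit

for period in range(5):
    total = sum(recorded)
    patrols = [100 * r / total for r in recorded]   # allocate 100 patrol units
    recorded = [t * p * detect_per_patrol           # recorded crime tracks patrols
                for t, p in zip(true_rate, patrols)]
    print(f"period {period}: patrols {patrols[0]:.0f}/{patrols[1]:.0f}, "
          f"recorded {recorded[0]:.1f}/{recorded[1]:.1f}")

# The initial 60/40 patrol split persists indefinitely even though the true
# rates are identical: the model's output generates its own confirming
# training data, so nothing in the loop corrects the arbitrary starting bias.
```

A self-correcting system would need a measurement independent of the model's own allocations (e.g., victimization surveys) to pull the two neighborhoods back toward their identical true rates; opacity and the inability to appeal compound the problem.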

14.4.6 Are our models objective?

“Values are reflected in what we target. My point is that police make choices about where they direct their attention. Today they focus almost exclusively on the poor. That’s their heritage, and their mission, as they understand it. And now data scientists are stitching this status quo of the social order into models, like PredPol, that hold ever-greater sway over our lives.”

“Would society be so willing to sacrifice the concept of probable cause if everyone had to endure the harassment and indignities of stop and frisk? Chicago police have their own stop-and-frisk program. In the name of fairness, what if they sent a bunch of patrollers into the city’s exclusive Gold Coast? Maybe they’d arrest joggers for jaywalking from the park across W. North Boulevard or crack down on poodle pooping along Lakeshore Drive. This heightened police presence would probably pick up more drunk drivers and perhaps uncover a few cases of insurance fraud, spousal abuse, or racketeering. Occasionally, just to give everyone a taste of the unvarnished experience, the cops might throw wealthy citizens on the trunks of their cruisers, wrench their arms, and snap on the handcuffs, perhaps while swearing and calling them hateful names. In time, this focus on the Gold Coast would create data. It would describe an increase in crime there, which would draw even more police into the fray. This would no doubt lead to growing anger and confrontations. I picture a double parker talking back to police, refusing to get out of his Mercedes, and finding himself facing charges for resisting arrest. Yet another Gold Coast crime.”

  • Examples of *what* we target with our models – nuisance crimes vs. white-collar crime
  • Examples of problems with *where* models are applied – stop and frisk as a WMD
  • Examples of how we define our constructs (Is tuition/cost part of being a good school?)
  • Bias can enter through proxies. What about inputs, observed associations, and training data? (See the sketch below.)
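To illustrate the proxy point in the last bullet: even when a protected attribute is excluded from a model, a correlated input can stand in for it. Below is a hypothetical sketch with simulated data (the variable names, rates, and the use of scikit-learn's logistic regression are all our assumptions); the model is trained without the protected attribute, yet its predictions still track group membership through zip code.

```python
# Hypothetical illustration: dropping a protected attribute does not remove
# bias when a correlated proxy (here, zip code) stays in the model.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000

group = rng.integers(0, 2, n)                     # protected attribute (0/1)
zipcode = (group == 1) ^ (rng.random(n) < 0.1)    # proxy: 90% aligned with group
income = rng.normal(50 + 10 * (group == 0), 10, n)

# Historical labels are biased against group 1, independent of income
label = (income + rng.normal(0, 5, n) - 8 * group > 45).astype(int)

# Train WITHOUT the protected attribute; zip code remains as an input
X = np.column_stack([zipcode.astype(float), income])
model = LogisticRegression().fit(X, label)
pred = model.predict_proba(X)[:, 1]

print("mean predicted score, group 0:", pred[group == 0].mean().round(3))
print("mean predicted score, group 1:", pred[group == 1].mean().round(3))
```

The predicted scores still differ by group because zip code encodes group membership; deleting the sensitive column changed nothing about what the model learned.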

14.4.7 What should and should not be included as inputs to a model?

  • Are there inputs that shouldn't be allowed?
  • Implications of including race/ethnicity as a feature in our models?

14.4.8 Damage

  • How well does it perform?
  • How well for subgroups?
  • Accuracy is not enough if the costs of different errors differ (see the sketch after this list)
  • What is it used for?
  • How strongly is it used?
  • What if there was no model?
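As a concrete companion to the subgroup and error-cost bullets above, here is a hypothetical sketch (simulated labels and error rates are our own assumptions, not course data) that computes false positive and false negative rates separately by group. A single accuracy number can look acceptable while one group absorbs three times the false positives.

```python
# Hypothetical sketch: overall accuracy hides which subgroup absorbs which
# errors. False positives (e.g., wrongly flagged as high risk) are three
# times more common for group 1.
import numpy as np

rng = np.random.default_rng(1)
n = 20000
group = rng.integers(0, 2, n)
y_true = rng.integers(0, 2, n)

fp_rate = np.where(group == 1, 0.30, 0.10)   # group-dependent false-positive rate
fn_rate = 0.10                               # same false-negative rate for everyone
y_pred = np.where(y_true == 1,
                  np.where(rng.random(n) < fn_rate, 0, 1),
                  np.where(rng.random(n) < fp_rate, 1, 0))

print(f"overall accuracy: {(y_pred == y_true).mean():.3f}")
for g in (0, 1):
    m = group == g
    fpr = ((y_pred == 1) & (y_true == 0) & m).sum() / ((y_true == 0) & m).sum()
    fnr = ((y_pred == 0) & (y_true == 1) & m).sum() / ((y_true == 1) & m).sum()
    print(f"group {g}: FPR = {fpr:.3f}, FNR = {fnr:.3f}")
```

If a false positive costs someone parole or a job while a false negative costs the institution comparatively little, overall accuracy tells us almost nothing about the damage done or who bears it.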

14.4.9 Critique the Ethics of RISK and RISK2

  • Who developed the model?
  • Who provided the data?
  • What values or assumptions are embedded in it? In the lapse outcome?
  • How will it be used – by whom and for what?
  • How should it be evaluated for implementation?
  • Is it fair?
  • How could we evaluate it in an ongoing fashion?
  • Unanticipated consequences of its use?

O’Neil, Cathy. 2016. *Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy*. Reprint edition. Broadway Books.