Author: Cathy O’Neil
Publisher: Slate
Publication Year: 2016
Summary: The following article discusses how, when deciding which variables to include in a model, some factors (race, etc.) can easily be excluded so that racial bias in the data, itself a result of systemic racism, does not influence the model's predictions. However, many other variables can have the same effect and may not be as easy to detect. One example mentioned in the article is neighborhood or zip code. Because many neighborhoods are still segregated due to systemic issues involving race, neighborhood variables often carry the same biases as using race directly would. For this reason, it is important to examine the data carefully and identify which variables could make a model ethically problematic.
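
One rough way to check whether a seemingly neutral variable stands in for an excluded one is to measure how well it predicts that excluded attribute. The sketch below illustrates the idea and is not taken from the article. The function names, the field names `zip` and `group`, and the eight toy records are all hypothetical and synthetic. It compares two accuracies: guessing the most common group overall, and guessing the most common group within each zip code. A large gap between them suggests the zip code is acting as a proxy.

```python
from collections import Counter, defaultdict


def baseline_accuracy(rows, protected):
    """Accuracy of always guessing the most common protected value."""
    counts = Counter(row[protected] for row in rows)
    return counts.most_common(1)[0][1] / len(rows)


def majority_guess_accuracy(rows, feature, protected):
    """Accuracy of guessing, for each row, the most common protected
    value among rows that share the same feature value."""
    groups = defaultdict(Counter)
    for row in rows:
        groups[row[feature]][row[protected]] += 1
    correct = sum(c.most_common(1)[0][1] for c in groups.values())
    return correct / len(rows)


# Synthetic toy records (assumption: made up purely for illustration).
rows = [
    {"zip": "A", "group": "x"},
    {"zip": "A", "group": "x"},
    {"zip": "A", "group": "x"},
    {"zip": "A", "group": "y"},
    {"zip": "B", "group": "y"},
    {"zip": "B", "group": "y"},
    {"zip": "B", "group": "y"},
    {"zip": "B", "group": "x"},
]

base = baseline_accuracy(rows, "group")
with_zip = majority_guess_accuracy(rows, "zip", "group")
print(f"baseline: {base:.2f}, using zip: {with_zip:.2f}")
```

In this toy data, knowing the zip code raises the guess accuracy from 0.50 to 0.75, so dropping the `group` column alone would not remove the information it carries. A real audit would need more careful statistics than this comparison.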