Data Brew Season 2, Episode 2: Data Ethics

Author: Databricks

Publisher: Spotify

Publication Year: 2021

Summary: This podcast episode discusses why fairness and data ethics are essential considerations when building models for business and marketing. Two pressing problems arise when data ethics is neglected: data-driven decisions about people’s fundamental needs can breed disparity, and the harmful impact of models is often poorly understood. To promote fairness, data scientists should not provide protected attributes, such as gender or race, as inputs to a model. However, other inputs can still correlate with those attributes; for example, frequent online shopping at Sephora could correlate with gender and then inform marketing decisions. Income is one example of a useful predictor: it matters in certain business domains, and its use can have a business justification. But income also reflects social inequalities and can lead to unfair representation of certain groups, so further analysis is needed to ensure that an attribute is not serving as a stand-in for who someone is but actually adds value to the business decision. Although there is often a trade-off between a model's fairness and its accuracy, adversarial debiasing can help choose the fairest model: it maximizes the desired metrics while also maximizing the confusion of an adversary that tries to recover the hidden protected attributes. Questions of bias arise across industries but are especially critical in data science and AI, because algorithms operate at such a large scale and affect so many people that they have greater potential to amplify societal biases. Asked for advice to other fields on making algorithms fairer, the podcasters highlighted awareness and transparency: consider how a model can perpetuate systemic bias, and be transparent about which attributes are used for decision-making and how the algorithm can directly affect others.
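
The point about correlated attributes can be illustrated with a minimal sketch in Python. The records, the field names (`gender`, `shops_at_sephora`), and the helpers `proxy_accuracy` and `baseline_accuracy` are hypothetical, invented here for illustration, and do not come from the episode. Under those assumptions, the sketch measures how well a single remaining feature recovers a protected attribute that has been withheld from a model.

```python
from collections import Counter, defaultdict

# Hypothetical toy records; every value here is made up for illustration.
customers = [
    {"gender": "F", "shops_at_sephora": True},
    {"gender": "F", "shops_at_sephora": True},
    {"gender": "F", "shops_at_sephora": True},
    {"gender": "F", "shops_at_sephora": False},
    {"gender": "M", "shops_at_sephora": False},
    {"gender": "M", "shops_at_sephora": False},
    {"gender": "M", "shops_at_sephora": False},
    {"gender": "M", "shops_at_sephora": True},
]


def proxy_accuracy(rows, feature, protected):
    """Fraction of rows whose protected value is recovered by guessing the
    most common protected value among rows with the same feature value."""
    groups = defaultdict(Counter)
    for row in rows:
        groups[row[feature]][row[protected]] += 1
    correct = sum(counts.most_common(1)[0][1] for counts in groups.values())
    return correct / len(rows)


def baseline_accuracy(rows, protected):
    """Fraction recovered by always guessing the most common protected value."""
    counts = Counter(row[protected] for row in rows)
    return counts.most_common(1)[0][1] / len(rows)


proxy = proxy_accuracy(customers, "shops_at_sephora", "gender")
baseline = baseline_accuracy(customers, "gender")
print(f"baseline: {baseline:.2f}, with proxy feature: {proxy:.2f}")
# prints "baseline: 0.50, with proxy feature: 0.75"
```

A proxy accuracy well above the baseline suggests that the feature carries information about the withheld attribute, so dropping the attribute alone does not make a model blind to it.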