Human Bias in Machine Learning: How Well Do You Really Know Your Model?

Authors: Jim Box, Elena Snavely, and Hiwot Tesfaye

Publisher: SAS Global Forum

Publication Year: 2022

Summary: This paper examines the common assumption that artificial intelligence (AI) and machine learning (ML) will solve problems by letting an objective algorithm make decisions. The issue is that algorithms are not always objective: biased data produces biased results, and an algorithm often amplifies the biases present in its training data. Sometimes this bias is overt; other times it is subtle. Models can also latch onto unexpected correlations driven by confounding variables that have no practical value. To minimize bias, the authors recommend framing the problem with an understanding of its technical aspects, defining fairness in the context of that specific problem, and checking the training data for bias to ensure it is representative. Models must also be interpretable: partial dependence and individual conditional expectation (ICE) plots are useful diagnostics for understanding a model's overall behavior, while local interpretable model-agnostic explanations (LIME) explain individual predictions. Finally, diversity is necessary in the model-building process, since a diverse team brings diverse perspectives, ideas, and approaches. Because AI and ML algorithms have far-reaching implications, guarding against bias throughout their creation is vital.
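
For readers who want to try the diagnostics the paper names, the sketch below shows one way to produce partial dependence and ICE plots with scikit-learn, and a single-prediction LIME explanation with the third-party lime package. This is a minimal Python illustration under assumed choices (the California housing and breast cancer datasets, gradient boosting and random forest models, and the specific features shown), not the authors' own example or code.

```python
# Illustrative sketch only: dataset, model, and feature choices are
# assumptions for demonstration. Requires scikit-learn >= 1.0, matplotlib.
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import PartialDependenceDisplay

# Fit a black-box model on a standard public regression dataset.
X, y = fetch_california_housing(return_X_y=True, as_frame=True)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

# kind="both" overlays per-observation ICE curves on the averaged partial
# dependence line, exposing heterogeneity the average alone would hide.
PartialDependenceDisplay.from_estimator(
    model, X, features=["MedInc", "AveOccup"], kind="both", subsample=50
)
plt.show()

# Likewise an assumed setup for LIME: it fits a simple local surrogate
# around one instance and reports which features drove that prediction.
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
clf = RandomForestClassifier(random_state=0).fit(data.data, data.target)

explainer = LimeTabularExplainer(
    data.data,
    feature_names=list(data.feature_names),
    class_names=list(data.target_names),
    mode="classification",
)
explanation = explainer.explain_instance(
    data.data[0], clf.predict_proba, num_features=5
)
print(explanation.as_list())  # top features with their local weights
```

The contrast between the two tools mirrors the paper's framing: PDP/ICE plots characterize a model's behavior across the data, while LIME answers why the model made one particular decision.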