Authors: James Zou, Londa Schiebinger
Publisher: Nature
Publication Year: 2018
Summary: This article discusses different sources of bias in artificial intelligence (AI), how to recognize that bias, and what to do about it. The authors argue that a major driver of bias in AI is the training data, which can encode gender, ethnic, and cultural biases. Another source of bias lies in the algorithms themselves: when a specific group of individuals appears more frequently than others in the training data, an algorithm that maximizes overall predictive accuracy will optimize for those individuals. When the algorithms are evaluated on "test" data sets, these are usually sub-samples of the training set and are likely to contain the same biases. One proposed solution is to systematically label the content of training data sets with standardized metadata. Each training set should include information on how the data were collected, and if the data concern people, summary statistics should be provided. One of the two authors, James Zou, is an assistant professor of biomedical data science, of computer science, and of electrical engineering at Stanford University. The other author, Londa Schiebinger, is a professor of history of science and the director of Gendered Innovations in Science, Health & Medicine, Engineering, and Environment at Stanford University.