Author: Karen Hao
Publisher: MIT Technology Review
Publication Year: 2019
Summary: The following article discusses how, in March 2019, NBC released a story with the headline “Facial recognition ‘dirty little secret’: Millions of online photos scraped without consent.” The story covers a data set released by IBM containing one million photos of faces, intended to help build fairer facial recognition algorithms. However, those photos were scraped from Flickr without the permission of the subjects or the photographers. For those in the industry, IBM did nothing unusual: it scraped data from the internet and fed it to algorithms for training purposes. This is commonplace; Instagram, the New York Times, and the Wall Street Journal are all considered customary sources of data. This is not to say data scraping is inherently right or wrong; there are benign ways to do it. The story does, however, underscore the need for the tech industry to adapt its standard practices as technology and public awareness evolve. Rumman Chowdhury, global lead for responsible artificial intelligence (AI) at Accenture, commented that there are ways to use data today that were unknown five years ago, so the public could not have agreed to a capability that did not yet exist. In short, it may once have been defensible to scrape people’s public data on the implicit consent of its being publicly available; however, the rise of AI and the unparalleled scale of Silicon Valley’s data monopolization and monetization have transformed the ethics of AI. Technologists have a responsibility to change alongside it and to ensure there is comprehensive, informed societal consensus for their practices.