Machine learning algorithms, like humans, form biases based on the data they observe. Unlike humans, however, the algorithms readily reveal their biases when probed appropriately. Using publicly available lists of names, we enumerate biases in an unsupervised fashion from word embeddings trained on public data. Gender, racial, and religious biases emerge, among others. We then analyze the effects of these biases on a problem motivated by recommending jobs to candidates. To collect data for this task, we extract hundreds of thousands of third-person bios from the web. A straightforward application of machine learning is found to amplify some biases. However, unlike with humans, it is easy to put algorithmic corrections in place to mitigate this bias amplification.
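The core idea of probing an embedding with name lists can be sketched in a few lines: compare how strongly two groups of names associate with an attribute word via cosine similarity. This is a minimal illustration, not the lecture's actual method; the toy 2-d vectors and word choices below are hypothetical, standing in for a real embedding trained on public data.

```python
import numpy as np

def cos(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy 2-d "embedding" with hypothetical values, for illustration only.
# A real experiment would load vectors trained on a large public corpus.
emb = {
    "john":     np.array([0.9, 0.1]),
    "mike":     np.array([0.8, 0.2]),
    "mary":     np.array([0.1, 0.9]),
    "lisa":     np.array([0.2, 0.8]),
    "engineer": np.array([1.0, 0.0]),
    "nurse":    np.array([0.0, 1.0]),
}

def association(names, attribute):
    """Mean cosine similarity between a group of names and an attribute word."""
    return float(np.mean([cos(emb[n], emb[attribute]) for n in names]))

def bias_score(group_a, group_b, attribute):
    """Differential association: positive means group_a skews toward the attribute."""
    return association(group_a, attribute) - association(group_b, attribute)

male, female = ["john", "mike"], ["mary", "lisa"]
print(bias_score(male, female, "engineer"))  # positive: "engineer" skews male here
print(bias_score(female, male, "nurse"))     # positive: "nurse" skews female here
```

With real embeddings and large public name lists, the same differential-association score can be computed against many attribute words at once, surfacing gender, racial, and religious skews without any hand-labeled supervision.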

Joint work with: Maria De Arteaga (CMU); Alexey Romanov (UMass Lowell); Nat Swinger (Lexington HS); Tom Heffernan (Shrewsbury HS); Christian Borgs, Jennifer Chayes, and Hanna Wallach (MSR); Alex Chouldechova (CMU); Mark Leiserson (UMD); Sahin Geyik and Krishnaram Kenthapadi (LinkedIn)

An ICERM Public Lecture - Bias in bios: fairness in a high-stakes machine-learning setting

About the Speaker

Adam Tauman Kalai received his BA from Harvard, and his MA and PhD from CMU under the supervision of Avrim Blum. After an NSF postdoctoral fellowship at MIT with Santosh Vempala, he served as an assistant professor at the Toyota Technological Institute at Chicago and then at Georgia Tech. He is now a Principal Researcher at Microsoft Research New England. His honors include an NSF CAREER award and an Alfred P. Sloan fellowship. His research focuses on artificial intelligence and machine learning algorithms.

Adam Tauman Kalai, Microsoft Research New England

Lecture Video