An analysis of 630 billion words posted online shows that people tend to think of men when they use gender-neutral terms, a sexist bias that AI models can learn
April 1, 2022
When people use neutral words like “people” and “humanity,” they tend to think of men rather than women, reflecting the sexism that exists in many societies, according to an analysis of billions of words posted online. The researchers behind the work warn that this sexist bias is being passed on to artificial intelligence models trained on the same text.
April Bailey at New York University and her colleagues used statistical algorithms to analyse a collection of 630 billion words contained in 2.96 billion web pages collected in 2017, including informal text from blogs and forums as well as more formal text written by the media, corporations and governments, mostly in English. They used a method called word embedding, which derives the expected meaning of a word from how often it occurs in context with other words.
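The idea behind word embeddings can be illustrated with a simplified sketch: represent each word as a vector of co-occurrence counts with nearby words, then compare vectors with cosine similarity. This is not the researchers' actual pipeline (they worked with embeddings trained on billions of words); the tiny corpus below is entirely hypothetical and contrived so that “person” appears in contexts resembling those of “he”.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy corpus standing in for the web-scale text analysed in the study.
corpus = [
    "a person spoke and he said his plan worked",
    "a person arrived and he said his plan worked",
    "she cooked dinner and she cleaned her kitchen daily",
]

def cooccurrence_vector(target, sentences, window=3):
    """Count the words appearing within `window` positions of `target`."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.split()
        for i, w in enumerate(words):
            if w == target:
                lo, hi = max(0, i - window), min(len(words), i + window + 1)
                for j in range(lo, hi):
                    if j != i:
                        counts[words[j]] += 1
    return counts

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

person = cooccurrence_vector("person", corpus)
he = cooccurrence_vector("he", corpus)
she = cooccurrence_vector("she", corpus)

# In this contrived corpus "person" shares its contexts with "he",
# so its vector is closer to "he" than to "she".
print(cosine(person, he) > cosine(person, she))  # True
```

At real scale, the same kind of context comparison is what lets researchers measure whether a nominally neutral word sits closer, in vector space, to male-associated words than to female-associated ones.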
They found that words such as “person,” “people,” and “humanity” were used in contexts more similar to those of words such as “man,” “he,” and “male” than to those of “woman,” “she,” and “female.” Because the use of these nominally neutral words more closely resembles that of words referring to men, people may perceive the underlying concept of a person as masculine – reflecting male-dominated societies, the team says. The researchers accounted for the possibility that men are overrepresented as authors in the dataset and found that this did not affect the results.
An open question, the team says, is to what extent the findings depend on English – other languages, such as Spanish, encode explicit gender information that could alter the results. The team also did not consider non-binary gender identities or distinguish between the biological and social aspects of sex and gender.
Bailey says it was not surprising to find evidence of sexism in English, as previous research has shown that words like “scientist” and “engineer” are more closely associated with words like “man” and “male” than with “woman” and “female.” But it should be of concern, she says, because the same collection of texts examined by this study has been used to train a range of artificial intelligence tools, from language-translation websites to conversational bots, which will inherit the bias.
“It learns from us, and we learn from it,” says Bailey. “And we’re kind of in this reciprocal cycle, reflecting it back and forth. It’s concerning because it suggests that if I snapped my fingers right now and magically got rid of everyone’s own cognitive bias to think of a person as a man rather than a woman, we would still have this bias in our society because it’s so embedded in AI tools.”
Journal reference: Science Advances, DOI: 10.1126/sciadv.abm2463