When you look up a famous person on Google, the usual reflex is to stop at their Wikipedia page. The site regularly ranks among the 10 most visited websites, but unfortunately its biographies are marked by significant inequalities.
Only 20% of biographies are about women, a percentage that drops even further in male-dominated fields (like science) or for underrepresented ethnic groups.
As part of her doctorate at the CNRS, Angela Fan, a researcher at Meta AI, joined forces with her thesis supervisor Claire Gardent (a computer science researcher) to build an open source artificial intelligence capable of generating biographies in the usual Wikipedia format.
A three-step challenge
In an article published this Wednesday, the researchers presented their results and methodology. They have also made all of their research and results public to pave the way for other researchers to achieve fairer representation on the web.
To date, groups like the Wikimedia Foundation, WikiProject Women or Women in Red have focused on removing bias from existing content. These biases are often due to misuse of language, machine translation problems or linguistic subtleties.
To generate biographies, the artificial intelligence must meet three challenges:
- gather relevant information
- generate text that correctly structures this information
- check that the text is factually correct
An AI to generate typical biographies
The artificial intelligence therefore generates new biographies instead of modifying existing ones. Each generated biography follows the same template: an introductory paragraph, information about the subject's birth and life, and then their career.
Each section is built in three steps: searching the web for relevant information, generating the text (and predicting which section comes next), and listing the citations. The researchers were thus able to generate 1,500 entries by simply entering the name of the person, his or her occupation(s) and a section title.
The output was then evaluated both automatically and by human reviewers to verify the accuracy of the content produced.
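The section-by-section process described above can be sketched in a few lines of Python. This is only an illustrative toy, not the researchers' code: the function names, the fixed section order, the keyword-overlap retrieval and the in-memory `corpus` are all assumptions made for the example, and a real system would use a web search and a trained language model instead.

```python
from dataclasses import dataclass, field

# Hypothetical fixed section order, loosely following the template in the article.
SECTION_ORDER = ["Introduction", "Early life", "Career"]


@dataclass
class Section:
    title: str
    text: str
    citations: list = field(default_factory=list)


def retrieve(query, corpus, top_k=2):
    """Toy retrieval: rank passages by how many query words they share."""
    query_words = set(query.lower().split())
    scored = []
    for url, passage in corpus.items():
        overlap = len(query_words & set(passage.lower().split()))
        if overlap > 0:
            scored.append((overlap, url, passage))
    # Highest overlap first; ties are broken by URL so the result is deterministic.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(url, passage) for _, url, passage in scored[:top_k]]


def generate(evidence):
    """Toy generation: join the retrieved passages (stand-in for a language model)."""
    text = " ".join(passage for _, passage in evidence)
    return text or "No supporting evidence found."


def predict_next_section(title):
    """Toy next-section prediction: walk the fixed order, None when finished."""
    index = SECTION_ORDER.index(title)
    return SECTION_ORDER[index + 1] if index + 1 < len(SECTION_ORDER) else None


def write_biography(name, occupation, first_section, corpus):
    """Build a biography one section at a time: retrieve, generate, cite."""
    sections = []
    title = first_section
    while title is not None:
        query = f"{name} {occupation} {title}"
        evidence = retrieve(query, corpus)
        text = generate(evidence)
        citations = [url for url, _ in evidence]
        sections.append(Section(title, text, citations))
        title = predict_next_section(title)
    return sections
```

Calling `write_biography("Jane Doe", "chemist", "Introduction", corpus)` with a dictionary that maps source URLs to text passages returns one `Section` per template entry, each carrying the URLs it drew on. Keeping the citations next to the text is what makes the later fact-checking step possible.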
AI has its limits
An artificial intelligence is perfectly capable of generating correct individual sentences. Things become more complicated when the sentences get longer or when a complete, coherent text has to be produced.
Although the result may look correct at first glance, many redundancies or contradictions can appear along the way. The difficulty is all the greater because the number of secondary sources is often too small to verify the facts accurately.
Because Wikipedia supports 309 different languages, multilingual sources are also a problem. While English, French and German are particularly well represented, other languages (such as those spoken in Africa) are harder to cover because sources are too scarce.
In the end, all of this work is just a first step that “addresses only part of a multi-faceted problem,” Fan said, but it’s interesting to see that tools like artificial intelligence can be used to move towards reducing inequalities.
Source: VentureBeat
Wikipedia: artificial intelligence to fight against sexism – CNET France