Here’s Why AI Can Be Easily Biased By Annotation Instructions

Research in machine learning and AI, now a key technology in virtually every industry and business, is far too voluminous for anyone to read it all. This column, Perceptron (formerly Deep Science), aims to collect some of the most relevant recent discoveries and papers – particularly in artificial intelligence, but not only – and explain why they matter.

This week, a new study reveals how bias, a common problem in AI systems, can start with the instructions given to the people recruited to annotate the data from which AI systems learn to make predictions. The co-authors found that annotators pick up on patterns in the instructions, which condition them to provide annotations that then become overrepresented in the data, biasing the AI system toward those annotations.

Today, many AI systems “learn” to make sense of images, videos, text, and sounds from examples that have been labeled by annotators. The labels allow systems to extrapolate relationships between examples (for example, the link between the caption “kitchen sink” and a photo of a kitchen sink) to data the systems have not seen before (for example, photos of kitchen sinks that were not included in the data used to “train” the model).

It works remarkably well. But annotation is an imperfect approach – annotators bring in biases that can trickle down into the trained system. For example, studies have shown that the average annotator is more likely to label sentences in African American Vernacular English (AAVE), the informal grammar used by some Black Americans, as toxic, leading AI toxicity detectors trained on those labels to see AAVE as disproportionately toxic.

It turns out that annotators’ predispositions may not be solely responsible for the presence of bias in training labels. In a preprint study out of Arizona State University and the Allen Institute for AI, researchers investigated whether a source of bias might lie in the instructions written by dataset creators to guide annotators. These instructions typically include a brief description of the task (for example, “Tag all the birds in these photos”) along with several examples.

The researchers looked at 14 different “benchmark” datasets used to measure the performance of natural language processing systems – AI systems that classify, summarize, translate, and otherwise analyze or manipulate text. By studying the instructions given to annotators who worked on the datasets, they found evidence that the instructions nudged the annotators toward specific patterns, which then spread through the datasets. For example, more than half of the annotations in Quoref, a dataset designed to test whether AI systems can understand when two or more expressions refer to the same person (or thing), begin with the phrase “What is the name,” a phrase found in a third of the instructions for the dataset.
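A minimal sketch of the kind of check described here: measure what fraction of annotations open with a phrase lifted from the annotator instructions. The function name and the toy annotations are illustrative, not from the paper.

```python
def instruction_phrase_rate(annotations, phrase):
    """Fraction of annotations whose opening words match a phrase
    taken from the annotator instructions."""
    phrase = phrase.lower()
    hits = sum(1 for a in annotations if a.lower().startswith(phrase))
    return hits / len(annotations)

# Toy annotations mimicking the Quoref pattern reported in the paper.
annotations = [
    "What is the name of the person who won?",
    "What is the name of the dog?",
    "Who wrote the letter?",
    "What is the name of the ship?",
]
rate = instruction_phrase_rate(annotations, "What is the name")
print(f"{rate:.0%} of annotations start with the instruction phrase")
```

A rate far above chance for a single opening phrase, as the researchers found for Quoref, is a signal that annotators were echoing the instructions rather than writing varied examples.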

This phenomenon, which the researchers call “instruction bias,” is particularly troubling because it suggests that systems trained on biased instruction/annotation data may not perform as well as initially thought. Indeed, the co-authors found that instruction bias inflates measured performance and that systems often fail to generalize beyond the patterns in the instructions.

On the bright side, large systems, like OpenAI’s GPT-3, proved generally less susceptible to instruction bias. But the study is a reminder that AI systems, like people, can develop biases from sources that aren’t always obvious. The challenge is to find those sources and mitigate their impact downstream.

In a less ominous paper, Swiss scientists concluded that facial recognition systems are not easily fooled by realistic AI-edited faces. “Morphing attacks,” as they are called, use AI to alter the photo on an ID card, passport, or other identity document in order to bypass security systems. The co-authors created “morphs” with AI (Nvidia’s StyleGAN 2) and tested them against four state-of-the-art facial recognition systems. The morphs did not pose a significant threat, they report, despite their lifelike appearance.

Elsewhere in computer vision, researchers at Meta have developed an AI “helper” that can remember the features of a room, including the location and context of objects, in order to answer questions. Detailed in a preprint paper, the work is likely part of Meta’s Project Nazare, which aims to develop augmented reality glasses that use AI to analyze their surroundings.

The researchers’ system, designed for use on any body-worn, camera-equipped device, analyzes footage to build “semantically rich and efficient scene memories” that “encode spatio-temporal information about objects.” The system remembers where objects are and when they appeared in the video feed, and it also stores in memory the answers to questions a user might ask about those objects. For example, asked “Where did you last see my keys?”, the system might answer that the keys were on a table in the living room that morning.
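The core idea – log object sightings with a place and a timestamp, then answer “last seen” queries by returning the most recent one – can be sketched as a toy data structure. This is an illustration of the concept, not Meta’s implementation; all names here are invented.

```python
from dataclasses import dataclass

@dataclass
class Sighting:
    obj: str     # object label, e.g. "keys"
    place: str   # where it was seen
    t: float     # timestamp in the video stream (seconds)

class SceneMemory:
    """Minimal episodic scene memory: record object sightings and
    answer 'Where did you last see X?' with the most recent one."""
    def __init__(self):
        self.sightings = []

    def observe(self, obj, place, t):
        self.sightings.append(Sighting(obj, place, t))

    def last_seen(self, obj):
        matches = [s for s in self.sightings if s.obj == obj]
        return max(matches, key=lambda s: s.t) if matches else None

mem = SceneMemory()
mem.observe("keys", "kitchen counter", t=10.0)
mem.observe("keys", "living room table", t=85.5)
hit = mem.last_seen("keys")
print(f"Last saw keys on the {hit.place} at t={hit.t}s")
```

The hard part in the real system is, of course, producing the sightings in the first place – recognizing objects and their locations from raw egocentric video – rather than querying the memory.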

Meta, which plans to bring full-featured AR glasses to market in 2024, announced its ‘egocentric’ AI plans last October with the launch of Ego4D, a long-term research project on ‘egocentric perception’ in AI. The company said at the time that the goal was to teach AI systems to, among other things, understand social cues, how the actions of the wearer of an augmented reality device can affect their surroundings, and how hands interact with objects.

From language and augmented reality to physical phenomena: an AI model assisted in an MIT study of waves – how and when they break. Though it may sound a little obscure, wave models are needed both to build structures in and near water and to capture how the ocean interacts with the atmosphere in climate models.

Normally, waves are simulated roughly by a set of equations, but the researchers trained a machine-learning model on hundreds of wave cases in a 40-foot water tank filled with sensors. By observing the waves and making predictions from empirical evidence, then comparing them to the theoretical models, the AI helped reveal where those models fall short.
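A toy illustration of that comparison step: check a classical breaking criterion (the Stokes limit, a wave steepness H/L of roughly 0.142) against observed outcomes and count disagreements. The data points are invented for illustration, and the MIT study's actual method is far more sophisticated than a single threshold.

```python
# Classical (approximate) Stokes limiting steepness for wave breaking.
STOKES_LIMIT = 0.142

observations = [  # (steepness H/L, did the wave actually break?)
    (0.10, False),
    (0.13, True),   # broke earlier than the criterion predicts
    (0.15, True),
    (0.12, False),
]

def theory_predicts_break(steepness):
    """Simple theoretical criterion: break once steepness hits the limit."""
    return steepness >= STOKES_LIMIT

mismatches = sum(
    1 for s, broke in observations if theory_predicts_break(s) != broke
)
print(f"Theory disagrees with observation in {mismatches}/{len(observations)} cases")
```

Systematic mismatches like the second case above are the kind of signal that lets a data-driven model expose shortcomings in the analytic one.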

A startup was born from research at EPFL, where Thibault Asselborn’s thesis on handwriting analysis turned into a full-fledged educational application. Using algorithms he designed, the app (called School Rebound) can identify handwriting habits and suggest corrections in just 30 seconds as a child writes on an iPad with a stylus. The corrections are presented to the child as games that help them write more clearly by reinforcing good habits.

“Our scientific model and our rigor are important, and this is what distinguishes us from other existing applications,” Asselborn said in a press release. “We have received letters from teachers who have seen their students improve by leaps and bounds. Some students even come in before class to practice.”

Another development in primary schools relates to identifying hearing problems during routine screenings. These screenings, which some readers may remember, often use a device called a tympanometer, which must be operated by a trained audiologist. If none is available, as in an isolated school district, children with hearing problems may never get the help they need in time.

Samantha Robler and Susan Emmett of Duke decided to build a tympanometer that essentially operates itself, sending data to a smartphone app where it is interpreted by an AI model. Anything of concern is flagged, and the child can receive further screening. The system is no replacement for an expert, but it is far better than nothing and could help identify hearing problems much earlier in places that lack the proper resources.
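A deliberately simplified sketch of the flagging idea: a tympanometer traces a compliance curve, and a flat trace with no clear peak is commonly associated with middle-ear problems. The thresholds and readings below are illustrative, not clinical values, and the real app's model is certainly more involved.

```python
def flag_tympanogram(compliance_curve, min_peak=0.3):
    """Flag a tympanogram for follow-up if its compliance curve has no
    clear peak (a 'flat' trace). Thresholds are illustrative only."""
    peak = max(compliance_curve)
    baseline = min(compliance_curve)
    return (peak - baseline) < min_peak

normal = [0.1, 0.3, 0.9, 0.4, 0.1]     # clear peak: looks typical
flat   = [0.1, 0.12, 0.15, 0.13, 0.1]  # no peak: refer for screening
print(flag_tympanogram(normal))  # False
print(flag_tympanogram(flat))    # True
```

The point of such a rule-of-thumb triage is exactly what the paragraph describes: it does not diagnose anything, it just routes ambiguous cases to a specialist.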
