Computer vision, one of the best-known and most useful branches of AI, is on the verge of a small revolution thanks to advances in automatic labeling systems.
For us humans, object recognition is child’s play. It is a skill we actively develop from a very young age, and the human brain carries out this kind of task with formidable precision.
It is so good at it, in fact, that machines regularly ask for our opinion on the matter; this happens every time we complete certain types of captchas. It is not for fun that Google and its peers regularly ask you which boxes contain a sailboat, a pedestrian crossing or one of those pesky traffic lights.
Internet users’ responses are then used as reference data to improve the reliability of AI-based autonomous systems. By multiplying labeled samples in this way, the big names in AI hope to make their creations ever more reliable.
But today it seems obvious that this approach is not enough on its own; if asking the general public whether an image shows a bus or a truck were sufficient to train a general-purpose, so-called “strong” AI, our civilization would have been transformed long ago.
Labeling, the ordeal of AI researchers
When it comes to developing systems that are both ultra-accurate and reliable, very few approaches are viable – and trusting exasperated internet users is definitely not one of them. Instead, you have to spend a great deal of time ensuring the reliability of the information the neural network will ingest.
If the labels are approximate, the AI will be trained on erroneous data and its results will not be meaningful. In short: “garbage in, garbage out,” as specialists in the discipline say.
This requires researchers to go through the images one by one and “draw” colored masks on them, each delimiting an element the AI is supposed to identify, as below. This is called data labeling.
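To make the idea concrete, here is a minimal sketch of what such a label actually is in code: a per-pixel array of class ids, which the “colored mask” simply renders with one color per class. The class names and palette here are invented for illustration.

```python
import numpy as np

# Hypothetical hand-labeled 4x4 image.
# Each integer is a class id: 0 = background, 1 = car, 2 = road.
mask = np.array([
    [0, 0, 2, 2],
    [0, 1, 1, 2],
    [0, 1, 1, 2],
    [0, 0, 2, 2],
])

# The "colored mask" is just this array mapped to colors, one RGB triple per class.
palette = {0: (0, 0, 0), 1: (255, 0, 0), 2: (128, 128, 128)}
colored = np.array([[palette[c] for c in row] for row in mask], dtype=np.uint8)

print(colored.shape)  # one RGB triple per labeled pixel
```

A real dataset holds one such mask per image, which is why annotating hundreds of thousands of images by hand takes so long.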
It is an extremely time-consuming process that regularly runs to hundreds of hours: the databases in question bring together thousands, even hundreds of thousands, of images. And there is no question of rushing the job; the quality of the final product depends directly on the amount of data available.
This approach therefore seems strikingly paradoxical, not to say archaic, in a field as specialized as AI research. The problem is all the more significant given that this time could be devoted to substantive work that matters far more to the technology’s development.
Researchers are therefore trying to develop systems capable of performing this excessively thankless task for them. So far, the results have been mixed in terms of quality. Moreover, this approach involves working pixel by pixel; you don’t need to be a great computer scientist to see that this quickly raises a computing-power problem. After all, it means processing hundreds of thousands of images that must all be treated consistently from start to finish.
An algorithm to pre-chew the work
The human brain remains the great specialist in this discipline. But the latest work by MIT researchers, spotted by Engadget, may have significantly narrowed the gap. With help from Cornell University and Microsoft, MIT developed an algorithm called STEGO. Its objective: to label images autonomously, in record time and with pixel-level precision.
“The idea is that these algorithms can define coherent sets largely automatically so that we don’t have to do it ourselves,” explains Mark Hamilton, lead author of the study.
To achieve this, the algorithm scans the entire dataset for recurring objects that appear across multiple images. “It then associates them to build a coherent final result across all the images it learns from,” the team explains in a press release.
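The core idea of grouping recurring visual elements without human labels can be sketched with a simple clustering example. This is not STEGO itself (which works on learned deep features), just an illustration of the principle: per-pixel feature vectors pooled from many images are grouped so that recurring objects end up sharing a label. The two “objects” below are synthetic stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

def kmeans(features, k, iters=20):
    """Group feature vectors into k clusters (plain k-means)."""
    # Deterministic init: spread initial centers across the pool.
    centers = features[np.linspace(0, len(features) - 1, k).astype(int)].copy()
    labels = np.zeros(len(features), dtype=int)
    for _ in range(iters):
        # Assign each feature vector to its nearest cluster center.
        dists = np.linalg.norm(features[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its assigned features.
        for c in range(k):
            if (labels == c).any():
                centers[c] = features[labels == c].mean(axis=0)
    return labels

# Two fake "recurring objects": feature vectors near two distinct prototypes,
# as if the same road and the same kind of car appeared in many images.
feats = np.vstack([
    rng.normal(0.0, 0.1, (50, 8)),   # e.g. "road" pixels across images
    rng.normal(5.0, 0.1, (50, 8)),   # e.g. "car" pixels across images
])
labels = kmeans(feats, k=2)
```

With well-separated prototypes, all “road” features receive one label and all “car” features the other: a coherent grouping produced with no human annotation at all.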
The researchers then compared STEGO’s results to those of other autonomous labeling systems, and the outcome was striking: STEGO proved at least twice as accurate as its peers. It is the first time an algorithm of this type has aligned almost perfectly with human-labeled control images.
This is major progress; it could allow many researchers to dramatically increase the speed at which they annotate huge datasets. But it would be simplistic to reduce the impact of autonomous systems like STEGO to mere productivity gains.
Transcend human limits once and for all
The main interest of this method is its ability to identify complex patterns that humans cannot label precisely. “If you’re looking at oncology scans, images of a planet’s surface, or high-resolution microbiological images, it’s hard to know where to look without being a real expert,” the researchers explain.
“In some areas, even human experts don’t know what the objects in question look like,” adds Hamilton. “In this type of situation, where we operate at the frontiers of science, we cannot rely on humans to understand before the machine does,” he specifies.
A self-supervised system of this kind could thus work real miracles in certain areas – just think of cancer diagnosis or environmental perception in autonomous vehicles. And that is only the tip of a huge iceberg of possible applications.
There is still work to do to get to this stage; as it stands, STEGO suffers from some limitations. For example, it can be thrown off completely by an eccentric image, such as a banana placed on the cradle of a landline telephone. That good old “garbage in, garbage out” therefore still applies. But it is probably only a matter of time before STEGO and its successors become mature enough to spark a real revolution in this important niche of artificial intelligence.