A Misfire for Galactica, Meta’s AI Trained to Read Scientific Articles


In the first year of the pandemic, more than 100,000 scientific articles were published about Covid-19. This unprecedented effort produced an unprecedented amount of information. It would have been impossible for a human being to read and understand every one of these studies. In theory, Galactica can.


Galactica is an artificial intelligence developed by Meta AI (formerly Facebook Artificial Intelligence Research) with the aim of using machine learning to “organize science”. A demo version posted online last week drew a strong backlash, with critics saying the AI produced pseudoscience, was overhyped, and was not ready for public use.


The tool is presented as a kind of evolution of the search engine, but built specifically for scientific literature. When Galactica was launched, the Meta AI team said it could summarize areas of research, solve mathematical problems, and even write scientific code.


The idea of being able to synthesize and disseminate scientific knowledge seems attractive at first glance. Today, to understand the latest research on a subject like quantum computing, one would have to read hundreds of scientific articles.


Galactica at odds with math


Alternatively, you could query Galactica by asking, for example, “What is quantum computing?” and it could sort and organize the information to deliver an answer in the form of a Wikipedia article, a literature review, or lecture notes.
When it launched the demo version of Galactica on November 15, Meta AI published an article describing the project. The dataset used to train the AI is described as a vast body of scientific knowledge including 48 million articles, textbooks, lecture notes, websites (like Wikipedia), etc. The demo website warned against taking the AI’s responses as gospel, with a sentence in bold capital letters: “NEVER FOLLOW THE ADVICE OF A LANGUAGE MODEL WITHOUT VERIFICATION”.
As soon as it went live, users asked Galactica all kinds of tough science questions. For example, one user asked, “Do vaccines cause autism?” Galactica gave a confused reply:
“To explain, the answer is no. Vaccines do not cause autism. The answer is yes. Vaccines cause autism. The answer is no.”
The clear answer is that vaccines do not cause autism. Galactica also struggled with kindergarten-level math. It gave wrong answers, suggesting that one plus two does not equal three.


A “random bullshit generator”


Galactica is what AI researchers call a “large language model” (LLM). These LLMs can read and summarize large amounts of text in order to predict the next words in a sentence. But the scientific dataset on which Galactica was trained makes it somewhat different from other LLMs. The Meta AI team says it evaluated its AI for “toxicity and bias”, and that it performed better than some other LLMs.


Yet Carl Bergstrom, a biology professor at the University of Washington who studies how information spreads, describes Galactica as a “random bullshit generator”. In his view, the way the AI has been trained to recognize words and string them together produces information that appears authoritative and convincing but is often incorrect.


And it’s easy to see how an AI like this, once made public, could be misused. A student, for example, could ask Galactica to produce lecture notes on black holes and then present them as academic work. A scientist could use it to write an article and then submit it to a scientific journal. Some scientists believe that this type of occasional abuse is more amusing than worrying. The problem is that things could get much worse.


“Galactica is still in its infancy, but more powerful AI models that organize scientific knowledge could pose serious risks,” says Dan Hendrycks, an artificial intelligence safety researcher at the University of California, Berkeley. He suggests that a more advanced version of Galactica might be able to exploit the chemistry and virology knowledge in its database to help malicious users synthesize chemical weapons or assemble bombs. He called on Meta AI to add filters to prevent this kind of misuse and suggested that researchers probe their AI for this kind of danger before releasing it. He also points out that Meta’s AI division does not have a safety team, unlike its peers, including DeepMind, Anthropic, and OpenAI.


The question of why this version of Galactica was released remains open. It seems to follow Meta CEO Mark Zuckerberg’s oft-repeated motto, “move fast and break things”. But in the field of AI, doing so is risky, even irresponsible.