How to measure the carbon impact of AI? This start-up thinks it has found the solution

Large language models (LLMs) have a well-kept secret: developing and running them requires large amounts of energy. Moreover, the true extent of these models’ carbon footprint remains a mystery. The start-up Hugging Face thinks it has found a way to calculate this footprint more accurately, by estimating the emissions produced over the model’s entire life cycle and not just during its development.

The attempt could be a step towards getting more realistic data from tech companies about the carbon footprint of their artificial intelligence (AI) products, at a time when experts are calling on the industry to better assess the environmental impact of AI. Hugging Face’s work is described in a paper that has not yet been peer reviewed.

To test its new approach, Hugging Face estimated the overall emissions of its own language model, BLOOM, which was launched earlier this year. This involved adding up many numbers: the energy used to train the model on a supercomputer, the energy needed to manufacture the supercomputer’s hardware and maintain its computing infrastructure, and the energy used to run BLOOM once it was deployed. The researchers calculated this last part using a software tool called CodeCarbon, which tracked the carbon emissions BLOOM produced in real time over an 18-day period.

Hugging Face estimated that the development of BLOOM produced 25 metric tons of carbon dioxide emissions. But the researchers found that this figure doubled when they took into account the emissions from manufacturing the computer hardware used to develop the model, the broader computing infrastructure, and the energy required to run BLOOM once the development phase was complete.
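The accounting described above amounts to a sum over life-cycle stages. Here is a minimal Python sketch that uses only the article's rounded figures: about 25 metric tons for development, and about the same again for everything else. The function name `footprint_shares` and the single lumped "other stages" entry are illustrative assumptions, not the researchers' actual breakdown.

```python
def footprint_shares(components_tonnes):
    """Return (total, shares) for a dict mapping life-cycle stage -> tonnes of CO2."""
    total = sum(components_tonnes.values())
    shares = {stage: tonnes / total for stage, tonnes in components_tonnes.items()}
    return total, shares


# Rounded figures from the article. The second entry lumps hardware
# manufacturing, infrastructure and deployment together (an assumption:
# the article only says the development figure roughly doubles).
bloom_estimate = {
    "development": 25.0,
    "other_stages_lumped": 25.0,
}

total, shares = footprint_shares(bloom_estimate)
print(f"total: {total:.0f} t")
for stage, share in shares.items():
    print(f"{stage}: {share:.0%}")
```

Replacing the lumped entry with separate per-stage estimates, where they are available, shows at a glance which stage dominates the footprint.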


Helping the AI community get a better idea of its impact on the environment

While this number may seem high for a single model (50 metric tons of carbon emissions, or the equivalent of approximately 60 flights from London to New York), it is significantly lower than the emissions associated with other LLMs of the same size. This is because BLOOM was trained on a French supercomputer powered primarily by nuclear energy, which produces no carbon emissions. Models trained in China, Australia or parts of the United States, whose energy grids rely more on fossil fuels, are likely to be more polluting.

After BLOOM was launched, Hugging Face estimated that using the model emitted around 19 kilograms of carbon dioxide per day, similar to the emissions produced by driving an average new car just over 85 kilometres.
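The equivalences quoted above can be sanity-checked with simple arithmetic. The sketch below uses only the article's rounded numbers. The variable names are illustrative, and the derived values are back-of-the-envelope figures implied by those numbers, not results from the paper.

```python
# Rounded figures quoted in the article.
TOTAL_TONNES = 50.0        # life-cycle estimate for BLOOM
DEVELOPMENT_TONNES = 25.0  # development-only estimate
FLIGHTS = 60               # London-New York flights used as an equivalent
DAILY_USE_KG = 19.0        # emissions per day once deployed
CAR_KM = 85.0              # distance driven by an average new car

# Implied per-unit values (1 metric ton = 1000 kg, 1 kg = 1000 g).
kg_per_flight = TOTAL_TONNES * 1000 / FLIGHTS
car_g_per_km = DAILY_USE_KG * 1000 / CAR_KM

# Assumes the daily rate stays constant, which the article does not claim.
days_to_match_development = DEVELOPMENT_TONNES * 1000 / DAILY_USE_KG

print(f"implied emissions per flight: {kg_per_flight:.0f} kg")
print(f"implied car emissions: {car_g_per_km:.0f} g/km")
print(f"days of use to equal development emissions: {days_to_match_development:.0f}")
```

If the daily rate held steady, it would take well over three years of use for deployment emissions to match those of development, which suggests why the length of a model's deployment matters in a life-cycle estimate.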

For comparison, OpenAI’s GPT-3 and Meta’s OPT are estimated to have emitted more than 500 and 75 metric tons of carbon dioxide, respectively, during development. The scale of GPT-3’s emissions can be partly explained by the fact that it was trained on older, less efficient hardware. But it is difficult to be certain about these figures: there is no standard way to measure carbon emissions, and the numbers are based on external estimates or, in the case of Meta, limited data published by the company.

“Our goal was to go beyond just the carbon emissions of electricity consumed during development and consider more of the lifecycle to help the AI community have a better idea of its impact on the environment and how we can start to reduce that impact,” says Sasha Luccioni, researcher at Hugging Face and lead author of the paper.

The carbon footprint of language models

The Hugging Face article sets a new standard for organizations developing AI models, says Emma Strubell, an assistant professor at Carnegie Mellon University’s school of computer science, who in 2019 wrote a seminal article on the impact of AI on the climate. Emma Strubell did not take part in this new research.

This paper “represents the most in-depth, honest, and well-researched analysis of the carbon footprint of a large machine learning (ML) model to date, as far as I know, going into far more detail … than any other article [or] report that I know of,” says Emma Strubell.

According to Lynn Kaack, an assistant professor of computer science and public policy at the Hertie School in Berlin, who was not involved in Hugging Face’s work, the article also sheds much-needed light on the extent of the carbon footprint of large language models. She says she was surprised by the scale of the life-cycle emissions figures, but feels there is still a lot to be done to understand the environmental impact of large language models in the real world.

“We need to better understand the far more complex downstream effects of AI use and abuse … It’s much harder to estimate. That’s why this part is often overlooked,” says Lynn Kaack. She co-wrote an article published last summer in Nature that proposed a way to measure the knock-on emissions caused by AI systems.

The technology sector is responsible for around 2% to 4% of global greenhouse gas emissions

For example, recommendation and advertising algorithms are often used in advertising, which in turn pushes people to consume and buy more, resulting in more carbon emissions. According to Lynn Kaack, it is also important to understand how artificial intelligence models are used. Many companies, such as Google and Meta, use AI models to rank user comments or recommend content. Taken individually, these actions consume very little energy, but because they are carried out a billion times a day, the emissions add up.

It is estimated that the technology sector as a whole is responsible for 1.8% to 3.9% of global greenhouse gas emissions. Although AI and machine learning account for only a fraction of these emissions, the carbon footprint of AI is still very high for a single field within technology.


With a better understanding of how much energy AI systems consume, companies and developers can make choices about the trade-offs they are willing to make between pollution and costs, says Sasha Luccioni.

A “warning signal” for major technology groups

The authors of the Hugging Face article hope that companies and researchers will be able to think about how they can develop large language models while limiting their carbon footprint, says Sylvain Viguier, co-author of the article and director of applications at Graphcore, a semiconductor company.

It could also encourage people to move towards more efficient approaches to AI research, such as fine-tuning existing models rather than pushing to create ever larger ones, adds Sasha Luccioni.

The conclusions of the article are “a warning sign for people who use this type of model, that is to say, most of the time, people working for large technology companies,” says David Rolnick, an assistant professor at McGill University’s school of computer science and at the Quebec Institute of Artificial Intelligence (Mila). With Lynn Kaack, he is one of the co-authors of the article published in Nature, but he did not take part in Hugging Face’s work.

“The impacts of AI are not inevitable. They result from our choices about how we use these algorithms and which algorithms we use,” concludes David Rolnick.

Article by Melissa Heikkilä, translated from English by Kozi Pastakia.

