AI offers great technological promise, including in sustainability solutions, but it is very power hungry. The question is how to train and operate AI solutions more sustainably while still harvesting the positive impact AI can have.
Today, artificial intelligence (AI) technologies are increasing the effectiveness and productivity of applications by automating manual activities that previously required cognitive capabilities considered unique to humans. However, all of this can come at a hefty carbon cost. Building and training an AI language system from scratch, for example, can generate up to 78,000 pounds of CO2 – twice as much as the average American exhales over their lifetime – while training an AI algorithm with a more compute-intensive method, such as neural architecture search, can emit as much carbon as five American cars over their lifetimes, manufacturing included.
Of course, AI models don’t just consume electricity when they are trained. Once trained and deployed, an AI model draws conclusions from new data and previously learned patterns (a process called inference). This activity requires processing power and therefore also consumes electricity during operation.
For any AI tool to work, it needs to be “trained.” AI tools (or, more precisely, machine learning algorithms) build mathematical models based on sample data, known as “training data,” to make predictions or decisions without being explicitly programmed to do so. In other words, it is through training that an AI application “becomes intelligent,” and capable, for instance, of identifying a pattern based on how it was trained.
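To make the idea concrete, here is a minimal, illustrative sketch of "training": fitting a tiny mathematical model to sample data by gradient descent. The data, learning rate, and model are invented for illustration – real AI models have millions to billions of parameters, which is what drives their compute cost.

```python
# Minimal sketch of "training": fit y = w*x + b to training data by
# gradient descent, without explicitly programming the answer.

def train(data, lr=0.05, epochs=500):
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Training data sampled from the (unknown to the model) rule y = 2x + 1
samples = [(x, 2 * x + 1) for x in range(-3, 4)]
w, b = train(samples)
print(round(w, 2), round(b, 2))  # the learned parameters approach w=2, b=1
```

Every one of those gradient updates is arithmetic that a processor must execute; scaling this loop up to billions of parameters and terabytes of data is where the energy cost comes from.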
Training an AI tool can be a compute-intensive – and therefore power-hungry – exercise. In 2018, OpenAI stated that “… since 2012, the amount of compute used in the largest AI training runs has been increasing exponentially with a 3.4-month doubling time (by comparison, Moore’s Law had a two-year doubling period). Since 2012, this metric has grown by more than 300,000 times (a two-year doubling period would yield only a sevenfold increase).”
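The arithmetic behind that quote is easy to check. A 300,000x increase implies about 18 doublings; at a 3.4-month doubling time that takes roughly five years, over which a Moore's-law pace (24-month doubling) would deliver only a single-digit multiple:

```python
import math

# Verify the OpenAI comparison: how long does a 300,000x compute
# increase take at a 3.4-month doubling time, and what would
# Moore's law (24-month doubling) yield over the same span?

doublings = math.log2(300_000)       # ~18.2 doublings needed
months = doublings * 3.4             # ~62 months at the AI pace
moores_growth = 2 ** (months / 24)   # same span at Moore's-law pace

print(round(doublings, 1), round(months / 12, 1), round(moores_growth, 1))
# ~18.2 doublings over ~5.2 years; Moore's law gives only ~6x,
# consistent with the "sevenfold increase" in the quote
```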
Advances in AI are mainly achieved through sheer scale: more data, larger models with more parameters, and/or more compute. In May 2020, OpenAI announced the biggest AI model in history. Known as GPT-3, it has 175 billion parameters. By comparison, the largest version of GPT-2 had 1.5 billion parameters, and the previously largest transformer-based language model – Microsoft’s Turing-NLG, introduced in February 2020 – has 17 billion parameters.
Relying on increasingly complex models to train more and more sophisticated AI tools can result in increased compute consumption, which in turn increases energy consumption and expands the carbon footprint. As a 2019 study by E. Strubell and colleagues noted, training a single AI model can generate up to 626,155 pounds of CO2 emissions. Today, there are, of course, smaller AI models that generate less carbon output. However, at the time the study was conducted, GPT-2 was the largest model available and it was treated as an upper boundary on model size. Just a year later, GPT-3 is more than 100 times larger than that predecessor.
The relationship between model performance and model complexity (measured as number of parameters or inference time) is generally understood to be, at best, logarithmic: for a linear gain in performance, an exponentially larger model is required, at some point yielding diminishing returns at an increased computational and carbon cost. In other words, if we do not change the way we design, provision, and operate these networks, the carbon impact will increase.
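A toy calculation makes the consequence of a logarithmic relationship vivid. If performance grows with the logarithm of parameter count, then each additional fixed "point" of performance multiplies the model size – and hence the compute and carbon bill – by a constant factor. The relation below is illustrative, not an empirical scaling law:

```python
# Toy illustration: assume performance = log10(parameter count).
# Then every +1 of performance costs 10x more parameters, so cost
# grows exponentially while the benefit grows only linearly.

def params_needed(performance):
    # Invert the assumed relation performance = log10(params)
    return 10 ** performance

for perf in range(6, 12):
    print(f"performance {perf}: {params_needed(perf):.0e} parameters")
```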
There are three main ways to reduce the carbon footprint of AI:
1. Measure and report the CO2 footprint, to raise awareness.
2. Change the way AI models are designed and trained.
3. Use more efficient and more environmentally sustainable compute platforms.
The ability to measure and report the CO2 footprint would help increase public awareness of the severity of the problem. In a paper issued in July 2019 by the Allen Institute for AI, the authors proposed using floating point operations (FPOs) as the most universal and useful energy-efficiency metric. Another group developed a Machine Learning Emissions Calculator, which aims to provide better visibility of the potential CO2 footprint. And a team of researchers from Stanford, Facebook AI Research, and McGill University created a tool that measures both how much electricity a machine learning project uses and what that means in terms of carbon emissions; their paper, published in early 2020, pointed out that counting FPOs alone might not lead to accurate results.
Earlier this year, researchers from MIT published a paper contending that their once-for-all (OFA) network could significantly reduce the carbon footprint of training. According to ScienceBlog, “Using the system to train a computer-vision model, [the MIT researchers] estimated that the process required roughly 1/1,300 the carbon emissions compared to today’s state-of-the-art neural architecture search approaches.”
Decoupling model performance from ever-growing compute demands through new ways of building neural networks (for instance, MIT’s OFA network) shows the IT industry’s clear aspiration to reverse the current negative trend.
A much more radical approach was advocated by the AI guru, Geoffrey Hinton, in 2017: “The future depends on some graduate student who is deeply suspicious of everything I have said … My view is throw it all away and start again.” It seems that simply throwing more compute power at more complex AI scenarios is not the way forward. Perhaps we should use the human brain as a reference; comprising only about 2% of a person’s weight, it requires only about 20W to function – barely enough for a lightbulb.
In addition to changing the way we build and run neural networks, using more efficient and more environmentally friendly compute power could also reduce the carbon footprint. According to the United States Environmental Protection Agency, one kilowatt-hour of energy consumption generates 0.954 pounds of CO2 emissions. This US-wide average reflects the varying carbon footprints and relative proportions of different electricity sources across the US (e.g., nuclear, renewables, natural gas, coal). The 2019 paper by E. Strubell and colleagues applied this average to calculate the carbon emissions of various AI models based on their energy needs.
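The conversion itself is simple multiplication, which also lets us work backwards from a reported emissions figure to the energy it implies:

```python
# Back-of-envelope conversion used in the Strubell et al. analysis:
# CO2 (lbs) = energy (kWh) * 0.954 lbs CO2 per kWh (US grid average).

CO2_LBS_PER_KWH = 0.954

def co2_lbs(kwh):
    return kwh * CO2_LBS_PER_KWH

# Working backwards: the 626,155-lb training figure cited above
# corresponds to roughly 656,000 kWh of electricity.
kwh = 626_155 / CO2_LBS_PER_KWH
print(round(kwh))  # ~656,347 kWh
```

That is on the order of the annual electricity consumption of dozens of US households, all attributable to a single training exercise.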
If, however, an AI model were provisioned on a compute platform that was both extremely efficient (a 1.1 or lower PUE rating) and used only electricity produced by renewables (for instance, wind or sun), it would have a far lower CO2 footprint.
Using a compute platform that is highly efficient (as measured by PUE), runs on renewable electricity, and accounts for environmental impact across procurement, provisioning, and disposal will reduce the overall carbon footprint. Moving the learning environment to a public cloud provider that operates an efficient and sustainable data center network can likewise cut its CO2 emissions.
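The combined effect of PUE and grid carbon intensity can be sketched as a simple product. The numbers below are illustrative assumptions, not measured values: 1.67 is a commonly cited industry-average PUE, 1.1 a best-in-class figure, 0.954 lbs/kWh is the US grid average used above, and a fully renewable supply approaches zero operational emissions:

```python
# Sketch: facility efficiency (PUE) and grid carbon intensity
# together determine the operational emissions of a training run.

def training_co2_lbs(it_energy_kwh, pue, grid_lbs_per_kwh):
    # PUE = total facility energy / IT equipment energy, so the
    # facility draws it_energy_kwh * pue from the grid in total.
    return it_energy_kwh * pue * grid_lbs_per_kwh

it_kwh = 100_000  # hypothetical IT energy for one training run

average_dc = training_co2_lbs(it_kwh, pue=1.67, grid_lbs_per_kwh=0.954)
efficient_renewable = training_co2_lbs(it_kwh, pue=1.1, grid_lbs_per_kwh=0.02)

print(round(average_dc), round(efficient_renewable))
# The efficient, renewable-powered platform emits a small fraction
# of what an average data center on the average US grid would.
```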
Because sustainability is becoming a central concern across the entire IT sector, many cloud providers publicize both their efficiency ratings and their per-data-center CO2 footprint online, and a substantial number have either already achieved 100% renewable energy for their global infrastructure or have clear ambitions to do so.
AI has the potential to deliver significant value to us all. Based on our research, 53% of organizations have now moved beyond AI pilots and are already reaping real, tangible benefits. But AI also holds the promise of far wider, global benefits: it has the capacity to help reduce worldwide greenhouse gas (GHG) emissions by 4% by 2030. However, that value comes, at least for now, at the expense of CO2 emissions and the ensuing environmental impact. We can mitigate this impact by increasing our overall awareness, changing the way we train AI models, and using more efficient and more environmentally sustainable compute capabilities. Or, as Geoffrey Hinton put it, by going back to the drawing board.
For more in-depth and detailed research related to AI, please see the Capgemini Research Institute’s publications on the subject.
Notes:
1. For instance, repetitive or low-value tasks.
2. These are theoretical figures; in reality, training an AI tool today will probably consume less.
Driven by business impact, passionate about technology, and focused on real outcomes, I have successfully led multiple €500m+ global accounts, programs, and projects as CTO, Chief and Enterprise Architect, and Head of Architecture. Among more than 6,000 architects across the Capgemini Group, I am one of only two globally certified as both Master Architect and IAF Master. I work directly with customers, enabling business-driven IT transformation from sales through delivery, across multiple sectors and technologies.

As a certified Master Architect and IAF Master, I am a member of Capgemini’s Global Architecture Board, and as a senior leader within Capgemini’s Global Architecture Community I play a key role in setting the direction of the architecture profession across the Group and developing our future talent.

Recent work includes:
- CTO/Chief Architect for a number of large accounts, including Heathrow, Schneider Electric, the Environment Agency, and the Learning and Skills Council
- Designed and planned the migration of 400 applications across 10,000 servers
- Created the technology strategy for a large UK-based utility company
- Led the definition of a new agile-based target operating model for a large global bank
- Designed and deployed one of the largest Guidewire-based insurance solutions onto a public cloud platform

Since 2005 I have held a number of CTO, Chief Architect, and Enterprise Architect roles. I am skilled across the full lifecycle, from strategy, business development, and sales through solution scoping and shaping to delivery and operations. With over 30 years of IT experience, I combine an extensive and deep-seated technical understanding covering application, data, and infrastructure with strong consulting, sector, stakeholder-management, people-management, and leadership skills, in order to maximize my clients’ return on investment. I am fluent in English and German.

In my spare time I am an avid runner, having completed 23 half and 7 full marathons. I am also a lead author of Capgemini’s TechnoVision trend series.