From European HPC experts pondering “can fast be green?” to new milestones on the Green500 list, sustainability certainly had a moment at the hybrid SC21 conference. And it’s no wonder: the exascale era is here, and power consumption for HPC is skyrocketing even as efficiency is driven to its extremes. At SC21, another session – “HPC’s Growing Sustainability Challenges and Emerging Approaches” – tackled the topic, bringing together experts across the HPC sector to discuss how the rapidly expanding field could control its energy use and emissions.
“The context is – as I’m sure you’re all aware – that there’s aggressive government goals to decarbonize their economies and reduce negative environmental impacts broadly, and computing is sort of in a special place in this,” explained the session’s host, Andrew Chien, a senior scientist at Argonne National Laboratory. “Computing and HPC together have challenges around sustainability because of their rapidly proliferating use and very, very fast technology lifecycles.”
Chien was joined by Steve Hammond, a senior research advisor at the National Renewable Energy Laboratory; Bill Magro, chief technologist for HPC at Google; Michael McNamara, CEO and founder of Lancium; and Erik Riedel, senior vice president of engineering at ITRenew.
HPC has a big footprint.
“It’s actually particularly timely that we have this panel on sustainability,” Chien continued, “because [the international climate conference] COP26 has been going on, and maybe just completed … but entire economies – including of course HPC and computing – have made the commitment to get to net zero by various times, whether it’s 2050, 2060, or 2070.”
HPC, Chien continued, was no small slice of the pie: using the Top500 as a proxy, researchers had estimated the systems’ aggregate power draw at around 600MW, or roughly 5.2 terawatt-hours per year, with the top ten systems accounting for around 100MW of that load. This, he said, would account for around two million metric tons of CO2 per year – equivalent to around 285,000 average households, and that from only 500 publicly ranked systems.
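The arithmetic behind those figures is easy to check. The short Python sketch below uses only the numbers quoted in the session and assumes the 600MW load is constant all year. The carbon intensity and per-household values it prints are back-calculated from those numbers, not figures Chien stated.

```python
# Figures quoted in the session (Top500 used as a proxy).
AGGREGATE_LOAD_MW = 600
ANNUAL_CO2_TONNES = 2_000_000
EQUIVALENT_HOUSEHOLDS = 285_000

HOURS_PER_YEAR = 24 * 365  # 8,760 hours; leap years ignored


def annual_energy_twh(load_mw: float) -> float:
    """Energy drawn over one year by a constant load, in terawatt-hours."""
    return load_mw * HOURS_PER_YEAR / 1_000_000  # MWh -> TWh


energy_twh = annual_energy_twh(AGGREGATE_LOAD_MW)

# Implied values, derived from the quoted figures (not stated by the speaker).
implied_intensity_kg_per_kwh = (ANNUAL_CO2_TONNES * 1_000) / (energy_twh * 1e9)
implied_tonnes_per_household = ANNUAL_CO2_TONNES / EQUIVALENT_HOUSEHOLDS

print(f"Annual energy: {energy_twh:.2f} TWh")
print(f"Implied carbon intensity: {implied_intensity_kg_per_kwh:.2f} kg CO2/kWh")
print(f"Implied emissions per household: {implied_tonnes_per_household:.1f} t CO2/year")
```

A constant 600MW works out to about 5.26 terawatt-hours per year, consistent with the quoted 5.2.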
And HPC’s footprint doesn’t stop at energy use (“scope two” emissions). “Scope three” emissions, which include embedded carbon from along the value chain, are also high for the sector. “We have the challenging practice of a fast-moving technology, which is that we dispose of our systems after a relatively short lifetime,” Chien said, noting that there was a “very, very significant amount of energy” corresponding to production of components for HPC systems (along with e-waste, of course).
Choosing when to step matters.
But, it turns out, with a big footprint, choosing when to step can be crucial. Normally, Chien said, computing loads are constant, making it hard to go carbon-free as the power grids hosting those computing loads fluctuate in terms of carbon intensity throughout the day. To that end, Chien detailed a project he had been working on called zero-carbon cloud (ZCCloud).
“Instead of having all of your datacenters as constant loads in the grid,” he said, “we might be able to have computing equipment modulate its consumption to the availability of excess renewable power.”
“If you take this study to its logical conclusion,” he continued, there are some “amazing opportunities.” The team had looked at workloads at Argonne for about a year, examining expected performance, throughput and energy use between a traditional, reliably-powered system and a system powered by intermittent renewable energy. The latter system, Chien said, “effectively eliminates 50 percent of the carbon emissions footprint for this aggregate computing system.”
“This approach – a holistic approach that looks at TCO from capital equipment on through operational costs and power and so on – could lead you to the conclusion that you could build a system with higher throughput per million dollars of TCO per year and … higher peak performance per million dollars of TCO per year. So this is in addition to the 50 percent emissions reduction.”
“If you look at this in the right way,” he concluded, “there may be opportunities that not only reduce the carbon emissions footprint, but also create new opportunities for new capability and even cost-effectiveness.”
Magro, in his talk, agreed with both the stated problem and the proposed solution. “Even if you want to commit to being carbon-free, the problem is there isn’t carbon-free energy available all day,” he said. “The energy supply is very spiky. … If you just present a stable load, then of course you can’t run carbon-free.”
Google, he explained, had moved away from constant loads in many cases as a way of combating this reality. “In our datacenters, we actually align with the availability of low-carbon energy and we compute more when the wind is blowing and when the sun is shining,” he said.
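The idea that Chien and Magro describe, computing more when low-carbon energy is available, can be illustrated with a toy scheduler. This is a minimal sketch, not ZCCloud's or Google's actual method. The hourly carbon-intensity forecast, the job size and the function names are all hypothetical, and the job is assumed to be pausable so it can run in any set of hours.

```python
from typing import List


def pick_low_carbon_hours(intensity: List[float], hours_needed: int) -> List[int]:
    """Return the indices of the lowest-intensity hours, in time order."""
    if not 0 <= hours_needed <= len(intensity):
        raise ValueError("hours_needed must fit within the forecast window")
    ranked = sorted(range(len(intensity)), key=lambda h: intensity[h])
    return sorted(ranked[:hours_needed])


def emissions_kg(intensity: List[float], hours: List[int], power_kw: float) -> float:
    """Emissions from drawing power_kw during each of the given hours."""
    return sum(intensity[h] for h in hours) * power_kw


# Hypothetical 24-hour forecast in kg CO2 per kWh (midday solar dip).
forecast = [
    0.45, 0.44, 0.43, 0.42, 0.40, 0.35, 0.28, 0.20, 0.15, 0.10, 0.08, 0.07,
    0.07, 0.08, 0.12, 0.18, 0.26, 0.36, 0.44, 0.48, 0.50, 0.49, 0.47, 0.46,
]

JOB_HOURS = 6      # hypothetical deferrable job
JOB_POWER_KW = 100.0

chosen = pick_low_carbon_hours(forecast, JOB_HOURS)
shifted = emissions_kg(forecast, chosen, JOB_POWER_KW)
baseline = emissions_kg(forecast, list(range(JOB_HOURS)), JOB_POWER_KW)

print("Run during hours:", chosen)
print(f"Shifted: {shifted:.0f} kg CO2, run-immediately baseline: {baseline:.0f} kg CO2")
```

With this made-up forecast, the job moves from the early-morning hours into the midday window, and its emissions fall from roughly 249 kg to roughly 52 kg. Real schedulers also have to weigh deadlines, queue fairness and forecast error, none of which are modeled here.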
Choosing where to step matters.
But time isn’t the only variable in play – location matters, too. Hammond, who helps to manage the world’s most energy-efficient datacenter at NREL’s campus in Golden, Colorado, detailed the wide range of efficiency choices made to give the datacenter its stunning power usage effectiveness (PUE) of 1.036 (that is, just 3.6 percent energy overhead on top of the IT equipment’s own draw)… with one major caveat. “No one approach fits every solution wherever you are,” he said. “What works for us here in Colorado may be different from what works in the Southeastern U.S.”
This, as it turns out, was the entire pitch for another speaker – McNamara – who flipped that idea on its head: what if regionally specific sustainability considerations played a major role in siting workloads?
McNamara began by outlining the characteristics of the modern (and future) power grid. First, he said, solar and wind were getting overbuilt to compensate for their intermittency; second, generation was no longer necessarily colocated with urban centers; and third, as a result of these trends, negative-priced energy – energy that renewable generators needed to offload, but for which there was no demand – was a growing problem.
“Negative-priced energy is now endemic,” he said. “And it’s endemic because the grid of the future has too much wind and solar. … The way to solve this problem is to build, at extremely large scale, many datacenters at critical points on the transmission system in every grid operator that is going decarbonized. Which is all of them.”
This, he said, was the pitch for his company, Lancium. “Our vision is: datacenters should act as enormous inverse power plants,” he said. “What power plants do is, when the grid is in times of need, power plants go up. The grid is indifferent if a power plant goes up or a datacenter goes down. … They consume negative-priced energy; they provide the grid inertia that the grid needs because of the retirement of fossil plants; and the net effect, if you run these in a highly flexible manner, is negative eight million tons [of carbon] a year if you build 5000MW of datacenters. And that’s in Texas alone. And that’s what we’re doing.”
McNamara said that this solution – which could, ostensibly, run massive workloads at zero energy cost – was preferable to costly, time-consuming transmission line upgrades and more foolproof than current storage solutions, which he said were good for about four hours. Further, he explained that costly sustainability solutions were not necessary in Lancium’s datacenters, which ran hot and “moved a ton of air.”
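The “inverse power plant” behavior McNamara describes can be sketched as a load that runs when power is cheap or negatively priced and drops off when the grid is tight. The snippet below is an illustration only, not Lancium's control logic. The hourly prices, the 100MW capacity and the curtailment threshold are hypothetical, and a high price is used as a stand-in for “times of need.”

```python
from typing import List, Tuple


def flexible_load_mw(price: float, capacity_mw: float, curtail_above: float) -> float:
    """Run at full capacity unless the price exceeds the threshold."""
    return capacity_mw if price <= curtail_above else 0.0


def run(prices: List[float], capacity_mw: float, curtail_above: float) -> Tuple[float, float]:
    """Return (energy in MWh, cost in dollars) over hourly price intervals."""
    energy_mwh = 0.0
    cost = 0.0
    for price in prices:
        load = flexible_load_mw(price, capacity_mw, curtail_above)
        energy_mwh += load        # one-hour intervals, so MW equals MWh
        cost += load * price
    return energy_mwh, cost


# Hypothetical hourly prices in $/MWh, including negative-priced hours.
prices = [-10.0, -5.0, 0.0, 20.0, 150.0, 30.0]
CAPACITY_MW = 100.0  # hypothetical datacenter size

flex_mwh, flex_cost = run(prices, CAPACITY_MW, curtail_above=25.0)
const_mwh, const_cost = run(prices, CAPACITY_MW, curtail_above=float("inf"))

print(f"Flexible: {flex_mwh:.0f} MWh for ${flex_cost:,.0f}")
print(f"Constant: {const_mwh:.0f} MWh for ${const_cost:,.0f}")
```

In this toy example the flexible load gives up a third of its compute hours but pays a small fraction of what the constant load pays, because it is paid to consume during the negative-priced hours and avoids the price spike.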
But all of that, of course, only addresses the scope two emissions. Scope three emissions were an entirely different beast: “It boggles my mind how much we just have to throw away when we get a new system,” Hammond said. Riedel explained how his company, ITRenew, worked to decommission datacenter equipment for companies like Facebook, Google and Twitter, shredding hard drives and repurposing other equipment to give it a second life.
Chien, earlier in the panel, had pitched this same notion, outlining how datacenter operators could deploy phased-out hardware at a secondary facility with low-cost, on-site renewable energy for another few years. This, he said, would be a win-win-win: low-cost equipment, low-cost energy and lower (annualized) embedded emissions thanks to longer lifetimes for the components.
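The “lower (annualized) embedded emissions” part of that argument is simple division: the carbon emitted in manufacturing is fixed, so spreading it over more years of service lowers the per-year figure. A minimal sketch, with a hypothetical embodied-carbon value and hypothetical lifetimes:

```python
def annualized_embodied_kg(embodied_kg: float, lifetime_years: float) -> float:
    """Embodied (manufacturing) emissions spread evenly over the service life."""
    if lifetime_years <= 0:
        raise ValueError("lifetime_years must be positive")
    return embodied_kg / lifetime_years


EMBODIED_KG = 1200.0       # hypothetical embodied CO2e for one server
FIRST_LIFE_YEARS = 4       # hypothetical primary deployment
SECOND_LIFE_YEARS = 3      # hypothetical extra years at a secondary site

short_life = annualized_embodied_kg(EMBODIED_KG, FIRST_LIFE_YEARS)
long_life = annualized_embodied_kg(EMBODIED_KG, FIRST_LIFE_YEARS + SECOND_LIFE_YEARS)
reduction = 1 - long_life / short_life

print(f"{short_life:.0f} kg/year over {FIRST_LIFE_YEARS} years")
print(f"{long_life:.0f} kg/year over {FIRST_LIFE_YEARS + SECOND_LIFE_YEARS} years")
print(f"Annualized reduction: {reduction:.0%}")
```

Under these assumptions, extending a four-year life to seven cuts the annualized embodied emissions by about 43 percent, regardless of the embodied-carbon value chosen.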