
Performance per watt: Difference between revisions

m Update Green500 external link
acs-link surgery
 
(37 intermediate revisions by 29 users not shown)
Line 1: Line 1:
{{Short description|Computer energy efficiency}}
{{redirect-distinguish|Power efficiency|Mechanical efficiency}}
In [[computing]], '''performance per watt''' is a measure of the [[Energy conversion efficiency|energy efficiency]] of a particular [[computer architecture]] or [[computer hardware]]. Literally, it measures the rate of computation that can be delivered by a computer for every [[watt]] of power consumed. This rate is typically measured by performance on the [[LINPACK]] benchmark when trying to compare between computing systems.
{{use dmy dates |date=February 2021}}

In [[computing]], '''performance per watt''' is a measure of the [[Energy conversion efficiency|energy efficiency]] of a particular [[computer architecture]] or [[computer hardware]]. Literally, it measures the rate of computation that can be delivered by a computer for every [[watt]] of power consumed. When comparing computing systems, this rate is typically measured by performance on the [[LINPACK]] benchmark; an example is the [[Green500]] list of supercomputers. Performance per watt has been suggested to be a more sustainable measure of computing than [[Moore's law|Moore's Law]].<ref>{{Cite web|last1=Aitken|first1=Rob|date=2021-07-12|title=Performance per Watt Is the New Moore's Law|url=https://www.arm.com/blogs/blueprint/performance-per-watt|access-date=2021-07-16|website=Arm Blueprint|language=en-US}}</ref>


System designers building [[parallel computing|parallel computers]], such as [[Google search technology#Production hardware|Google's hardware]], pick CPUs based on their performance per watt of power, because the cost of powering the CPU outweighs the cost of the CPU itself.<ref>[http://news.cnet.com/Power+could+cost+more+than+servers,+Google+warns/2100-1010_3-5988090.html Power could cost more than servers, Google warns], CNET, 2006</ref>

Spaceflight computers have hard limits on the maximum power available and also have hard requirements on minimum real-time performance. A ratio of processing speed to required electrical power is more useful than raw processing speed.<ref name="sc-7">
D. J. Shirley; and M. K. McLelland.
[https://digitalcommons.usu.edu/cgi/viewcontent.cgi?article=2656&context=smallsat "The Next-Generation SC-7 RISC Spaceflight Computer"].
p. 1, 2.
</ref>


==Definition==
The performance and power consumption metrics used depend on the definition; reasonable measures of performance are [[FLOPS]], [[Instructions per second|MIPS]], or the score for any [[Benchmark (computing)|performance benchmark]]. Several measures of power usage may be employed, depending on the purposes of the metric; for example, a metric might only consider the electrical power delivered to a machine directly, while another might include all power necessary to run a computer, such as cooling and monitoring systems. The power measurement is often the average power used while running the benchmark, but other measures of power usage may be employed (e.g. peak power, idle power).
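The choice of power figure can change the resulting metric noticeably. The following Python sketch illustrates this with a hypothetical benchmark score and hypothetical power samples; the function name <code>performance_per_watt</code>, the sample values, and the cooling overhead are assumptions made for illustration, not measurements of any real system.

```python
def performance_per_watt(performance, power_watts):
    """Divide a performance score by a power figure in watts."""
    if power_watts <= 0:
        raise ValueError("power must be positive")
    return performance / power_watts


# Hypothetical benchmark result and power samples (illustrative only).
benchmark_gflops = 500.0
power_samples_w = [180.0, 220.0, 250.0, 230.0, 200.0]
cooling_overhead_w = 84.0  # assumed extra draw for cooling and monitoring

average_power_w = sum(power_samples_w) / len(power_samples_w)
peak_power_w = max(power_samples_w)

# The same score yields three different metrics depending on the power basis.
per_watt_average = performance_per_watt(benchmark_gflops, average_power_w)
per_watt_peak = performance_per_watt(benchmark_gflops, peak_power_w)
per_watt_facility = performance_per_watt(
    benchmark_gflops, average_power_w + cooling_overhead_w
)

print(f"average power basis:  {per_watt_average:.2f} GFLOPS/W")
print(f"peak power basis:     {per_watt_peak:.2f} GFLOPS/W")
print(f"facility power basis: {per_watt_facility:.2f} GFLOPS/W")
```

Under these assumed numbers the metric is highest on the average-power basis and lowest when cooling overhead is included, which is why a published figure should state which power measurement it uses.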


For example, the early [[UNIVAC I]] computer performed approximately 0.015 operations per watt-second (performing 1,905 operations per second (OPS), while consuming 125&nbsp;kW). The [[Fujitsu]] [[FR-V]] [[VLIW]]/[[vector processor]] [[system on a chip]], in its four-core FR550 variant released in 2005, performs 51 giga-OPS while consuming 3 watts, resulting in 17 billion operations per watt-second.<ref>[http://www.fujitsu.com/global/news/pr/archives/month/2005/20050207-01.html Fujitsu Develops Multi-core Processor for High-Performance Digital Consumer Products] Fujitsu</ref><ref>[http://www.fujitsu.com/downloads/MAG/vol42-2/paper03.pdf FR-V Single-Chip Multicore Processor:FR1000] {{webarchive|url=https://web.archive.org/web/20150402150434/http://www.fujitsu.com/downloads/MAG/vol42-2/paper03.pdf |date=2015-04-02 }} Fujitsu</ref> This is an improvement of more than a trillion-fold in 54 years.
For example, the early [[UNIVAC I]] computer performed approximately 0.015 operations per watt-second (performing 1,905 operations per second (OPS), while consuming 125&nbsp;kW). The [[Fujitsu]] [[FR-V]] [[VLIW]]/[[vector processor]] [[system on a chip]], in its four-core FR550 variant released in 2005, performs 51 giga-OPS while consuming 3 watts, resulting in 17 billion operations per watt-second.<ref>{{cite press release |author=<!--Staff writer(s); no by-line.--> |date=2005-02-07 |title=Fujitsu Develops Multi-core Processor for High-Performance Digital Consumer Products |url=https://www.fujitsu.com/global/about/resources/news/press-releases/2005/0207-01.html |url-status=live |publisher=Fujitsu |archive-url=https://web.archive.org/web/20190325233011/http://www.fujitsu.com/global/about/resources/news/press-releases/2005/0207-01.html |archive-date=2019-03-25 |access-date=2020-08-08}}</ref><ref>[http://www.fujitsu.com/downloads/MAG/vol42-2/paper03.pdf FR-V Single-Chip Multicore Processor:FR1000] {{webarchive|url=https://web.archive.org/web/20150402150434/http://www.fujitsu.com/downloads/MAG/vol42-2/paper03.pdf |date=2015-04-02 }} Fujitsu</ref> This is an improvement of more than a trillion-fold in 54 years.
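The arithmetic behind this comparison can be checked with a short Python sketch. The input figures are the ones quoted above; the variable names are chosen here for illustration.

```python
# Figures quoted in the text above.
univac_ops_per_second = 1_905
univac_power_w = 125_000          # 125 kW
frv_ops_per_second = 51e9         # 51 giga-OPS
frv_power_w = 3

# Operations per watt-second, which is the same as operations per joule.
univac_ops_per_joule = univac_ops_per_second / univac_power_w
frv_ops_per_joule = frv_ops_per_second / frv_power_w
improvement = frv_ops_per_joule / univac_ops_per_joule

print(f"UNIVAC I: {univac_ops_per_joule:.4f} operations per joule")
print(f"FR-V:     {frv_ops_per_joule:.3g} operations per joule")
print(f"Ratio:    {improvement:.3g}")
```

The ratio comes out slightly above 10<sup>12</sup>, consistent with the "more than a trillion" figure.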


Most of the power a computer uses is converted into heat, so a system that takes fewer watts to do a job will require less cooling to maintain a given [[operating temperature]]. Reduced cooling demands make it easier to [[quiet PC|quiet a computer]]. Lower energy consumption can also make it less costly to run, and reduce the environmental impact of powering the computer (see [[green computing]]).
Line 14: Line 23:
Computing energy consumption is sometimes also measured by reporting the energy required to run a particular benchmark, for instance [[EEMBC]] EnergyBench. Energy consumption figures for a standard workload may make it easier to judge the effect of an improvement in [[Electrical efficiency|energy efficiency]].
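As a rough illustration of how an energy figure for one benchmark run could be derived from sampled power readings, the Python sketch below integrates power over time with the trapezoidal rule. The sampling data, the operation count, and the function name <code>energy_joules</code> are assumptions for illustration; this is not how EnergyBench itself is implemented.

```python
def energy_joules(timestamps_s, power_w):
    """Integrate sampled power over time using the trapezoidal rule."""
    if len(timestamps_s) != len(power_w) or len(timestamps_s) < 2:
        raise ValueError("need at least two matching samples")
    total = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


# Hypothetical power trace for one benchmark run, sampled once per second.
timestamps_s = [0.0, 1.0, 2.0, 3.0, 4.0]
power_w = [10.0, 12.0, 14.0, 12.0, 10.0]
operations_completed = 96_000  # assumed size of the standard workload

run_energy_j = energy_joules(timestamps_s, power_w)
ops_per_joule = operations_completed / run_energy_j

print(f"energy for the run: {run_energy_j:.1f} J")
print(f"efficiency: {ops_per_joule:.0f} operations per joule")
```

Because the workload is fixed, a lower energy figure for the same run indicates an efficiency improvement directly.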


Performance (in operations/second) per watt can also be written as operations/watt-second, or operations/joule, since 1 watt = 1 joule/second.
When performance is defined as {{Sfrac|operations|[[second]]}}, then performance per watt can be written as {{Sfrac|operations|[[watt-second]]}}. Since a watt is one {{Sfrac|[[joule]]|second}}, then performance per watt can also be written as {{Sfrac|operations|joule}}.
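Written out as a unit calculation, with FLOPS per watt as one common instance:

<math display="block">
\frac{\text{operations}/\text{s}}{\text{W}}
= \frac{\text{operations}/\text{s}}{\text{J}/\text{s}}
= \frac{\text{operations}}{\text{J}},
\qquad
1\ \text{GFLOPS}/\text{W} = 10^{9}\ \text{floating-point operations per joule}.
</math>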


==FLOPS per watt==
Line 22: Line 31:


===Examples===
{{As of|2016|06}}, the two highest-ranked supercomputers on the [[#Green500_List|Green500]] list are both at [[RIKEN]] and both combine the Japanese [[PEZY-SCnp]] [[manycore]] accelerator with Intel Xeon processors<!--computers both named ZettaScaler-1.6-->; the top one achieves 6673.8&nbsp;MFLOPS/watt. The third-ranked system, at 6051.3&nbsp;MFLOPS/watt, is the Chinese [[Sunway TaihuLight]], a much bigger machine that is ranked 2nd on the [[TOP500]] (the other two are not on that list).<ref>{{cite web |url=https://www.top500.org/green500/lists/2016/06/ |title=Green500 List for June 2016}}</ref>
{{As of|2016|06}}, the two highest-ranked supercomputers on the Green500 list are both at [[RIKEN]] and both combine the Japanese [[PEZY-SCnp]] [[manycore]] accelerator with Intel Xeon processors<!--computers both named ZettaScaler-1.6-->; the top one achieves 6673.8&nbsp;MFLOPS/watt. The third-ranked system, at 6051.3&nbsp;MFLOPS/watt, is the Chinese [[Sunway TaihuLight]], a much bigger machine that is ranked 2nd on the [[TOP500]] (the other two are not on that list).<ref>{{cite web |url=https://www.top500.org/green500/lists/2016/06/ |title=Green500 List for June 2016}}</ref>


In June 2012, the Green500 list rated [[Blue Gene/Q|BlueGene/Q, Power BQC 16C]] as the most efficient supercomputer on the TOP500 in terms of FLOPS per watt, running at 2,100.88&nbsp;MFLOPS/watt.<ref>{{cite web | url = http://www.green500.org/lists/2012/06/top/list.php | work = Green500 | title = The Green500 List | url-status = dead | archiveurl = https://web.archive.org/web/20120703053827/http://www.green500.org/lists/2012/06/top/list.php | archivedate = 2012-07-03 }}</ref>
In June 2012, the Green500 list rated [[Blue Gene/Q|BlueGene/Q, Power BQC 16C]] as the most efficient supercomputer on the TOP500 in terms of FLOPS per watt, running at 2,100.88&nbsp;MFLOPS/watt.<ref>{{cite web | url = http://www.green500.org/lists/2012/06/top/list.php | work = Green500 | title = The Green500 List | url-status = dead | archive-url = https://web.archive.org/web/20120703053827/http://www.green500.org/lists/2012/06/top/list.php | archive-date = 2012-07-03 }}</ref>


In November 2010, the IBM machine [[IBM Blue Gene#Blue Gene/Q|Blue Gene/Q]] achieved 1,684&nbsp;MFLOPS/watt.<ref>{{cite web| url=http://www.serverwatch.com/hreviews/article.php/3913536/Top500-Supercomputing-List-Reveals-Computing-Trends.htm| title = Top500 Supercomputing List Reveals Computing Trends| date = 20 July 2010|quote=IBM... BlueGene/Q system .. setting a record in power efficiency with a value of 1,680&nbsp;Mflops/watt, more than twice that of the next best system.}}</ref><ref>{{cite web| url=http://www.datacenterknowledge.com/archives/2010/11/18/ibm-system-clear-winner-in-green-500/|title = IBM Research A Clear Winner in Green 500|date = 2010-11-18}}</ref>
On 9 June 2008, CNN reported that [[IBM Roadrunner|IBM's Roadrunner]] supercomputer achieves 376&nbsp;MFLOPS/watt.<ref>{{cite news | url = http://www.cnn.com/2008/TECH/06/09/fastest.computer.ap/index.html | work = CNN | title = Government unveils world's fastest computer |quote= performing 376 million calculations for every watt of electricity used. |archiveurl = https://web.archive.org/web/20080610155646/http://www.cnn.com/2008/TECH/06/09/fastest.computer.ap/index.html |archivedate = 2008-06-10}}</ref><ref>{{cite web|url = http://www.hpcwire.com/topic/processors/IBM_Roadrunner_Takes_the_Gold_in_the_Petaflop_Race.html|title = IBM Roadrunner Takes the Gold in the Petaflop Race|url-status = dead|archiveurl = https://web.archive.org/web/20080613131535/http://www.hpcwire.com/topic/processors/IBM_Roadrunner_Takes_the_Gold_in_the_Petaflop_Race.html|archivedate = 2008-06-13}}</ref>


On 9 June 2008, CNN reported that [[IBM Roadrunner|IBM's Roadrunner]] supercomputer achieves 376&nbsp;MFLOPS/watt.<ref>{{cite news | url = http://www.cnn.com/2008/TECH/06/09/fastest.computer.ap/index.html | work = CNN | title = Government unveils world's fastest computer |quote= performing 376 million calculations for every watt of electricity used. |archive-url = https://web.archive.org/web/20080610155646/http://www.cnn.com/2008/TECH/06/09/fastest.computer.ap/index.html |archive-date = 2008-06-10}}</ref><ref>{{cite web|url = http://www.hpcwire.com/topic/processors/IBM_Roadrunner_Takes_the_Gold_in_the_Petaflop_Race.html|title = IBM Roadrunner Takes the Gold in the Petaflop Race|url-status = dead|archive-url = https://web.archive.org/web/20080613131535/http://www.hpcwire.com/topic/processors/IBM_Roadrunner_Takes_the_Gold_in_the_Petaflop_Race.html|archive-date = 2008-06-13}}</ref>
In November 2010, the IBM machine [[IBM Blue Gene#Blue Gene/Q|Blue Gene/Q]] achieved 1,684&nbsp;MFLOPS/watt.<ref>{{cite web| url=http://www.serverwatch.com/hreviews/article.php/3913536/Top500-Supercomputing-List-Reveals-Computing-Trends.htm| title = Top500 Supercomputing List Reveals Computing Trends|quote=IBM... BlueGene/Q system .. setting a record in power efficiency with a value of 1,680&nbsp;Mflops/watt, more than twice that of the next best system.}}</ref><ref>{{cite web| url=http://www.datacenterknowledge.com/archives/2010/11/18/ibm-system-clear-winner-in-green-500/|title = IBM Research A Clear Winner in Green 500|date = 2010-11-18}}</ref>


As part of [[Intel]]'s [[Intel Tera-Scale|Tera-Scale]] research project, the team produced an 80-core CPU that can achieve over 16,000&nbsp;MFLOPS/watt.<ref>{{cite web | url = http://www.tgdaily.com/content/view/30929/135/ | work = TG Daily | title = Intel squeezes 1.8 TFlops out of one processor | url-status = dead | archiveurl = https://web.archive.org/web/20071203023419/http://www.tgdaily.com/content/view/30929/135/ | archivedate = 2007-12-03 }}</ref><ref>{{cite web | url = http://techresearch.intel.com/articles/Tera-Scale/1449.htm | work = Intel Technology and Research | title = Teraflops Research Chip}}</ref> The future of that CPU is not certain.
As part of the [[Intel Tera-Scale]] research project, the team produced an 80-core CPU that can achieve over 16,000&nbsp;MFLOPS/watt.<ref>{{cite web | url = http://www.tgdaily.com/content/view/30929/135/ | work = TG Daily | title = Intel squeezes 1.8 TFlops out of one processor | url-status = dead | archive-url = https://web.archive.org/web/20071203023419/http://www.tgdaily.com/content/view/30929/135/ | archive-date = 2007-12-03 }}</ref><ref>{{cite web | url = http://techresearch.intel.com/articles/Tera-Scale/1449.htm | work = Intel Technology and Research | title = Teraflops Research Chip}}</ref> The future of that CPU is not certain.


Microwulf, a low cost desktop [[Beowulf (computing)|Beowulf cluster]] of four dual-core [[Athlon 64 X2]] 3800+ computers, runs at 58&nbsp;MFLOPS/watt.<ref name="microwulf">{{cite web | url = http://www.calvin.edu/~adams/research/microwulf/power/ | title = Microwulf: Power Efficiency | author = Joel Adams | work = Microwulf: A Personal, Portable Beowulf Cluster }}</ref>
Line 36: Line 45:
Kalray has developed a 256-core VLIW CPU that achieves 25,000&nbsp;MFLOPS/watt. Next generation is expected to achieve 75,000&nbsp;MFLOPS/watt.<ref>{{cite web | url = http://www.kalray.eu/products/mppa-manycore/ | title = MPPA MANYCORE - Many-core processors - KALRAY - Agile Performance}}</ref> However, in 2019 their latest chip for embedded is 80-core and claims up to 4&nbsp;TFLOPS at 20&nbsp;W.<ref>{{Cite web|url=https://www.kalray.eu/kalray-announces-the-tape-out-of-coolidge-on-tsmc-16nm-process-technology/|title=Kalray announces the Tape-Out of Coolidge on TSMC 16NM process technology|date=2019-07-31|website=Kalray|language=en-US|access-date=2019-08-12}}</ref>
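The figures in this section mix MFLOPS/watt, GFLOPS/watt, and separate throughput and power claims. A small Python helper, sketched below with an illustrative function name, puts them on a common MFLOPS/watt scale; the inputs are the figures quoted above.

```python
def mflops_per_watt(flops, watts):
    """Convert a throughput in FLOPS and a power in watts to MFLOPS/W."""
    if watts <= 0:
        raise ValueError("power must be positive")
    return flops / watts / 1e6


# Kalray's 2019 claim quoted above: up to 4 TFLOPS at 20 W.
kalray_2019 = mflops_per_watt(4e12, 20)

# Earlier Kalray figures, already stated in MFLOPS/W in the text.
kalray_256_core = 25_000
kalray_next_gen_target = 75_000

print(f"2019 chip claim: {kalray_2019:,.0f} MFLOPS/W")
print(f"ratio to 256-core chip: {kalray_2019 / kalray_256_core:.1f}x")
```

On this scale the 2019 claim corresponds to 200,000&nbsp;MFLOPS/watt (200&nbsp;GFLOPS/watt). The quoted claims do not state whether they use the same numeric precision, so the comparison is only indicative.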


[[Adapteva]] announced the [[Adapteva#Epiphany V|Epiphany V]], a 1024-core 64-bit RISC processor intended to achieve 75&nbsp;GFLOPS/watt,<ref>{{cite web|last1=Olofsson|first1=Andreas|title=Epiphany-V: A 1024-core 64-bit RISC processor|url=https://www.parallella.org/2016/10/05/epiphany-v-a-1024-core-64-bit-risc-processor/|accessdate=6 October 2016}}</ref><ref>{{cite web|last1=Olofsson|first1=Andreas|title=Epiphany-V: A 1024 processor 64-bit RISC System-On-Chip|url=https://www.parallella.org/wp-content/uploads/2016/10/e5_1024core_soc.pdf|accessdate=6 October 2016}}</ref> but later announced that the Epiphany&nbsp;V was "unlikely" to become available as a commercial product.
[[Adapteva]] announced the [[Adapteva#Epiphany V|Epiphany V]], a 1024-core 64-bit RISC processor intended to achieve 75&nbsp;GFLOPS/watt,<ref>{{cite web|last1=Olofsson|first1=Andreas|title=Epiphany-V: A 1024-core 64-bit RISC processor|url=https://www.parallella.org/2016/10/05/epiphany-v-a-1024-core-64-bit-risc-processor/|access-date=6 October 2016}}</ref><ref>{{cite web|last1=Olofsson|first1=Andreas|title=Epiphany-V: A 1024 processor 64-bit RISC System-On-Chip|url=https://www.parallella.org/wp-content/uploads/2016/10/e5_1024core_soc.pdf|access-date=6 October 2016}}</ref> but later announced that the Epiphany&nbsp;V was "unlikely" to become available as a commercial product.


US Patent [http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&p=1&u=%2Fnetahtml%2FPTO%2Fsearch-bool.html&r=1&f=G&l=50&co1=AND&d=PTXT&s1=matteo&s2=gravina&OS=matteo+AND+gravina&RS=matteo+AND+gravina 10,020,436], July 2018 claims three intervals of 100, 300, and 600 GFLOPS/watt.

==Green500 List==
The Green500 list ranks computers from the [[TOP500]] list of supercomputers in terms of [[Electrical efficiency|energy efficiency]], typically measured as [[LINPACK]] FLOPS per watt.<ref>{{cite web | url = https://www.top500.org/green500/ | title = The Green 500}}</ref><ref>{{cite web | url = http://www.itnews.com.au/News/65619,green-500-list-ranks-supercomputers.aspx | work = iTnews Australia | title = Green 500 list ranks supercomputers | url-status = dead | archiveurl = https://web.archive.org/web/20081022193316/http://www.itnews.com.au/News/65619,green-500-list-ranks-supercomputers.aspx | archivedate = 2008-10-22 }}</ref>

{{As of|2012|11}}, an [[Appro International, Inc.]] Xtreme-X supercomputer (''[[Beacon (supercomputer)|Beacon]]'') topped the Green500 list with 2499 LINPACK MFLOPS/W.<ref name="nics green500">{{cite web|title=University of Tennessee Supercomputer Sets World Record for Energy Efficiency|url=https://www.nics.tennessee.edu/beacon|work=National Institute for Computational Sciences News|publisher=University of Tennessee & Oak Ridge National Laboratory|accessdate=21 November 2012}}</ref> Beacon is deployed by NICS of the University of Tennessee and is a GreenBlade GB824M, [[Xeon#E5-16xx/24xx/26xx/46xx-series "Sandy Bridge-EP"|Xeon E5-2670]] based, eight cores (8C), 2.6&nbsp;GHz, Infiniband FDR, Intel Xeon Phi 5110P computer.<ref name="top green500">{{cite web|title=Beacon - Appro GreenBlade - Green500 list|url=http://www.top500.org/system/177997|publisher=top500.org|accessdate=21 November 2012}}</ref>

{{As of|2013|06}}, the [[Eurotech (company)|Eurotech]] supercomputer Eurora at [[Cineca]] topped the Green500 list with 3208 LINPACK MFLOPS/W.<ref name="Eurotech Eurora tops green 500">{{cite web|title=Eurotech Eurora, the PRACE prototype deployed by Cineca and INFN, scores first in Green500 list|url=http://www.hpc.cineca.it/news/eurotech-eurora-prace-prototype-deployed-cineca-and-infn-scores-first-green500-list|work=Cineca|publisher=Cineca|accessdate=28 June 2013}}</ref> The Cineca Eurora supercomputer is equipped with two Intel Xeon E5-2687W CPUs and two PCI-e connected NVIDIA Tesla K20 accelerators per node. Water cooling and electronics design allow very high densities to be reached, with a peak performance of 350 TFLOPS per rack.<ref name="top green500Aurora">{{cite web|title=Eurora - Aurora Tigon - Top500 list|url=http://www.top500.org/system/178077|publisher=top500.org|accessdate=28 June 2013}}</ref>

{{As of|2014|11}}, the L-CSC supercomputer of the [[Helmholtz Association of German Research Centres|Helmholtz Association]] at the [[Gesellschaft für Schwerionenforschung|GSI]] in [[Darmstadt]], Germany, topped the Green500 list with 5271&nbsp;MFLOPS/W and was the first cluster to surpass an efficiency of 5&nbsp;GFLOPS/W. It runs on [[List of Intel Xeon microprocessors#Xeon E5-26xx v2 (dual-processor).5B32.5D|Intel Xeon E5-2690]] processors with the [[Ivy Bridge (microarchitecture)|Intel Ivy Bridge]] architecture and [[List of AMD graphics processing units#FirePro Server Series|AMD FirePro S9150]] GPU accelerators. It uses in-rack water cooling and [[Cooling Tower|cooling towers]] to reduce the energy required for cooling.<ref name="The Green500 List - November 2014">{{cite web|title=The Green500 List - November 2014|url=http://www.green500.org/news/green500-list-november-2014|url-status=dead|archiveurl=https://web.archive.org/web/20150222022522/http://www.green500.org/news/green500-list-november-2014|archivedate=2015-02-22}}</ref>

{{As of|2015|08}}, the [[Shoubu supercomputer]] of [[RIKEN]] outside [[Tokyo]], Japan, topped the Green500 list with 7032&nbsp;MFLOPS/W. The then-top three supercomputers of the list used PEZY-SC accelerators (which are [[general-purpose computing on graphics processing units|GPU]]-like and use [[OpenCL]])<ref>{{Cite web|url=https://streamhpc.com/blog/2015-08-02/the-knowns-and-unknowns-of-the-pezy-sc-accelerator-at-riken/|title=The knowns and unknowns of the PEZY-SC accelerator at RIKEN|last=Hindriksen|first=Vincent|date=2015-08-02|website=StreamHPC|language=en-GB|access-date=2019-10-21}}</ref> by [[PEZY Computing]] with 1024 cores each and 6–7&nbsp;GFLOPS/W efficiency.<ref>{{cite web|url=http://www.hpcwire.com/2015/08/04/japan-takes-top-three-spots-on-green500-list/|title=Japan Takes Top Three Spots on Green500 List|last=Tiffany |first=Tiffany |date=August 4, 2015|publisher=HPCWire|accessdate=8 January 2016}}</ref><ref>{{cite web|url=http://insidehpc.com/2015/09/pezy-exascaler-step-up-on-the-green500-list-with-immersive-cooling/|title=PEZY & ExaScaler Step Up on the Green500 List with Immersive Cooling|date=September 23, 2015|publisher=InsideHPC|accessdate=8 January 2016}}</ref>

{{As of|2019|06}}, DGX SaturnV Volta, using "NVIDIA DGX-1 [[Volta (microarchitecture)|Volta]]36, Xeon E5-2698v4 20C 2.2GHz, Infiniband EDR, [[NVIDIA Tesla]] V100", tops the Green500 list with 15,113&nbsp;MFLOPS/W, while ranked only 469th on the Top500.<ref>{{Cite web|url=https://www.top500.org/green500/lists/2019/06/|title=June 2019 {{!}} TOP500 Supercomputer Sites|website=www.top500.org|access-date=2019-08-12}}</ref> It is only slightly more efficient than the much bigger [[Summit (supercomputer)|Summit]], which is ranked 2nd on the Green500 with 14,719&nbsp;MFLOPS/W and 1st on the Top500, and which uses IBM [[POWER9]] CPUs together with Nvidia Tesla V100 GPUs.

{| class="wikitable sortable" style="width:100%;font-size:98%;"
|+Top 10 positions of GREEN500 in November 2019<ref>{{cite web|title=November 2019|url=https://www.top500.org/green500/lists/2019/11/|website=www.top500.org|language=en|access-date=2019-12-13|date=}}</ref>
|-
! scope=col data-sort-type="number" | Rank
! scope=col width="50" data-sort-type="number" align="center" | Performance<br/>per Watt<br/><sup>([[FLOPS|GFLOPS]]/[[Watt]])</sup><br/>
! scope=col class="unsortable" | Name
! scope=col | Model<br/>Processors, Interconnect
! scope=col width="40" align="center" | Vendor
! scope=col class="unsortable" | Site<br/>Country, year
! scope=col width="50" data-sort-type="number" align="center" | Rmax<br/><sup>([[FLOPS|PFLOPS]])</sup>
|-
! 1
| {{formatnum:16.876}}
| ''A64FX prototype''
| '''Fujitsu A64FX'''<br/> Fujitsu A64FX 48C 2GHz, Tofu interconnect D
|[[Fujitsu]]
|[[Numazu]]<br/>&nbsp;{{JPN}}, 2018
| {{formatnum:1.999}}
|-
! 2
| {{formatnum:16.256}}
| ''NA-1''
| '''ZettaScaler-2.2'''<br/> Xeon D-1571 16C 1.3GHz, [[Infiniband]] EDR, PEZY-SC2 700Mhz
|PEZY Computing K.K.
|JAMSTEC Yokohama Institute for Earth Sciences, [[Yokohama]]<br/>&nbsp;{{JPN}}, 2019
| {{formatnum:1.303}}
|-
! 3
| {{formatnum:15.771}}
| ''AiMOS''
| '''IBM Power System AC922'''<br/> IBM POWER9 20C 3.45GHz, Dual-rail [[Mellanox Technologies|Mellanox]] EDR Infiniband, NVIDIA Volta GV100
|[[IBM]]
|[[Rensselaer Polytechnic Institute]], [[Troy, New York|Troy]],<br/>{{USA}}, 2018
| {{formatnum:8.045}}
|-
! 4
| {{formatnum:15.574}}
| ''Satori''
| '''IBM Power System AC922'''<br/> IBM POWER9 20C 2.4GHz, Infiniband EDR, NVIDIA Tesla V100 SXM2
|[[IBM]]
|MIT/MGHPCC, [[Holyoke, Massachusetts]],<br/>{{USA}},2018
| {{formatnum:1.464}}
|-
! 5
| {{formatnum:14.719}}
| ''[[Summit (supercomputer)|Summit]]''
| '''[[IBM POWER microprocessors|IBM Power]] System AC922'''<br/> IBM POWER9 22C 3.07GHz, NVIDIA Volta GV100, Dual-rail [[Mellanox]] EDR Infiniband
|[[International Business Machines|IBM]]
|[[Oak Ridge National Laboratory]], [[Oak Ridge, Tennessee]]<br/>&nbsp;{{USA}}, 2018
| {{formatnum:148.600}}
|-
! 6
| {{formatnum:14.423}}
| ''[[AI Bridging Cloud Infrastructure|AI Bridging Cloud Infrastructure (ABCI)]]''
| '''Primergy CX2570 M4'''<br />[[Xeon|Xeon Gold]], [[Nvidia Tesla|Tesla V100 SXM2]],Infiniband EDR
| [[Fujitsu]]
| Joint Center for Advanced High Performance Computing, [[Kashiwa]]<br />&nbsp;{{JPN}}, 2018
| {{formatnum:19.880}}
|-
! 7
| {{formatnum:14.131}}
| '' [[MareNostrum |MareNostrum P9 CTE]]''
| '''IBM Power System AC922'''<br/> IBM POWER9 22C 3.1GHz, Dual-rail Mellanox EDR Infiniband, NVIDIA Tesla V100
|[[IBM]]
|[[Barcelona Supercomputing Center]], [[Barcelona]],<br/>&nbsp;{{ESP}},2019
| {{formatnum:1.145}}
|-
! 8
| {{formatnum:13.704}}
| ''[[Tsubame (supercomputer)|TSUBAME3.0]]''
| '''SGI ICE XA'''<br/>IP139-SXM2, Xeon E5-2680v4 14C 2.4GHz, Intel Omni-Path, NVIDIA Tesla P100 SXM2
|[[Hewlett-Packard]]
|[[Tokyo Institute of Technology]], [[Tokyo]],<br/>&nbsp;{{JPN}},2017
| {{formatnum:8.045}}
|-
! 9
| {{formatnum:13.065}}
| ''PANGEA III''
| '''IBM Power System AC922'''<br/> IBM POWER9 18C 3.45GHz, Dual-rail Mellanox EDR Infiniband, NVIDIA Volta GV100
|[[IBM]]
| [[Total S.A.]], [[Pau, Pyrénées-Atlantiques|Pau]],<br/>&nbsp;{{FRA}},2019
| {{formatnum:17.860}}
|-
! 10
| {{formatnum:12.723}}
| ''[[Sierra (supercomputer)|Sierra]]''
| '''IBM Power System AC922'''<br/> IBM POWER9 22C 3.1GHz, NVIDIA Volta GV100, Dual-rail Mellanox EDR Infiniband
|[[IBM]]
|[[Lawrence Livermore National Laboratory]], [[Livermore, California|Livermore]],<br/>&nbsp;{{USA}}, 2018
| {{formatnum:94.640}}
|}
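Because the table lists both efficiency and Rmax, an approximate power draw during the benchmark run can be derived by dividing one by the other. The Python sketch below does this for three rows; the efficiency and Rmax values are copied from the table, while the derived power figures are estimates computed here and may differ from the power values published on the list because of rounding.

```python
# (name, GFLOPS per watt, Rmax in PFLOPS), copied from the table above.
systems = [
    ("A64FX prototype", 16.876, 1.999),
    ("Summit", 14.719, 148.600),
    ("Sierra", 12.723, 94.640),
]


def implied_power_kw(gflops_per_watt, rmax_pflops):
    """Estimate benchmark power in kW from efficiency and Rmax."""
    rmax_gflops = rmax_pflops * 1e6      # 1 PFLOPS = 10**6 GFLOPS
    watts = rmax_gflops / gflops_per_watt
    return watts / 1e3


implied = {
    name: implied_power_kw(efficiency, rmax)
    for name, efficiency, rmax in systems
}

for name, kilowatts in implied.items():
    print(f"{name}: about {kilowatts:,.0f} kW")
```

The estimate shows how systems with similar efficiency can differ by almost two orders of magnitude in absolute power: on the order of a hundred kilowatts for the A64FX prototype against roughly ten megawatts for Summit.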


==GPU efficiency==
[[Graphics processing unit]]s (GPU) have continued to increase in energy usage, while CPU designers have recently focused on improving performance per watt. High-performance GPUs may draw large amounts of power, so intelligent techniques are required to manage GPU power consumption.<ref>Mittal et al., "[https://www.academia.edu/6644474/A_Survey_of_Methods_For_Analyzing_and_Improving_GPU_Energy_Efficiency A Survey of Methods for Analyzing and Improving GPU Energy Efficiency]", ACM Computing Surveys, 2015.</ref> Measures like [[3DMark|3DMark2006 score]] per watt can help identify more efficient GPUs.<ref>{{cite web | url=http://www.codinghorror.com/blog/archives/000662.html | title=Video Card Power Consumption | first=Jeff |last=Atwood | date=2006-08-18}}</ref> However, that may not adequately incorporate efficiency in typical use, where much time is spent doing less demanding tasks.<ref>{{cite web | url = http://www.xbitlabs.com/articles/video/display/power-noise.html | title = Video card power consumption | work = Xbit Labs | url-status = dead | archiveurl = https://web.archive.org/web/20110904054636/http://www.xbitlabs.com/articles/video/display/power-noise.html | archivedate = 2011-09-04 }}</ref>
[[Graphics processing unit]]s (GPU) have continued to increase in energy usage, while CPU designers have recently{{When|date=August 2024}} focused on improving performance per watt. High-performance GPUs may draw large amounts of power, so intelligent techniques are required to manage GPU power consumption. Measures like [[3DMark|3DMark2006 score]] per watt can help identify more efficient GPUs.<ref>{{cite web | url=http://www.codinghorror.com/blog/archives/000662.html | title=Video Card Power Consumption | first=Jeff | last=Atwood | date=2006-08-18 | access-date=26 March 2008 | archive-date=8 September 2008 | archive-url=https://web.archive.org/web/20080908060043/http://www.codinghorror.com/blog/archives/000662.html | url-status=dead }}</ref> However, that may not adequately incorporate efficiency in typical use, where much time is spent doing less demanding tasks.<ref>{{cite web | url = http://www.xbitlabs.com/articles/video/display/power-noise.html | title = Video card power consumption | work = Xbit Labs | url-status = dead | archive-url = https://web.archive.org/web/20110904054636/http://www.xbitlabs.com/articles/video/display/power-noise.html | archive-date = 2011-09-04 }}</ref>


With modern GPUs, energy usage is an important constraint on the maximum computational capabilities that can be achieved. GPU designs are usually highly scalable, allowing the manufacturer to put multiple chips on the same video card, or to use multiple video cards that work in parallel. Peak performance of any system is essentially limited by the amount of power it can draw and the amount of heat it can dissipate. Consequently, performance per watt of a GPU design translates directly into peak performance of a system that uses that design.
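The relationship described above can be sketched as a simple model in which the usable power is the smaller of the electrical and thermal budgets. The Python example below is a deliberately simplified illustration: the two GPU designs, their efficiency figures, and the budgets are hypothetical, and real systems do not scale perfectly linearly with power.

```python
def peak_performance_gflops(gflops_per_watt, power_budget_w, cooling_budget_w):
    """Upper bound on performance when power and cooling are both capped.

    Assumes, as a simplification, that performance scales linearly with
    the power the system is able to draw and dissipate.
    """
    usable_w = min(power_budget_w, cooling_budget_w)
    return gflops_per_watt * usable_w


# Hypothetical system limits and GPU designs.
power_budget_w = 300.0
cooling_budget_w = 250.0   # the cooler, not the supply, is the limit here

design_a = peak_performance_gflops(20.0, power_budget_w, cooling_budget_w)
design_b = peak_performance_gflops(30.0, power_budget_w, cooling_budget_w)

print(f"design A: {design_a:.0f} GFLOPS")
print(f"design B: {design_b:.0f} GFLOPS")
```

Within the same budgets, the design with 50% better performance per watt reaches 50% higher peak performance under this model.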
Line 153: Line 57:


==Challenges==
{{Missing information|section|inflationary effect of low clock and power limits, e.g.;<ref>{{Cite web|url=https://www.reddit.com/r/hardware/comments/k3iobs/psa_performance_doesnt_scale_linearly_with/|title=PSA: Performance Doesn't Scale Linearly with Wattage (Aka testing M1 versus a Zen 3 5600X at the same Power Draw)|date=29 November 2020}}</ref> also Energy/Frequency Convexity Rule|date=November 2020}}
While performance per watt is useful, absolute power requirements are also important. Claims of improved performance per watt may be used to mask increasing power demands. For instance, though newer generation GPU architectures may provide better performance per watt, continued performance increases can negate the gains in efficiency, and the GPUs continue to consume large amounts of power.<ref>{{cite web | url = http://www.bit-tech.net/columns/2007/05/20/performance_per_what/1 | title = Performance per What? | author = Tim Smalley | work = Bit Tech | accessdate = 2008-04-21 }}</ref>
While performance per watt is useful, absolute power requirements are also important. Claims of improved performance per watt may be used to mask increasing power demands. For instance, though newer generation GPU architectures may provide better performance per watt, continued performance increases can negate the gains in efficiency, and the GPUs continue to consume large amounts of power.<ref>{{cite web | url = http://www.bit-tech.net/columns/2007/05/20/performance_per_what/1 | title = Performance per What? | author = Tim Smalley | work = Bit Tech | access-date = 2008-04-21 }}</ref>


Benchmarks that measure power under heavy load may not adequately reflect typical efficiency. For instance, 3DMark stresses the 3D performance of a GPU, but many computers spend most of their time doing less intense display tasks (idle, 2D tasks, displaying video). So the 2D or idle efficiency of the graphics system may be at least as significant for overall energy efficiency. Likewise, systems that spend much of their time in standby or [[standby power|soft off]] are not adequately characterized by just efficiency under load. To help address this some benchmarks, like [[SPECpower]], include measurements at a series of load levels.<ref>{{cite web | url = http://blogs.zdnet.com/Ou/?p=927 | work = ZDNet | title = SPEC launches standardized energy efficiency benchmark}}</ref>
Benchmarks that measure power under heavy load may not adequately reflect typical efficiency. For instance, 3DMark stresses the 3D performance of a GPU, but many computers spend most of their time doing less intense display tasks (idle, 2D tasks, displaying video). So the 2D or idle efficiency of the graphics system may be at least as significant for overall energy efficiency. Likewise, systems that spend much of their time in standby or [[standby power|soft off]] are not adequately characterized by just efficiency under load. To help address this some benchmarks, like [[SPECpower]], include measurements at a series of load levels.<ref>{{cite web | url = http://blogs.zdnet.com/Ou/?p=927 | archive-url = https://web.archive.org/web/20071216112117/http://blogs.zdnet.com/Ou/?p=927 | url-status = dead | archive-date = 16 December 2007 | work = ZDNet | title = SPEC launches standardized energy efficiency benchmark}}</ref>


The efficiency of some electrical components, such as [[voltage regulator]]s, decreases with increasing temperature, so the power used may increase with temperature. Power supplies, motherboards, and some video cards are some of the subsystems affected by this. So their power draw may depend on temperature, and the temperature or temperature dependence should be noted when measuring.<ref>{{cite web | url = http://www.silentpcreview.com/article821-page5.html | title = Asus EN9600GT Silent Edition Graphics Card | author = Mike Chin | page = 5 | work = Silent PC Review | accessdate = 2008-04-21}}</ref><ref name="SPCRNewLevels">{{cite web | url = http://www.silentpcreview.com/article814-page1.html | title = 80 Plus expands podium for Bronze, Silver & Gold | author = MIke Chin | work = Silent PC Review | date = 19 March 2008 | accessdate = 2008-04-21 }}</ref>
The efficiency of some electrical components, such as [[voltage regulator]]s, decreases with increasing temperature, so the power used may increase with temperature. Power supplies, motherboards, and some video cards are some of the subsystems affected by this. So their power draw may depend on temperature, and the temperature or temperature dependence should be noted when measuring.<ref>{{cite web | url = http://www.silentpcreview.com/article821-page5.html | title = Asus EN9600GT Silent Edition Graphics Card | author = Mike Chin | page = 5 | work = Silent PC Review | access-date = 2008-04-21}}</ref><ref name="SPCRNewLevels">{{cite web | url = http://www.silentpcreview.com/article814-page1.html | title = 80 Plus expands podium for Bronze, Silver & Gold | author = Mike Chin | work = Silent PC Review | date = 19 March 2008 | access-date = 2008-04-21 }}</ref>


Performance per watt also typically does not include full [[Life cycle assessment|life-cycle costs]]. Since computer manufacturing is energy intensive, and computers often have a relatively short lifespan, energy and materials involved in production, distribution, [[electronic waste|disposal]] and [[computer recycling|recycling]] often make up significant portions of their cost, energy use, and environmental impact.<ref>{{cite web | url=http://www.ecopcreview.com/LCA_and_ECPR | title=Life Cycle Analysis and Eco PC Review | work=Eco PC Review | author=Mike Chin | url-status=dead | archiveurl=https://web.archive.org/web/20080304062508/http://www.ecopcreview.com/LCA_and_ECPR | archivedate=2008-03-04 }}</ref><ref>{{cite journal | journal = Environ. Sci. Technol.
Performance per watt also typically does not include full [[Life cycle assessment|life-cycle costs]]. Since computer manufacturing is energy intensive, and computers often have a relatively short lifespan, energy and materials involved in production, distribution, [[electronic waste|disposal]] and [[computer recycling|recycling]] often make up significant portions of their cost, energy use, and environmental impact.<ref>{{cite web | url=http://www.ecopcreview.com/LCA_and_ECPR | title=Life Cycle Analysis and Eco PC Review | work=Eco PC Review | author=Mike Chin | url-status=dead | archive-url=https://web.archive.org/web/20080304062508/http://www.ecopcreview.com/LCA_and_ECPR | archive-date=2008-03-04 }}</ref><ref>{{cite journal | journal = Environ. Sci. Technol.
| url = http://pubs.acs.org/cgi-bin/abstract.cgi/esthag/2004/38/i22/abs/es035152j.html
| title = Energy intensity of computer manufacturing: hybrid assessment combining process and economic input-output methods | author = Eric Williams | year=2004
| title = Energy intensity of computer manufacturing: hybrid assessment combining process and economic input-output methods | author = Eric Williams | year=2004
| doi = 10.1021/es035152j
| doi = 10.1021/es035152j
Line 174: Line 78:


==Other energy efficiency measures==
SWaP (space, wattage and performance) is a [[Sun Microsystems]] metric for [[data center]]s, incorporating energy and space:
SWaP (space, wattage and performance) is a [[Sun Microsystems]] metric for [[data center]]s, incorporating power and space:


:<math>\mathrm{SWaP} = \frac{\mathrm{Performance}}{\mathrm{Space} \cdot \mathrm{Power}}</math>


Where performance is measured by any appropriate benchmark, and space is size of the computer.<ref>{{cite web|last=Greenhill|first=David|title=SWaP Space Watts and Power|url=http://www.energystar.gov/ia/products/downloads/Greenhill_Pres.pdf|work=US EPA Energystar|accessdate=14 November 2013}}</ref>
Where performance is measured by any appropriate benchmark, and space is size of the computer.<ref>{{cite web|last=Greenhill|first=David|title=SWaP Space Watts and Power|url=http://www.energystar.gov/ia/products/downloads/Greenhill_Pres.pdf|work=US EPA Energystar|access-date=14 November 2013}}</ref>

Reduction of power, mass, and volume is also important for spaceflight computers.<ref name="sc-7" />


==See also==
Line 189: Line 95:
* [[Data center infrastructure efficiency]] (DCIE)
* [[Energy proportional computing]]
* [[GeForce 9 series]]{{snd}} for GPU list, with energy use and theoretical FLOPS
* [[IT energy management]]
* [[Koomey's law]]
Line 195: Line 100:
* [[Low-power electronics]]
* [[Power usage effectiveness]] (PUE)
* [[Processor power dissipation]]


==Notes and references==
Line 220: Line 126:


==External links==
* [https://www.top500.org/lists/green500/ The Green500]
* [http://www.eweek.com/c/a/Green-IT/Top-25-Most-Energy-Efficient-Supercomputers/ 25 Energy Efficient Supercomputers]<!--maybe too outdated, but I guess people will notice it's from 2008-->
* [https://www.top500.org/lists/green500/ The Green 500 Lists]


{{CPU technologies}}

Latest revision as of 18:33, 11 September 2024

In computing, performance per watt is a measure of the energy efficiency of a particular computer architecture or computer hardware. Literally, it measures the rate of computation that can be delivered by a computer for every watt of power consumed. When comparing computing systems, this rate is typically measured by performance on the LINPACK benchmark; the Green500 list of supercomputers is one example. Performance per watt has been suggested to be a more sustainable measure of computing than Moore's Law.[1]

System designers building parallel computers, such as Google's hardware, pick CPUs based on their performance per watt of power, because the cost of powering the CPU outweighs the cost of the CPU itself.[2]

Spaceflight computers have hard limits on the maximum power available and also have hard requirements on minimum real-time performance. A ratio of processing speed to required electrical power is more useful than raw processing speed.[3]

Definition

The performance and power consumption metrics used depend on the definition; reasonable measures of performance are FLOPS, MIPS, or the score for any performance benchmark. Several measures of power usage may be employed, depending on the purposes of the metric; for example, a metric might only consider the electrical power delivered to a machine directly, while another might include all power necessary to run a computer, such as cooling and monitoring systems. The power measurement is often the average power used while running the benchmark, but other measures of power usage may be employed (e.g. peak power, idle power).

For example, the early UNIVAC I computer performed approximately 0.015 operations per watt-second (1,905 operations per second (OPS) while consuming 125 kW). The Fujitsu FR-V VLIW/vector processor system on a chip, in the four-FR550-core variant released in 2005, performs 51 giga-OPS with 3 watts of power consumption, resulting in 17 billion operations per watt-second.[4][5] This is an improvement of over a trillion times in 54 years.
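
The arithmetic behind these figures can be checked with a short sketch. The function name `ops_per_joule` is an illustrative assumption, not an established API; the input figures are the ones quoted above.

```python
def ops_per_joule(ops_per_second: float, watts: float) -> float:
    """Return operations per watt-second (equivalently, operations per joule)."""
    if watts <= 0:
        raise ValueError("power must be positive")
    return ops_per_second / watts


# Figures quoted in the text above.
univac = ops_per_joule(1_905, 125_000)  # UNIVAC I: 1,905 OPS at 125 kW
frv = ops_per_joule(51e9, 3)            # FR-V: 51 giga-OPS at 3 W

# Ratio of the two efficiencies.
improvement = frv / univac

print(f"UNIVAC I: {univac:.3f} operations per watt-second")
print(f"FR-V:     {frv:.2e} operations per watt-second")
print(f"Improvement factor: {improvement:.2e}")
```

The improvement factor comes out slightly above 10^12, consistent with the "over a trillion times" statement.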

Most of the power a computer uses is converted into heat, so a system that takes fewer watts to do a job will require less cooling to maintain a given operating temperature. Reduced cooling demands make it easier to quiet a computer. Lower energy consumption can also make it less costly to run, and reduce the environmental impact of powering the computer (see green computing). If installed where there is limited climate control, a lower-power computer will operate at a lower temperature, which may make it more reliable. In a climate-controlled environment, reductions in direct power use may also create savings in climate control energy.

Computing energy consumption is sometimes also measured by reporting the energy required to run a particular benchmark, for instance EEMBC EnergyBench. Energy consumption figures for a standard workload may make it easier to judge the effect of an improvement in energy efficiency.

When performance is defined as operations/second, performance per watt can be written as operations/watt-second. Since a watt is one joule/second, performance per watt can also be written as operations/joule.
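
Written out as a unit calculation (a sketch that assumes the `amsmath` package for `\text`), the equivalence is:

```latex
\[
\frac{\text{operations}/\text{s}}{\text{W}}
  = \frac{\text{operations}/\text{s}}{\text{J}/\text{s}}
  = \frac{\text{operations}}{\text{J}}
\]
```

By the same reasoning, 1 MFLOPS/watt corresponds to 10^6 floating-point operations per joule.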

FLOPS per watt

Exponential growth of supercomputer performance per watt based on data from the Green500 list. The red crosses denote the most power-efficient computer, while the blue ones denote the computer ranked #500.

FLOPS per watt is a common measure. Like the FLOPS (Floating Point Operations Per Second) metric it is based on, the metric is usually applied to scientific computing and simulations involving many floating point calculations.

Examples

As of June 2016, the two most efficient supercomputers on the Green500 list were both at RIKEN and both based on the Japanese PEZY-SCnp manycore accelerator in addition to Intel Xeon processors; the top one achieved 6673.8 MFLOPS/watt. The third-ranked system was the Chinese-technology Sunway TaihuLight (a much bigger machine, ranked 2nd on the TOP500; the other two are not on that list) at 6051.3 MFLOPS/watt.[6]

In June 2012, the Green500 list rated BlueGene/Q, Power BQC 16C as the most efficient supercomputer on the TOP500 in terms of FLOPS per watt, running at 2,100.88 MFLOPS/watt.[7]

In November 2010, an IBM machine, Blue Gene/Q, achieved 1,684 MFLOPS/watt.[8][9]

On 9 June 2008, CNN reported that IBM's Roadrunner supercomputer achieved 376 MFLOPS/watt.[10][11]

As part of the Intel Tera-Scale research project, the team produced an 80-core CPU that can achieve over 16,000 MFLOPS/watt.[12][13] The future of that CPU is not certain.

Microwulf, a low cost desktop Beowulf cluster of four dual-core Athlon 64 X2 3800+ computers, runs at 58 MFLOPS/watt.[14]

Kalray has developed a 256-core VLIW CPU that achieves 25,000 MFLOPS/watt. The next generation was expected to achieve 75,000 MFLOPS/watt.[15] However, in 2019 the company's latest chip for embedded systems had 80 cores and claimed up to 4 TFLOPS at 20 W.[16]

Adapteva announced the Epiphany V, a 1024-core 64-bit RISC processor intended to achieve 75 GFLOPS/watt,[17][18] although the company later announced that the Epiphany V was "unlikely" to become available as a commercial product.

US Patent 10,020,436 (July 2018) claims three intervals of 100, 300, and 600 GFLOPS/watt.
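
The examples above mix MFLOPS/watt, GFLOPS/watt, and raw performance-at-power figures. A small conversion helper makes them directly comparable; this is a minimal sketch, and the names `PREFIX` and `gflops_per_watt` are illustrative assumptions. The inputs are the vendor or list figures quoted above, taken at face value.

```python
# SI prefix multipliers for FLOPS figures.
PREFIX = {"M": 1e6, "G": 1e9, "T": 1e12}


def gflops_per_watt(value: float, prefix: str, watts: float = 1.0) -> float:
    """Convert `value` <prefix>FLOPS drawn at `watts` into GFLOPS per watt.

    With the default watts=1.0, `value` is treated as already being a
    per-watt figure (for example, 6673.8 MFLOPS/watt).
    """
    if watts <= 0:
        raise ValueError("power must be positive")
    return value * PREFIX[prefix] / watts / 1e9


riken_2016 = gflops_per_watt(6673.8, "M")   # Green500 leader, June 2016
roadrunner = gflops_per_watt(376, "M")      # Roadrunner, 2008
kalray_2019 = gflops_per_watt(4, "T", 20)   # claimed "up to" 4 TFLOPS at 20 W

for name, eff in [("RIKEN 2016", riken_2016),
                  ("Roadrunner", roadrunner),
                  ("Kalray 2019 (claimed)", kalray_2019)]:
    print(f"{name}: {eff:.3f} GFLOPS/watt")
```

Note that the Kalray figure is a claimed peak rather than a benchmark measurement, so it is not directly comparable with the LINPACK-based Green500 numbers.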

GPU efficiency

Graphics processing units (GPUs) have continued to increase in energy usage, while CPU designers have recently[when?] focused on improving performance per watt. High-performance GPUs may draw large amounts of power, so intelligent techniques are required to manage GPU power consumption. Measures like 3DMark2006 score per watt can help identify more efficient GPUs.[19] However, that may not adequately incorporate efficiency in typical use, where much time is spent doing less demanding tasks.[20]

With modern GPUs, energy usage is an important constraint on the maximum computational capabilities that can be achieved. GPU designs are usually highly scalable, allowing the manufacturer to put multiple chips on the same video card, or to use multiple video cards that work in parallel. Peak performance of any system is essentially limited by the amount of power it can draw and the amount of heat it can dissipate. Consequently, performance per watt of a GPU design translates directly into peak performance of a system that uses that design.
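
The relationship between efficiency and peak performance under a fixed power budget can be illustrated with a minimal sketch. It assumes, as an idealization, that a design's performance per watt stays constant as the system is scaled up to the budget; all numbers and names below are hypothetical.

```python
def peak_performance(gflops_per_watt: float, power_budget_watts: float) -> float:
    """Peak GFLOPS reachable within a power budget, assuming constant efficiency."""
    if power_budget_watts < 0:
        raise ValueError("power budget must be non-negative")
    return gflops_per_watt * power_budget_watts


# Hypothetical power (and heat dissipation) limit for one system.
budget_watts = 300.0

# Two hypothetical GPU designs that differ only in efficiency.
design_a = peak_performance(10.0, budget_watts)
design_b = peak_performance(15.0, budget_watts)

print(f"Design A: {design_a:.0f} GFLOPS within {budget_watts:.0f} W")
print(f"Design B: {design_b:.0f} GFLOPS within {budget_watts:.0f} W")
```

Under this assumption, a design that is 50% more efficient delivers 50% more peak performance from the same budget.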

Since GPUs may also be used for some general purpose computation, sometimes their performance is measured in terms also applied to CPUs, such as FLOPS per watt.

Challenges

While performance per watt is useful, absolute power requirements are also important. Claims of improved performance per watt may be used to mask increasing power demands. For instance, though newer generation GPU architectures may provide better performance per watt, continued performance increases can negate the gains in efficiency, and the GPUs continue to consume large amounts of power.[22]
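
A small numerical sketch shows how an efficiency gain can coexist with a higher absolute power draw. The two GPU generations and their figures are hypothetical.

```python
def perf_per_watt(performance: float, watts: float) -> float:
    """Performance (in arbitrary benchmark units) per watt."""
    if watts <= 0:
        raise ValueError("power must be positive")
    return performance / watts


# Hypothetical generations: (benchmark score, power draw in watts).
old_score, old_watts = 100.0, 100.0
new_score, new_watts = 300.0, 200.0

old_eff = perf_per_watt(old_score, old_watts)
new_eff = perf_per_watt(new_score, new_watts)

efficiency_gain = new_eff / old_eff   # how much more work per watt
power_growth = new_watts / old_watts  # how much more power is drawn

print(f"Efficiency gain: {efficiency_gain:.2f}x")
print(f"Power growth:    {power_growth:.2f}x")
```

Here the newer part is 1.5 times as efficient yet draws twice the power, so reporting only the efficiency figure would hide the increase in demand.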

Benchmarks that measure power under heavy load may not adequately reflect typical efficiency. For instance, 3DMark stresses the 3D performance of a GPU, but many computers spend most of their time doing less intense display tasks (idle, 2D tasks, displaying video). So the 2D or idle efficiency of the graphics system may be at least as significant for overall energy efficiency. Likewise, systems that spend much of their time in standby or soft off are not adequately characterized by efficiency under load alone. To help address this, some benchmarks, like SPECpower, include measurements at a series of load levels.[23]
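
The difference between a full-load figure and a figure aggregated over several load levels can be sketched as follows. The measurements are hypothetical, and the aggregate used here (total operations divided by total power across all levels, including idle) is one plausible choice assumed for illustration; it is not presented as the exact method of any particular benchmark.

```python
from typing import NamedTuple


class LoadLevel(NamedTuple):
    load: float            # fraction of full load, 0.0 = idle
    ops_per_second: float  # throughput measured at this level
    watts: float           # average power measured at this level


# Hypothetical measurements for one system.
levels = [
    LoadLevel(1.00, 1000.0, 200.0),
    LoadLevel(0.50, 500.0, 140.0),
    LoadLevel(0.10, 100.0, 100.0),
    LoadLevel(0.00, 0.0, 90.0),    # idle still draws power
]


def full_load_efficiency(levels: list[LoadLevel]) -> float:
    """Operations per joule at the highest measured load only."""
    top = max(levels, key=lambda lv: lv.load)
    return top.ops_per_second / top.watts


def aggregate_efficiency(levels: list[LoadLevel]) -> float:
    """Total throughput over total power, summed across all load levels."""
    total_ops = sum(lv.ops_per_second for lv in levels)
    total_watts = sum(lv.watts for lv in levels)
    return total_ops / total_watts


full = full_load_efficiency(levels)
aggregate = aggregate_efficiency(levels)

print(f"Full-load efficiency: {full:.2f} ops/joule")
print(f"Aggregate efficiency: {aggregate:.2f} ops/joule")
```

Because idle and low-load power does not fall in proportion to throughput, the aggregate figure is noticeably lower than the full-load one for this hypothetical system.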

The efficiency of some electrical components, such as voltage regulators, decreases with increasing temperature, so the power used may increase with temperature. Power supplies, motherboards, and some video cards are among the subsystems affected by this. Because their power draw may depend on temperature, the temperature or temperature dependence should be noted when measuring.[24][25]

Performance per watt also typically does not include full life-cycle costs. Since computer manufacturing is energy intensive, and computers often have a relatively short lifespan, energy and materials involved in production, distribution, disposal and recycling often make up significant portions of their cost, energy use, and environmental impact.[26][27]

Energy required for climate control of the computer's surroundings is often not counted in the wattage calculation, but it can be significant.[28]

Other energy efficiency measures

SWaP (space, wattage and performance) is a Sun Microsystems metric for data centers, incorporating power and space:

SWaP = Performance / (Space × Power)

where performance is measured by any appropriate benchmark, and space is the size of the computer.[29]
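
As a minimal sketch, SWaP can be computed as performance divided by the product of space and power. The function name, the choice of rack units for space, and the two example servers are hypothetical; any consistent benchmark and size unit could be substituted.

```python
def swap(performance: float, space: float, power_watts: float) -> float:
    """SWaP = performance / (space * power)."""
    if space <= 0 or power_watts <= 0:
        raise ValueError("space and power must be positive")
    return performance / (space * power_watts)


# Hypothetical servers with the same benchmark score.
# Space is given in rack units (an assumption for this example).
server_a = swap(performance=1000.0, space=2.0, power_watts=400.0)
server_b = swap(performance=1000.0, space=1.0, power_watts=500.0)

print(f"Server A SWaP: {server_a:.2f}")
print(f"Server B SWaP: {server_b:.2f}")
```

In this example server B scores higher despite drawing more power, because it delivers the same performance in half the space.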

Reduction of power, mass, and volume is also important for spaceflight computers.[3]

See also

Energy efficiency benchmarks
  • Average CPU power (ACP) – a measure of power consumption when running several standard benchmarks
  • EEMBC – EnergyBench
  • SPECpower – a benchmark for web servers running Java (Server Side Java Operations per Joule)
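Other
  • Data center infrastructure efficiency (DCIE)
  • Energy proportional computing
  • IT energy management
  • Koomey's law
  • Low-power electronics
  • Power usage effectiveness (PUE)
  • Processor power dissipation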

Notes and references

  1. ^ Aitken, Rob; Fellow; Technology, Director of; Arm (12 July 2021). "Performance per Watt Is the New Moore's Law". Arm Blueprint. Retrieved 16 July 2021.
  2. ^ Power could cost more than servers, Google warns, CNET, 2006
  3. ^ a b D. J. Shirley and M. K. McLelland. "The Next-Generation SC-7 RISC Spaceflight Computer". pp. 1, 2.
  4. ^ "Fujitsu Develops Multi-core Processor for High-Performance Digital Consumer Products" (Press release). Fujitsu. 7 February 2020. Archived from the original on 25 March 2019. Retrieved 8 August 2020.
  5. ^ FR-V Single-Chip Multicore Processor:FR1000 Archived 2015-04-02 at the Wayback Machine Fujitsu
  6. ^ "Green500 List for June 2016".
  7. ^ "The Green500 List". Green500. Archived from the original on 3 July 2012.
  8. ^ "Top500 Supercomputing List Reveals Computing Trends". 20 July 2010. IBM... BlueGene/Q system .. setting a record in power efficiency with a value of 1,680 Mflops/watt, more than twice that of the next best system.
  9. ^ "IBM Research A Clear Winner in Green 500". 18 November 2010.
  10. ^ "Government unveils world's fastest computer". CNN. Archived from the original on 10 June 2008. performing 376 million calculations for every watt of electricity used.
  11. ^ "IBM Roadrunner Takes the Gold in the Petaflop Race". Archived from the original on 13 June 2008.
  12. ^ "Intel squeezes 1.8 TFlops out of one processor". TG Daily. Archived from the original on 3 December 2007.
  13. ^ "Teraflops Research Chip". Intel Technology and Research.
  14. ^ Joel Adams. "Microwulf: Power Efficiency". Microwulf: A Personal, Portable Beowulf Cluster.
  15. ^ "MPPA MANYCORE - Many-core processors - KALRAY - Agile Performance".
  16. ^ "Kalray announces the Tape-Out of Coolidge on TSMC 16NM process technology". Kalray. 31 July 2019. Retrieved 12 August 2019.
  17. ^ Olofsson, Andreas. "Epiphany-V: A 1024-core 64-bit RISC processor". Retrieved 6 October 2016.
  18. ^ Olofsson, Andreas. "Epiphany-V: A 1024 processor 64-bit RISC System-On-Chip" (PDF). Retrieved 6 October 2016.
  19. ^ Atwood, Jeff (18 August 2006). "Video Card Power Consumption". Archived from the original on 8 September 2008. Retrieved 26 March 2008.
  20. ^ "Video card power consumption". Xbit Labs. Archived from the original on 4 September 2011.
  21. ^ "PSA: Performance Doesn't Scale Linearly with Wattage (Aka testing M1 versus a Zen 3 5600X at the same Power Draw)". 29 November 2020.
  22. ^ Tim Smalley. "Performance per What?". Bit Tech. Retrieved 21 April 2008.
  23. ^ "SPEC launches standardized energy efficiency benchmark". ZDNet. Archived from the original on 16 December 2007.
  24. ^ Mike Chin. "Asus EN9600GT Silent Edition Graphics Card". Silent PC Review. p. 5. Retrieved 21 April 2008.
  25. ^ Mike Chin (19 March 2008). "80 Plus expands podium for Bronze, Silver & Gold". Silent PC Review. Retrieved 21 April 2008.
  26. ^ Mike Chin. "Life Cycle Analysis and Eco PC Review". Eco PC Review. Archived from the original on 4 March 2008.
  27. ^ Eric Williams (2004). "Energy intensity of computer manufacturing: hybrid assessment combining process and economic input-output methods". Environ. Sci. Technol. 38 (22): 6166–74. Bibcode:2004EnST...38.6166W. doi:10.1021/es035152j. PMID 15573621.
  28. ^ Wu-chun Feng (2005). "The Importance of Being Low Power in High Performance Computing". CT Watch Quarterly. 1 (5).
  29. ^ Greenhill, David. "SWaP Space Watts and Power" (PDF). US EPA Energystar. Retrieved 14 November 2013.

Further reading
