
IBM Blue Gene


A Blue Gene/P supercomputer at Argonne National Laboratory

Blue Gene is an IBM project aimed at designing supercomputers that can reach operating speeds in the petaFLOPS range with low power consumption. The project was awarded the National Medal of Technology and Innovation, announced by U.S. President Barack Obama on September 18, 2009; the president bestowed the award on October 7, 2009.[1]

Blue Gene/L

A Blue Gene/L cabinet

The first computer in the Blue Gene series, Blue Gene/L, developed through a partnership with Lawrence Livermore National Laboratory (LLNL), originally had a theoretical peak performance of 360 TFLOPS, and scored over 280 TFLOPS sustained on the LINPACK benchmark. After an upgrade in 2007 the performance increased to 478 TFLOPS sustained and 596 TFLOPS peak.

The term Blue Gene/L sometimes refers to the computer installed at LLNL, and sometimes to the architecture of that computer. As of November 2006, 27 computers on the TOP500 list used the Blue Gene/L architecture. All of these were listed as having an architecture of eServer Blue Gene Solution.

Block diagram of the Blue Gene/L ASIC, including the dual PowerPC 440 cores.

In December 1999, IBM announced a $100 million research initiative for a five-year effort to build a massively parallel computer, to be applied to the study of biomolecular phenomena such as protein folding. The project had two main goals: to advance understanding of the mechanisms behind protein folding via large-scale simulation, and to explore novel ideas in massively parallel machine architecture and software. It was intended to enable biomolecular simulations orders of magnitude larger than the technology of the time permitted. Major areas of investigation included: how to use this novel platform to effectively meet its scientific goals, how to make such massively parallel machines more usable, and how to achieve performance targets at a reasonable cost through novel machine architectures. The design built largely on the earlier QCDSP and QCDOC supercomputers.

In November 2001, Lawrence Livermore National Laboratory joined IBM as a research partner for Blue Gene.

On September 29, 2004, IBM announced that a Blue Gene/L prototype at IBM Rochester (Minnesota) had overtaken NEC's Earth Simulator as the fastest computer in the world, with a speed of 36.01 TFLOPS on the LINPACK benchmark, beating Earth Simulator's 35.86 TFLOPS. This was achieved with an 8-cabinet system, each cabinet holding 1,024 compute nodes. Upon doubling this configuration to 16 cabinets, the machine reached a speed of 70.72 TFLOPS in November 2004, taking first place in the TOP500 list.

On March 24, 2005, the US Department of Energy announced that the Blue Gene/L installation at LLNL had broken its speed record, reaching 135.5 TFLOPS. This was achieved by doubling the number of cabinets to 32.

On the June 2006 TOP500 list,[2] Blue Gene/L installations across several sites worldwide took 3 of the top 10 positions and 13 of the top 64. Three racks of Blue Gene/L are housed at the San Diego Supercomputer Center and are available for academic research. New York Blue/L, ranked 17th on the June 2008 TOP500 list,[3] also provides time allocations when requested.

On October 27, 2005, LLNL and IBM announced that Blue Gene/L had once again broken its speed record, reaching 280.6 TFLOPS on LINPACK, upon reaching its final configuration of 65,536 "compute nodes" (i.e., 2^16 nodes) and an additional 1,024 "I/O nodes" in 64 air-cooled cabinets. The LLNL Blue Gene/L uses Lustre to access multiple filesystems in the 600 TB–1 PB range.[4]

Blue Gene/L is also the first supercomputer ever to run over 100 TFLOPS sustained on a real-world application, namely a three-dimensional molecular dynamics code (ddcMD), simulating solidification (nucleation and growth processes) of molten metal under high pressure and temperature conditions. This achievement won the 2005 Gordon Bell Prize.

On June 22, 2006, NNSA and IBM announced that Blue Gene/L had achieved 207.3 TFLOPS on a quantum chemical application (Qbox).[5] On November 14, 2006, at Supercomputing 2006,[6] Blue Gene/L won the prize in all classes of the HPC Challenge awards.[7] On April 27, 2007, a team from the IBM Almaden Research Center and the University of Nevada ran an artificial neural network almost half as complex as the brain of a mouse for the equivalent of a second (the network was run at 1/10 of normal speed for 10 seconds).[8]

In November 2007, the LLNL Blue Gene/L remained at the number one spot as the world's fastest supercomputer. It had been upgraded since the previous measurement, and was then almost three times as fast as the second fastest, a Blue Gene/P system.

On June 18, 2008, the new TOP500 list marked the first time since Blue Gene assumed the top position that a Blue Gene system was not the leader: it was topped by IBM's Cell-based Roadrunner system, at the time the only system to surpass the petaFLOPS mark.

Major features

The Blue Gene/L supercomputer is unique in the following aspects:

  • Trading the speed of processors for lower power consumption.
  • Dual processors per node with two working modes: co-processor (1 user process/node: computation and communication work is shared by two processors) and virtual node (2 user processes/node)
  • System-on-a-chip design
  • A large number of nodes (scalable in increments of 1024 up to at least 65,536)
  • Three-dimensional torus interconnect with auxiliary networks for global communications, I/O, and management
  • Lightweight OS per node for minimum system overhead (computational noise)[9]

Architecture

One Blue Gene/L node board
A schematic overview of a Blue Gene/L supercomputer

Each Compute or I/O node is a single ASIC with associated DRAM memory chips. The ASIC integrates two 700 MHz PowerPC 440 embedded processors, each with a double-pipeline, double-precision Floating Point Unit (FPU), a cache subsystem with built-in DRAM controller, and logic to support multiple communication subsystems. The dual FPUs give each Blue Gene/L node a theoretical peak performance of 5.6 GFLOPS (gigaFLOPS). Node CPUs are not cache coherent with one another.
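
The 5.6 GFLOPS figure can be checked against the clock rate and the FPU layout, under the assumption that each of the two pipelines per FPU retires one fused multiply-add (two floating-point operations) per cycle:

    2~\text{cores} \times 2~\text{pipelines} \times 2~\tfrac{\text{flops}}{\text{cycle}} \times 0.7~\text{GHz} = 5.6~\text{GFLOPS}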

Compute nodes are packaged two per compute card, with 16 compute cards plus up to 2 I/O nodes per node board. There are 32 node boards per cabinet/rack.[10] By integration of all essential sub-systems on a single chip, each Compute or I/O node dissipates low power (about 17 watts, including DRAMs). This allows very aggressive packaging of up to 1024 compute nodes plus additional I/O nodes in the standard 19" cabinet, within reasonable limits of electrical power supply and air cooling. The performance metrics in terms of FLOPS per watt, FLOPS per m² of floorspace and FLOPS per unit cost allow scaling up to very high performance. With so many nodes, component failures are inevitable. The system is able to electrically isolate faulty hardware to allow the machine to continue to run.
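
The packaging figures above compound to the per-rack node count, and the quoted per-node figure gives a rough estimate of a rack's compute-node power draw:

    2~\tfrac{\text{nodes}}{\text{card}} \times 16~\tfrac{\text{cards}}{\text{board}} \times 32~\tfrac{\text{boards}}{\text{rack}} = 1024~\tfrac{\text{nodes}}{\text{rack}}, \qquad 1024 \times 17~\text{W} \approx 17.4~\text{kW}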

Each Blue Gene/L node is attached to three parallel communications networks: a 3D toroidal network for peer-to-peer communication between compute nodes, a collective network for collective communication, and a global interrupt network for fast barriers. The I/O nodes, which run the Linux operating system, provide communication with the world via an Ethernet network. The I/O nodes also handle the filesystem operations on behalf of the compute nodes. Finally, a separate and private Ethernet network provides access to any node for configuration, booting and diagnostics.

To allow multiple programs to run concurrently, a Blue Gene/L system can be partitioned into electronically isolated sets of nodes. The number of nodes in a partition must be a positive integer power of 2, and must be at least 2^5 = 32 nodes. The maximum partition is all nodes in the computer. To run a program on Blue Gene/L, a partition of the computer must first be reserved. The program is then run on all the nodes within the partition, and no other program may access nodes within the partition while it is in use. Upon completion, the partition nodes are released for future programs to use.

Blue Gene/L compute nodes use a minimal operating system supporting a single user program. Only a subset of POSIX calls is supported, and only one process may run at a time on a node; programmers need to implement green threads to simulate local concurrency. Application development is usually performed in C, C++, or Fortran using MPI for communication. However, some scripting languages such as Ruby[11] and Python[12] have been ported to the compute nodes.
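
As a minimal illustration of this programming model, the sketch below shows a generic MPI "hello world" in C. It uses only standard MPI calls and is not taken from any Blue Gene distribution; on Blue Gene/L, one copy of such a program would run on every compute node in the reserved partition.

    /* A minimal MPI program in C: each process reports its rank.
       Generic sketch using only standard MPI-1 calls; nothing here
       is Blue Gene-specific. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);                /* start the MPI runtime      */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* rank of this process       */
        MPI_Comm_size(MPI_COMM_WORLD, &size);  /* processes in the partition */
        printf("Hello from rank %d of %d\n", rank, size);
        MPI_Finalize();                        /* shut down cleanly          */
        return 0;
    }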

Plan 9 support

A team composed of members from Bell Labs, IBM Research, Sandia National Laboratories, and Vita Nuova has completed a port of Plan 9 to Blue Gene/L and Blue Gene/P. Plan 9 kernels run on both the compute nodes and the I/O nodes. The Ethernet, torus, collective, barrier, and management networks are all supported.[13][14]

Cyclops64 (Blue Gene/C)

Blue Gene/C (now renamed to Cyclops64) is a sister-project to Blue Gene/L. It is a massively parallel, supercomputer-on-a-chip cellular architecture. It was slated for release in early 2007 but has been delayed.

Blue Gene/P

A Blue Gene/P node card
A schematic overview of a Blue Gene/P supercomputer

On June 26, 2007, IBM unveiled Blue Gene/P, the second generation of the Blue Gene supercomputer and designed through a collaboration that included IBM, LLNL, and Argonne National Laboratory's Leadership Computing Facility. Designed to run continuously at 1 PFLOPS (petaFLOPS), it can be configured to reach speeds in excess of 3 PFLOPS. Furthermore, it is at least seven times more energy efficient than any other supercomputer, accomplished by using many small, low-power chips connected through five specialized networks. Four 850 MHz PowerPC 450 processors are integrated on each Blue Gene/P chip. The 1-PFLOPS Blue Gene/P configuration is a 294,912-processor, 72-rack system harnessed to a high-speed, optical network. Blue Gene/P can be scaled to an 884,736-processor, 216-rack cluster to achieve 3-PFLOPS performance. A standard Blue Gene/P configuration will house 4,096 processors per rack.[15]
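
The quoted configurations are arithmetically consistent with 4,096 processors per rack:

    72 \times 4096 = 294{,}912~\text{processors}, \qquad 216 \times 4096 = 884{,}736~\text{processors}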

On November 12, 2007, the first Blue Gene/P system, JUGENE, with 65,536 processors, began operating at Forschungszentrum Jülich in Germany with a performance of 167 TFLOPS.[16] When inaugurated, it was the fastest supercomputer in Europe and the sixth fastest in the world. The first laboratory in the United States to receive a Blue Gene/P was Argonne National Laboratory; the first racks shipped in fall 2007. The first installation was a 111-TFLOPS system with approximately 32,000 processors, which became operational for the US research community in spring 2008.[17] The full Intrepid system was ranked #3 on the June 2008 TOP500 list.[18] Another Blue Gene/P was installed on September 9, 2008 in Sofia, the capital of Bulgaria, and is operated by the Bulgarian Academy of Sciences and Sofia University.[19] In 2010, a Blue Gene/P was installed at the University of Melbourne for the Victorian Life Sciences Computational Initiative.[20]

In February 2009 it was announced that JUGENE would be upgraded to reach petaFLOPS performance in June 2009, making it the first petascale supercomputer in Europe. The upgrade started on April 6, and the system went into production at the end of June 2009. The new configuration includes 294,912 processor cores, 144 TB of memory, and 6 PB of storage in 72 racks, and incorporates a new water cooling system that substantially reduces cooling costs.[21][22][23]

Veselin Topalov, the challenger in the 2010 World Chess Championship match, confirmed in an interview that he had used a Blue Gene/P supercomputer during his preparation for the match.[24] The Blue Gene/P computer has been used to simulate approximately one percent of a human cerebral cortex, containing 1.6 billion neurons with approximately 9 trillion connections.[25]

Web-scale platform

The IBM Kittyhawk project team has ported Linux to the compute nodes and demonstrated generic Web 2.0 workloads running at scale on a Blue Gene/P. Their paper, published in the ACM Operating Systems Review, describes a kernel driver that tunnels Ethernet over the tree network, which results in all-to-all TCP/IP connectivity.[26] Running standard Linux software such as MySQL, their performance results on SPECjbb rank among the highest on record.[citation needed]

Blue Gene/Q

The last known supercomputer design in the Blue Gene series, Blue Gene/Q, was aimed at reaching 20 PFLOPS in the 2011 time frame, but this has slipped to 2012. It continues to expand and enhance the Blue Gene/L and /P architectures with a higher clock frequency at much improved performance per watt (1,684 MFLOPS/watt[27][28]). Blue Gene/Q has a similar number of nodes but many more cores per node.[29]

Design

  • The Blue Gene/Q compute chip is a 4-way hyperthreaded, 64-bit PowerPC A2 based chip with 16 cores. The chips will have integrated memory and I/O controllers and will be mounted on a compute card that also carries 1 GB of DDR3 RAM per processor core.[30][31]
  • A compute drawer will have 32 compute cards, each water cooled and connected with fiber optics for the 5D network torus.[30]
  • Each I/O drawer will be air cooled and contain 8 compute cards and 8 PCIe expansion slots for Infiniband or 10 Gigabit Ethernet networking.[30]
  • Racks will have 32 compute drawers for a total of 1,024 compute nodes, 16,384 cores and 16 TB RAM (see the arithmetic after this list).[30]
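
The per-rack totals in the last item follow from the drawer and card counts, with 16 cores and 16 GB of RAM (1 GB per core) per node:

    32~\tfrac{\text{drawers}}{\text{rack}} \times 32~\tfrac{\text{cards}}{\text{drawer}} = 1024~\text{nodes}, \qquad 1024 \times 16 = 16{,}384~\text{cores}, \qquad 1024 \times 16~\text{GB} = 16~\text{TB}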

Installations

The archetypal Blue Gene/Q system, called Sequoia, will be installed at Lawrence Livermore National Laboratory in 2012 as part of the Advanced Simulation and Computing Program, running nuclear simulations and advanced scientific research. It will consist of 98,304 compute nodes comprising 1.6 million processor cores and 1.6 PB of memory in 96 racks covering an area of about 3,000 square feet (280 m²), drawing 6 megawatts of power.[32]
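
These Sequoia figures match the per-rack Blue Gene/Q numbers above:

    96 \times 1024 = 98{,}304~\text{nodes}, \qquad 98{,}304 \times 16 = 1{,}572{,}864 \approx 1.6~\text{million cores}, \qquad 98{,}304 \times 16~\text{GB} \approx 1.6~\text{PB}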

A Blue Gene/Q system called Mira will be installed at Argonne National Laboratory in the Argonne Leadership Computing Facility early in 2012. It will consist of 49,152 compute nodes, with 70 PB of disk storage (470 GB/s I/O bandwidth).[33][34]

A single midplane (8,192 cores) of a Blue Gene/Q prototype at IBM Watson was ranked #115 on the November 2010 TOP500 list,[35] with more than 100 TFLOPS peak performance and a LINPACK performance of 65 TFLOPS.

References

  1. ^ Harris, Mark (September 18, 2009). "Obama honours IBM supercomputer". Techradar. Retrieved 2009-09-18.
  2. ^ TOP500 list - June 2006
  3. ^ TOP500 list - June 2008
  4. ^ LLNL Parallel File Systems Tutorial
  5. ^ hpcwire.com
  6. ^ SC06
  7. ^ hpcchallenge.org
  8. ^ bbc.co.uk
  9. ^ Knight, Will: "IBM creates world's most powerful computer", NewScientist.com news service, June 2007
  10. ^ Bluegene/L Configuration https://asc.llnl.gov/computing_resources/bluegenel/configuration.html
  11. ^ ece.iastate.edu
  12. ^ William Scullin (March 12, 2011). Python for High Performance Computing. Atlanta, GA.
  13. ^ research.ibm.com
  14. ^ usenix.org
  15. ^ ibm.com
  16. ^ "US commissions beefy IBM supercomputer". IDG News Service. 2007-11-12.
  17. ^ Curry, Jessica (2007-08-12). "Blue Gene Baby". Chicago Life.
  18. ^ "Argonne's Supercomputer Named World’s Fastest for Open Science, Third Overall"
  19. ^ Вече си имаме и суперкомпютър, Dir.bg, 9 September 2008
  20. ^ "IBM Press room - 2010-02-11 IBM to Collaborate with Leading Australian Institutions to Push the Boundaries of Medical Research - Australia". 03.ibm.com. 2010-02-11. Retrieved 2011-03-11.
  21. ^ [1][dead link]
  22. ^ [2][dead link]
  23. ^ "IBM Press room - 2009-02-10 New IBM Petaflop Supercomputer at German Forschungszentrum Juelich to Be Europe's Most Powerful - United States". 03.ibm.com. 2009-02-10. Retrieved 2011-03-11.
  24. ^ "Topalov training with super computer Blue Gene P". Chessdom. Retrieved 21 May 2010.
  25. ^ Kaku, Michio. Physics of the Future (New York: Doubleday, 2011), 91.
  26. ^ Project Kittyhawk: building a global-scale computer
  27. ^ "Top500 Supercomputing List Reveals Computing Trends". IBM... BlueGene/Q system .. setting a record in power efficiency with a value of 1,680 Mflops/watt, more than twice that of the next best system.
  28. ^ "IBM Research A Clear Winner in Green 500".
  29. ^ cse.scitech.ac.uk
  30. ^ a b c d "IBM uncloaks 20 petaflops BlueGene/Q super". The Register. 2010-11-22. Retrieved 2010-11-25.
  31. ^ Joab Jackson (2011-02-08). "US commissions beefy IBM supercomputer". IDG News Service.
  32. ^ Feldman, Michael (2009-02-03). "Lawrence Livermore Prepares for 20 Petaflop Blue Gene/Q". HPCwire. Retrieved 2011-03-11.
  33. ^ http://www.er.doe.gov/ascr/ASCAC/Meetings/Nov09/Nov09Minutes.pdf
  34. ^ http://workshops.alcf.anl.gov/gs10/files/2010/01/betsy_riley.pdf
  35. ^ Chuck Seitz. "NNSA/SC Blue Gene/Q Prototype | TOP500 Supercomputing Sites". Top500.org. Retrieved 2011-03-11.