Ampere (microarchitecture)

From Wikipedia, the free encyclopedia

Nvidia Ampere
Fabrication process: TSMC 7 nm (FinFET)
Predecessor: Volta
Successor: Hopper

Ampere is the codename for a graphics processing unit (GPU) microarchitecture developed by Nvidia as the successor to the Volta architecture, officially announced on May 14, 2020. It is named after the French mathematician and physicist André-Marie Ampère.[1][2] Nvidia's announcement focused on professional AI use, data centers, and Nvidia Drive (self-driving automobiles, autopilot, etc.), and it was not clear at the time whether Ampere would also appear in consumer GPUs as the successor to Turing.[3]

Details

Architectural improvements of the Ampere architecture include the following:

* Fabrication on TSMC's 7 nm (N7) process (GA100), compared with 12 nm for Volta
* Tensor core support for additional data formats, including TensorFloat-32 (TF32), bfloat16, INT8, and FP64
* A wider HBM2 memory bus (5120-bit) with higher memory bandwidth (1,555 GB/s on the A100)
* Faster GPU-to-GPU interconnect bandwidth, at 600 GB/s

A100 accelerator and DGX A100

The Ampere-based A100 accelerator was announced and released on May 14, 2020.[4] The A100 features 19.5 teraflops of FP32 performance, 6912 CUDA cores, 40 GB of graphics memory, and 1.6 TB/s of graphics memory bandwidth.[3] The A100 accelerator was initially available only in the third generation of the DGX server, which includes eight A100s.[4] The DGX A100 also includes 15 TB of PCIe Gen 4 NVMe storage,[3] two 64-core AMD Rome 7742 CPUs, 1 TB of RAM, and a Mellanox-powered HDR InfiniBand interconnect. The initial price for the DGX A100 was $199,000.[4]

Comparison of accelerators used in DGX:[4]

Accelerator                  | A100           | V100            | P100
Architecture                 | Ampere         | Volta           | Pascal
FP32 CUDA cores              | 6912           | 5120            | 3584
Boost clock                  | ~1410 MHz      | 1530 MHz        | 1480 MHz
Memory clock                 | 2.4 Gbps HBM2  | 1.75 Gbps HBM2  | 1.4 Gbps HBM2
Memory bus width             | 5120-bit       | 4096-bit        | 4096-bit
Memory bandwidth             | 1,555 GB/s     | 900 GB/s        | 720 GB/s
VRAM                         | 40 GB          | 16 GB / 32 GB   | 16 GB
Single precision (FP32)      | 19.5 TFLOPS    | 15.7 TFLOPS     | 10.6 TFLOPS
Double precision (FP64)      | 9.7 TFLOPS     | 7.8 TFLOPS      | 5.3 TFLOPS
INT8 tensor                  | 624 TOPS       | N/A             | N/A
FP16 tensor                  | 312 TFLOPS     | 125 TFLOPS      | N/A
bfloat16 tensor              | 312 TFLOPS     | N/A             | N/A
TensorFloat-32 (TF32) tensor | 156 TFLOPS     | N/A             | N/A
FP64 tensor                  | 19.5 TFLOPS    | N/A             | N/A
Interconnect                 | 600 GB/s       | 300 GB/s        | 160 GB/s
GPU                          | GA100          | GV100           | GP100
GPU die size                 | 826 mm²        | 815 mm²         | 610 mm²
Transistor count             | 54.2 billion   | 21.1 billion    | 15.3 billion
TDP                          | 400 W          | 300 W / 350 W   | 300 W
Manufacturing process        | TSMC 7 nm N7   | TSMC 12 nm FFN  | TSMC 16 nm FinFET
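
The peak single-precision figure can be cross-checked from the CUDA core count and boost clock in the table, assuming each CUDA core performs one fused multiply-add (two floating-point operations) per clock. The following Python snippet is an illustrative calculation based on the table's values, not part of the cited sources:

    # Illustrative cross-check of the A100's peak FP32 throughput, using the
    # figures from the comparison table above. Assumption: one fused
    # multiply-add (two floating-point operations) per CUDA core per clock.
    cuda_cores = 6912             # FP32 CUDA cores
    boost_clock_hz = 1.410e9      # ~1410 MHz boost clock
    ops_per_core_per_clock = 2    # one FMA counts as two FLOPs

    peak_fp32_tflops = cuda_cores * boost_clock_hz * ops_per_core_per_clock / 1e12
    print(f"Peak FP32 throughput: {peak_fp32_tflops:.1f} TFLOPS")  # prints ~19.5 TFLOPS

The 9.7 TFLOPS double-precision figure in the table corresponds to half this rate.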

References

  1. ^ "NVIDIA's New Ampere Data Center GPU in Full Production". Nvidia Newsroom. https://nvidianews.nvidia.com/news/nvidias-new-ampere-data-center-gpu-in-full-production
  2. ^ "NVIDIA Ampere Architecture In-Depth". Nvidia Developer Blog. https://devblogs.nvidia.com/nvidia-ampere-architecture-in-depth/
  3. ^ a b c Tom Warren; James Vincent (May 14, 2020). "Nvidia's first Ampere GPU is designed for data centers and AI, not your PC". The Verge.
  4. ^ a b c d e f Ryan Smith (May 14, 2020). "NVIDIA Ampere Unleashed: NVIDIA Announces New GPU Architecture, A100 GPU, and Accelerator". AnandTech.

External links

* [https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/nvidia-ampere-architecture-whitepaper.pdf NVIDIA A100 Tensor Core GPU Architecture whitepaper]
* [https://www.nvidia.com/en-us/data-center/nvidia-ampere-gpu-architecture/ Nvidia Ampere Architecture]
* [https://www.nvidia.com/en-us/data-center/a100/ Nvidia A100 Tensor Core GPU]