GeForce 700 series
Codename | GF117, GF119, GK104, GK106, GK107, GK110, GK208, GM107 |
---|---|
Models | GeForce Series |
Transistors | |
Cards | |
Mid-range | GeForce GTX 750, GeForce GTX 760 |
High-end | GeForce GTX 770, GeForce GTX 780 |
Enthusiast | GeForce GTX 780 Ti, GeForce GTX TITAN, GeForce GTX TITAN BLACK |
API support | |
DirectX | Direct3D 11.0 Shader Model 5.0 |
OpenCL | OpenCL 1.1 |
OpenGL | OpenGL 4.4 |
History | |
Predecessor | GeForce 600 Series |
Successor | GeForce 800 Series |
The GeForce 700 Series is a family of graphics processing units developed by Nvidia, used in desktop and laptop PCs. Most of the series is based on a refresh of the Kepler architecture (GK-codenamed chips) used in the previous GeForce 600 Series; a few models use Fermi (GF-codenamed) or Maxwell (GM-codenamed) chips. A number of GeForce 700 series chips were released for mobile devices in April 2013. Desktop GeForce 700 series cards were first released in 2013, starting with the GeForce GTX TITAN on February 19, 2013, followed by the GeForce GTX 780 on May 23, 2013.
Overview
GK110 was designed and marketed with compute performance in mind. It contains 7.1 billion transistors. The chip also aims to maximise energy efficiency by running as many tasks as possible in parallel across its streaming processors.
GK110 increases both the capacity and the bandwidth of the register file and the L2 cache over previous models. At the SMX level, register file space has increased to 256 KB, made up of 65,536 32-bit registers, which is larger than on Fermi. The L2 cache has grown to up to 1.5 MB, twice the size of the L2 cache in GF110. Both the L2 cache and register file bandwidth have also doubled. Performance in register-starved scenarios also improves, as more registers are available to each thread. This goes hand in hand with an increase in the number of registers each thread can address, from 63 registers per thread to 255 registers per thread with GK110.
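The register file figures above can be checked with a few lines of arithmetic. The following is a minimal sketch, not a description of how the hardware allocates registers: the function names are illustrative, and the thread bound deliberately ignores every other hardware limit (allocation granularity, the maximum number of resident threads, shared memory).

```python
# Register file arithmetic for one GK110 SMX, using the figures quoted in the text.
REGISTERS_PER_SMX = 65_536   # the "65K" 32-bit registers
BYTES_PER_REGISTER = 4       # 32 bits


def register_file_kib(registers=REGISTERS_PER_SMX):
    """Register file size in KiB."""
    return registers * BYTES_PER_REGISTER / 1024


def max_threads_by_registers(registers_per_thread, registers=REGISTERS_PER_SMX):
    """Upper bound on threads that fit in the register file.

    Illustrative only: real hardware applies further limits that are ignored here.
    """
    return registers // registers_per_thread


print(register_file_kib())            # size of the register file in KiB
print(max_threads_by_registers(63))   # threads using 63 registers each
print(max_threads_by_registers(255))  # threads using 255 registers each
```

The comparison shows the trade-off behind the higher per-thread limit: a thread that uses all 255 registers leaves room for far fewer threads on the same SMX than one that uses 63.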
With GK110, Nvidia also reworked the GPU texture cache so that it can be used for compute. In compute workloads the 48 KB texture cache acts as a read-only cache that specializes in unaligned memory accesses. Furthermore, error detection capabilities have been added to make it safer for use with workloads that rely on ECC.[1]
Architecture
The GeForce 700 Series contains features from both GK104 and GK110. Kepler-based members of the 700 series add the following standard features to the GeForce family.
Derived from GK104:
- PCI Express 3.0 interface
- DisplayPort 1.2
- HDMI 1.4a 4K x 2K video output
- Purevideo VP5 hardware video acceleration (up to 4K x 2K H.264 decode)
- Hardware H.264 encoding acceleration block (NVENC)
- Support for up to 4 independent 2D displays, or 3 stereoscopic/3D displays (NV Surround)
- Bindless Textures
- GPU Boost
- TXAA
- Manufactured by TSMC on a 28 nm process
New features from GK110:
- Compute Focus SMX Improvement
- CUDA Compute Capability 3.5
- New Shuffle Instructions
- Dynamic Parallelism
- Hyper-Q (Hyper-Q's MPI functionality is reserved for Tesla only)
- Grid Management Unit
- NVIDIA GPUDirect (GPUDirect's RDMA functionality is reserved for Tesla only)
Compute Focus SMX Improvement
With GK110, Nvidia opted to increase compute performance. The single biggest change from GK104 is that rather than 8 dedicated FP64 CUDA cores per SMX, GK110 has up to 64, giving it 8x the FP64 throughput of a GK104 SMX. The SMX register file has also grown to 256 KB, larger than on Fermi. The texture cache is improved as well: with 48 KB of space, it can serve as a read-only cache for compute workloads.[1]
New Shuffle Instructions
At a low level, GK110 adds instructions and operations to further improve performance. New shuffle instructions allow threads within a warp to share data without going back to memory, making the process much quicker than the previous load/share/store method. Atomic operations have also been overhauled: they execute faster, and some operations that were previously available only for FP32 data are now available for FP64 data.[1]
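The data exchange that shuffle instructions make possible can be illustrated with a small software model. The sketch below is plain Python, not GPU code: `shfl_down` and `warp_reduce_sum` are hypothetical helper names that imitate a shuffle-down exchange across a 32-lane warp, and the model assumes that a lane whose source falls outside the warp keeps its own value.

```python
WARP_SIZE = 32


def shfl_down(values, delta):
    """Software model of a shuffle-down: lane i reads the value held by lane i + delta.

    Lanes whose source lane would fall outside the warp keep their own value
    (an assumption of this model).
    """
    return [values[i + delta] if i + delta < WARP_SIZE else values[i]
            for i in range(WARP_SIZE)]


def warp_reduce_sum(values):
    """Tree reduction over one warp using only lane-to-lane exchanges.

    After five halving steps (16, 8, 4, 2, 1) lane 0 holds the sum of all 32 inputs.
    No intermediate buffer standing in for shared memory is used.
    """
    lanes = list(values)
    if len(lanes) != WARP_SIZE:
        raise ValueError("expected one value per lane")
    offset = WARP_SIZE // 2
    while offset > 0:
        shifted = shfl_down(lanes, offset)
        lanes = [own + other for own, other in zip(lanes, shifted)]
        offset //= 2
    return lanes[0]


print(warp_reduce_sum(range(32)))
```

Only lane 0 is guaranteed to hold the full sum at the end; the other lanes contain partial results, which is also how this reduction pattern is normally used.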
Hyper-Q
Hyper-Q expands the number of GK110 hardware work queues from 1 to 32. With a single work queue, Fermi could be under-occupied at times because there was not enough work in that queue to fill every SM. With 32 work queues, GK110 can in many scenarios achieve higher utilization by placing different task streams on what would otherwise be an idle SMX. Hyper-Q also maps easily to MPI, a message passing interface frequently used in HPC. Legacy MPI-based algorithms that were originally designed for multi-CPU systems, and that became bottlenecked by false dependencies, now have a solution: by increasing the number of MPI jobs, Hyper-Q can improve the efficiency of these algorithms without changing the code itself.[1]
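The effect of the number of work queues on occupancy can be illustrated with a deliberately simplified model. The sketch below is a toy calculation, not a description of the actual scheduler: it assumes every task runs for one time step, that at most one task per hardware queue is in flight, and that the chip has 15 SMX units (a figure that is an assumption of this example and is not stated in the text above). The function name `utilization` is illustrative.

```python
import math


def utilization(n_tasks, smx_per_task, hardware_queues, total_smx=15):
    """Fraction of SMX-time kept busy in a toy single-step task model.

    Assumptions (all hypothetical): equal-length tasks, one in-flight task per
    hardware queue, and total_smx SMX units on the chip.
    """
    if not 1 <= smx_per_task <= total_smx:
        raise ValueError("a task must use between 1 and total_smx SMX units")
    fit = total_smx // smx_per_task                 # tasks that fit on the chip at once
    concurrent = min(hardware_queues, fit, n_tasks)
    steps = math.ceil(n_tasks / concurrent)         # time steps needed to drain all tasks
    return n_tasks * smx_per_task / (steps * total_smx)


# 32 small independent tasks, each occupying one SMX.
print(f"1 queue:   {utilization(32, 1, 1):.0%}")
print(f"32 queues: {utilization(32, 1, 32):.0%}")
```

In this model a single queue leaves most of the chip idle when tasks are small, while 32 queues let independent tasks fill otherwise idle SMX units, which is the behaviour the paragraph describes.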
Dynamic Parallelism
Dynamic Parallelism is the ability of kernels to dispatch other kernels. With Fermi, only the CPU could dispatch a kernel, which incurs overhead because the GPU has to communicate back to the CPU. By giving kernels the ability to dispatch their own child kernels, GK110 saves the time of a round trip to the CPU and in the process frees the CPU to work on other tasks.[1]
Grid Management Unit
Enabling Dynamic Parallelism requires a new grid management and dispatch control system. The new Grid Management Unit (GMU) manages and prioritizes grids to be executed. The GMU can pause the dispatch of new grids and queue pending and suspended grids until they are ready to execute, providing the flexibility to enable powerful runtimes, such as Dynamic Parallelism. The CUDA Work Distributor in Kepler holds grids that are ready to dispatch, and is able to dispatch 32 active grids, which is double the capacity of the Fermi CWD. The Kepler CWD communicates with the GMU via a bidirectional link that allows the GMU to pause the dispatch of new grids and to hold pending and suspended grids until needed. The GMU also has a direct connection to the Kepler SMX units to permit grids that launch additional work on the GPU via Dynamic Parallelism to send the new work back to GMU to be prioritized and dispatched. If the kernel that dispatched the additional workload pauses, the GMU will hold it inactive until the dependent work has completed. [2]
NVIDIA GPUDirect
NVIDIA GPUDirect™ is a capability that enables GPUs within a single computer, or GPUs in different servers located across a network, to directly exchange data without needing to go to CPU/system memory. The RDMA feature in GPUDirect allows third party devices such as SSDs, NICs, and IB adapters to directly access memory on multiple GPUs within the same system, significantly decreasing the latency of MPI send and receive messages to/from GPU memory. It also reduces demands on system memory bandwidth and frees the GPU DMA engines for use by other CUDA tasks. Kepler GK110 also supports other GPUDirect features including Peer‐to‐Peer and GPUDirect for Video.
Products
GeForce 700 (7xx) series
The GeForce 700 series for desktops. Cheaper and lower-performing products are expected to be released in the future. Kepler supports DirectX 11.1 features at feature level 11_0 through the DirectX 11.1 API; however, Nvidia did not enable four non-gaming features of feature level 11_1 in Kepler hardware.[3][4]
- 1 Shader processors : Texture mapping units : Render output units
- 2 Pixel fillrate is calculated as the number of ROPs multiplied by the base core clock speed.
- 3 Texture fillrate is calculated as the number of TMUs multiplied by the base core clock speed.
- 4 Single precision performance is calculated as 2 times the number of shaders multiplied by the base core clock speed.
- 5 Double precision performance of the GTX TITAN and GTX TITAN BLACK is either 1/3 or 1/24 of single-precision performance, depending on a user-selected configuration option in the driver; setting double precision to 1/24 boosts single-precision performance.[5] The double precision performance of other Kepler chips is fixed at 1/24 of single-precision performance.[6] GeForce 700 series Maxwell chips' double precision performance is 1/32 of single-precision performance.[7]
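The formulas in notes 2 to 5 can be applied directly to a core configuration and base clock. The following is a minimal sketch with illustrative function names; it uses the GeForce GTX 780 figures (core config 2304:192:48, base clock 863 MHz) as input and assumes clocks are given in MHz, so results are divided by 1000 to obtain giga-units.

```python
def parse_core_config(core_config):
    """Split 'shaders:TMUs:ROPs' (note 1) into three integers."""
    shaders, tmus, rops = (int(part) for part in core_config.split(":"))
    return shaders, tmus, rops


def pixel_fillrate(rops, base_mhz):
    """Note 2: ROPs x base core clock, in GP/s."""
    return rops * base_mhz / 1000


def texture_fillrate(tmus, base_mhz):
    """Note 3: TMUs x base core clock, in GT/s."""
    return tmus * base_mhz / 1000


def single_precision(shaders, base_mhz):
    """Note 4: 2 x shaders x base core clock, in GFLOPS."""
    return 2 * shaders * base_mhz / 1000


def double_precision(single_gflops, divisor):
    """Note 5: single precision divided by 3, 24 or 32 depending on the chip."""
    return single_gflops / divisor


shaders, tmus, rops = parse_core_config("2304:192:48")  # GeForce GTX 780
base_mhz = 863
sp = single_precision(shaders, base_mhz)
print(f"Pixel fillrate:   {pixel_fillrate(rops, base_mhz):.1f} GP/s")
print(f"Texture fillrate: {texture_fillrate(tmus, base_mhz):.0f} GT/s")
print(f"Single precision: {sp:.0f} GFLOPS")
print(f"Double precision: {double_precision(sp, 24):.0f} GFLOPS")
```

After rounding, these values agree with the GeForce GTX 780 row of the table below; passing a divisor of 3 or 32 instead of 24 reproduces the GTX TITAN and Maxwell cases from note 5.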
Model | Launch | Code name | Fab (nm) | Bus interface | Memory (MiB) | Core config (note 1) | Base core clock (MHz) | Boost core clock (MHz) | Memory clock (MT/s) | Pixel fillrate (GP/s) (note 2) | Texture fillrate (GT/s) (note 3) | Memory bandwidth (GB/s) | Memory bus type | Memory bus width (bit) | DirectX | OpenGL | OpenCL | Single precision (GFLOPS) (note 4) | Double precision (GFLOPS) (note 5) | GFLOPS/W (single precision) | TDP (watts) | Release price (USD) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
GeForce GTX 750 [8] | February 18, 2014 | GM107 | 28 | PCIe 3.0 x16 | 1024 | 512:32:16 | 1020 | 1085 | 5000 | 16.3 | 32.6 | 80 | GDDR5 | 128 | 11.0 | 4.4 | 1.1 | 1044 | 32.6 | 19 | 55 | $119 |
GeForce GTX 750 Ti [9] | February 18, 2014 | GM107 | 28 | PCIe 3.0 x16 | 1024, 2048 | 640:40:16 | 1020 | 1085 | 5400 | 16.3 | 40.8 | 86.4 | GDDR5 | 128 | 11.0 | 4.4 | 1.1 | 1306 | 40.8 | 21.8 | 60 | $149 |
GeForce GTX 760 192-bit [10] | Unknown | GK104 | 28 | PCIe 3.0 x16 | 1536, 3072 | 1152:96:24 | 823 | 888 | 5808 | 19.8 | 79 | 134 | GDDR5 | 192 | 11.0 | 4.4 | 1.1 | 1896 | 79 | 14.6 | 130 | OEM |
GeForce GTX 760 [11] | June 25, 2013 | GK104 | 28 | PCIe 3.0 x16 | 2048, 4096 | 1152:96:32 | 980 | 1033 | 6008 | 31.4 | 94.1 | 192 | GDDR5 | 256 | 11.0 | 4.4 | 1.1 | 2258 | 94.1 | 13.3 | 170 | $249 |
GeForce GTX 760 Ti [12] | Unknown | GK104 | 28 | PCIe 3.0 x16 | 2048 | 1344:112:32 | 915 | 980 | 6008 | 29.3 | 103 | 192 | GDDR5 | 256 | 11.0 | 4.4 | 1.1 | 2460 | 103 | 14.5 | 170 | OEM |
GeForce GTX 770 [13] | May 30, 2013 | GK104 | 28 | PCIe 3.0 x16 | 2048, 4096 | 1536:128:32 | 1046 | 1085 | 7008 | 33.5 | 134 | 224 | GDDR5 | 256 | 11.0 | 4.4 | 1.1 | 3213 | 134 | 14.0 | 230 | $399[14] |
GeForce GTX 780 [15] | May 23, 2013 | GK110 | 28 | PCIe 3.0 x16 | 3072 | 2304:192:48 | 863 | 900 | 6008 | 41.4 | 166 | 288 | GDDR5 | 384 | 11.0 | 4.4 | 1.1 | 3977 | 166 | 15.9 | 250 | $649[14] |
GeForce GTX 780 Ti [16] | November 7, 2013 | GK110 | 28 | PCIe 3.0 x16 | 3072 | 2880:240:48 | 876 | 928 | 7000 | 42.0 | 210 | 336 | GDDR5 | 384 | 11.0[17] | 4.4 | 1.1 | 5046 | 210 | 20.2 | 250 | $699[14] |
GeForce GTX Titan [18] | February 19, 2013 | GK110 | 28 | PCIe 3.0 x16 | 6144 | 2688:224:48 | 837 | 876 | 6008 | 40.2 | 188 | 288 | GDDR5 | 384 | 11.0[19] | 4.4 | 1.1 | 4500 | 1500 | 18.0 | 250 | $999 |
GeForce GTX Titan Black [20] | February 18, 2014 | GK110 | 28 | PCIe 3.0 x16 | 6144 | 2880:240:48 | 889 | 980 | 7000 | 42.7 | 213 | 336 | GDDR5 | 384 | 11.0 | 4.4 | 1.1 | 5121 | 1707 | 20.5 | 250 | $999 |
GeForce GTX 790 | Unknown | GK110 | 28 | PCIe 3.0 x16 | 6144 | 4992:480:96 | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | GDDR5 | 384 | 11.0 | 4.4 | 1.1 | Unknown | Unknown | Unknown | <300 | Unknown |
GeForce 700M (7xxM) series
Some implementations may use different specifications.
Model | Launch | Code name | Fab (nm) | Bus interface | Memory (MiB) | Core config (note 1) | Core clock (MHz) | Shader clock (MHz) | Memory clock (MT/s) | Pixel fillrate (GP/s) | Texture fillrate (GT/s) | Memory bandwidth (GB/s) | Memory bus type | Memory bus width (bit) | DirectX | OpenGL | OpenCL | Processing power (GFLOPS) | TDP (watts) | Notes |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
GeForce 705M [21] | June 1, 2013 | GF119 | 40 | PCIe 2.0 x16 | up to 2048 | 48:8:4 | 775 | 1550 | 1800 | ? | ? | ? | DDR3 | 64 | 11 | 4.1 | 1.1 | ? | 12 | Rebadged 520M |
GeForce 710M [22] | April 1, 2013 | GF117 | 28 | PCIe 2.0 x16 | up to 2048 | 96:16:4 | 775 | 1550 | 1800 | ? | ? | 14.4 | DDR3 | 64 | 11 | 4.3 | 1.1 | ? | ? | |
GeForce GT 720M [23] | April 1, 2013 | GF117/GK208 | 28 | PCIe 2.0 x16 | up to 2048 | 96:16:4/192:16:8 | 800 | 1600 | 2000 | ? | ? | 16.0 | DDR3 | 64 | 11 | 4.3 | 1.2 | ? | ? | |
GeForce GT 730M [24] | April 1, 2013 | GK107/GK208 | 28 | PCIe 3.0 x16/2.0 x8 | up to 4096 | 384:32:16(8) | 725 | 725 | 1800 - 4000 | ? | ? | 14.4 - 64.0 | DDR3/GDDR5 | 64/128 | 11 | 4.4 | 1.2 | ? | ? | |
GeForce GT 735M [25] | April 1, 2013 | GK208 | 28 | PCIe 2.0 x8 | up to 2048 | 384:32:8 | 889 | 889 | 2000 | ? | ? | 16.0 | DDR3 | 64 | 11 | 4.4 | 1.2 | ? | ? | |
GeForce GT 740M [26] | April 1, 2013 | GK107/GK208 | 28 | PCIe 3.0 x16/2.0 x8 | up to 2048 | 384:32:16(8) | 810/1033 | 810/1033 | 1800/3600 | ? | ? | 14.4 - 57.6 | DDR3/GDDR5 | 128/64 | 11 | 4.4 | 1.2 | ? | ? | |
GeForce GT 745M [27] | April 1, 2013 | GK107 | 28 | PCIe 3.0 x16 | up to 2048 | 384:32:16 | 837 | 837 | 2000 - 5000 | ? | ? | 32.0 - 80.0 | DDR3/GDDR5 | 128 | 11 | 4.4 | 1.2 | ? | ? | |
GeForce GT 750M [28] | April 1, 2013 | GK107 | 28 | PCIe 3.0 x16 | up to 4096 | 384:32:16 | 967 | 967 | 2000 - 5000 | ? | ? | 32 - 80 | DDR3/GDDR5 | 128 | 11 | 4.4 | 1.2 | ? | ? | |
GeForce GT 755M [29] | Unknown | GK107 | 28 | PCIe 3.0 x16 | up to 2048 | 384:32:16 | 980? | 980? | 5400 | ? | ? | 86.4 | GDDR5 | 128 | 11 | 4.4 | 1.1 | ? | ? | |
GeForce GTX 760M [30] | May 30, 2013 | GK106 | 28 | PCIe 3.0 x16 | 2048 | 768:64:16 | 657 | 657 | 4008 | ? | ? | 64.1 | GDDR5 | 128 | 11 | 4.4 | 1.1 | ? | ? | |
GeForce GTX 765M [31] | May 30, 2013 | GK106 | 28 | PCIe 3.0 x16 | 2048 | 768:64:16 | 850 | 850 | 4008 | ? | ? | 64.1 | GDDR5 | 128 | 11 | 4.4 | 1.1 | ? | ? | |
GeForce GTX 770M [32] | May 30, 2013 | GK106 | 28 | PCIe 3.0 x16 | 3072 | 960:80:24 | 811 | 811 | 4008 | ? | ? | 96.2 | GDDR5 | 192 | 11 | 4.4 | 1.1 | ? | ? | |
GeForce GTX 780M [33] | May 30, 2013 | GK104 | 28 | PCIe 3.0 x16 | 4096 | 1536:128:32 | 823 | 823 | 5000 | ? | ? | 160.0 | GDDR5 | 256 | 11 | 4.4 | 1.1 | ? | ? | |
See also
- GeForce 400 Series
- GeForce 500 Series
- GeForce 600 Series
- GeForce 800 Series
- GeForce 900 Series
- Nvidia Quadro
- Nvidia Tesla
References
- ^ a b c d e "NVIDIA Launches Tesla K20 & K20X: GK110 Arrives At Last". AnandTech. November 12, 2012.
- ^ "NVIDIA-Kepler-GK110-Architecture-Whitepaper" (PDF).
- ^ NVIDIA Kepler not fully compliant with DirectX 11.1
- ^ Nvidia Doesn't Fully Support DirectX 11.1 with Kepler GPUs, But… - Bright Side Of News
- ^ GK110 The True Tank - Nvidia GeForce GTX Titan 6 GB GK110 On A Gaming Card
- ^ Nvidia GeForce GTX 780 Ti Review GK110, Fully Unlocked - GK110, Unleashed The Wonders Of Tight Binning
- ^ Smith, Ryan; T S, Ganesh (February 18, 2014). "The NVIDIA GeForce GTX 750 Ti and GTX 750 Review: Maxwell Makes Its Move". AnandTech. p. 5. Retrieved February 18, 2014.
- ^ GeForce GTX 750 | Specifications | GeForce
- ^ GeForce GTX 750 Ti | Specifications | GeForce
- ^ GeForce GTX 760 192-bit | Specifications | GeForce
- ^ GeForce GTX 760 | Specifications | GeForce
- ^ GeForce GTX 760 Ti | Specifications | GeForce
- ^ GeForce GTX 770 | Specifications | GeForce
- ^ a b c http://www.bit-tech.net/news/hardware/2013/10/28/nvidia-geforce-gtx-780-ti-price-and-release/1
- ^ GeForce GTX 780 | Specifications | GeForce
- ^ GeForce GTX 780 Ti | Specifications | GeForce
- ^ http://www.techpowerup.com/gpudb/2512/geforce-gtx-780-ti.html
- ^ GeForce GTX TITAN | Specifications | GeForce
- ^ http://www.techpowerup.com/gpudb/1996/geforce-gtx-titan.html
- ^ GeForce GTX Titan Black | Specifications | GeForce
- ^ GeForce 705M | Specifications | GeForce
- ^ GeForce 710M | Specifications | GeForce
- ^ GeForce GT 720M | Specifications | GeForce
- ^ GeForce GT 730M | Specifications | GeForce
- ^ GeForce GT 735M | Specifications | GeForce
- ^ GeForce GT 740M | Specifications | GeForce
- ^ GeForce GT 745M | Specifications | GeForce
- ^ GeForce GT 750M | Specifications | GeForce
- ^ GeForce GT 755M | Specifications | GeForce
- ^ GeForce GTX 760M | Specifications | GeForce
- ^ GeForce GTX 765M | Specifications | GeForce
- ^ GeForce GTX 770M | Specifications | GeForce
- ^ GeForce GTX 780M | Specifications | GeForce
External links
- GK110 Architecture Whitepaper
- GTX 750 Ti Whitepaper
- Introducing the GeForce GTX TITAN
- Introducing The GeForce GTX 780
- Introducing The GeForce GTX 770
- Introducing The GeForce GTX 760: A Mid-Range GPU With High-End Features
- GeForce GTX 750 Class GPUs: Serious Gaming, Incredible Value
- Introducing Our New GeForce GTX 700M, Kepler-Powered Notebook GPUs
- GeForce GTX TITAN BLACK
- GeForce GTX TITAN
- GeForce GTX 780 Ti
- GeForce GTX 780
- GeForce GTX 770
- GeForce GTX 760 Ti (OEM)
- GeForce GTX 760
- GeForce GTX 760 192-bit (OEM)
- GeForce GTX 750 Ti
- GeForce GTX 750
- GeForce GTX 780M
- GeForce GTX 770M
- GeForce GTX 765M
- GeForce GTX 760M
- GeForce GT 750M
- GeForce GT 745M
- GeForce GT 740M
- GeForce GT 735M
- GeForce GT 730M
- A New Dawn
- Nvidia Nsight