Graphics processing unit
A Graphics Processing Unit or GPU (also occasionally called Visual Processing Unit or VPU) is a dedicated graphics rendering device for a personal computer or game console. Modern GPUs are very efficient at manipulating and displaying computer graphics, and their highly parallel structure makes them more effective than typical CPUs for a range of complex algorithms.
A GPU implements a number of graphics primitive operations in a way that makes running them much faster than drawing directly to the screen with the host CPU. The most common operations for early 2D computer graphics include the BitBLT operation, usually in special hardware called a "blitter", and operations for drawing rectangles, triangles, circles and arcs. Modern GPUs also have support for 3D computer graphics, and typically include digital video-related functions as well.
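As an illustration of the BitBLT operation mentioned above, the following is a minimal software sketch, not a description of any particular blitter. The bitmap layout (lists of rows), the function name `bitblt`, and the `rop` parameter (a raster operation combining source and destination pixels) are assumptions made for this example.

```python
def bitblt(dst, src, dst_x, dst_y, src_x, src_y, width, height,
           rop=lambda s, d: s):
    """Copy a width x height block from src into dst.

    rop combines each source pixel s with the destination pixel d;
    the default simply overwrites the destination.
    """
    for row in range(height):
        for col in range(width):
            s = src[src_y + row][src_x + col]
            d = dst[dst_y + row][dst_x + col]
            dst[dst_y + row][dst_x + col] = rop(s, d)


# 1-bit bitmaps stored as lists of rows (hypothetical layout).
screen = [[0] * 6 for _ in range(4)]
sprite = [[1, 1],
          [1, 0]]

# Copy the 2x2 sprite to column 2, row 1 of the screen.
bitblt(screen, sprite, 2, 1, 0, 0, 2, 2)
```

Passing `rop=lambda s, d: s ^ d` gives an XOR blit; applying it twice with the same block restores the destination, a property often used for cursors and simple sprites. A hardware blitter performs the same block transfer without occupying the host CPU.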
History
Modern GPUs are descended from the monolithic graphics chips of the late 1970s and 1980s. These chips had limited BitBLT support in the form of sprites (if they had BitBLT support at all), and usually had no shape-drawing support. Some GPUs could run several operations in a display list, and could use DMA to reduce the load on the host processor; an early example was the ANTIC co-processor used in the Atari 800 and Atari 5200. In the late 1980s and early 1990s, high-speed, general-purpose microprocessors became popular for implementing high-end GPUs. Several (very expensive) graphics boards for PCs and computer workstations used digital signal processor chips (like TI's TMS340 series) to implement fast drawing functions, and many laser printers shipped with a PostScript raster image processor (a special case of a GPU) running on a RISC CPU like the AMD 29000.
As chip process technology improved, it eventually became possible to move drawing and BitBLT functions onto the same board (and, eventually, into the same chip) as a regular frame buffer controller such as VGA. These cut-down "2D accelerators" weren't as flexible as microprocessor-based GPUs, but were much easier to make and sell. The Commodore Amiga was the first mass-market computer to include a blitter in its video hardware, and IBM's 8514 graphics system was one of the first PC video cards to implement 2D primitives in hardware.
By the early 1990s, the rise of Microsoft Windows sparked a surge of interest in high-speed, high-resolution 2D bitmapped graphics (which had previously been the domain of Unix workstations and the Apple Macintosh). For the PC market, the dominance of Windows meant PC graphics vendors could now focus development effort on a single programming interface, GDI.
In 1991, S3 Graphics introduced the first single-chip 2D accelerator, the S3 86C911 (which its designers named after the Porsche 911 as an indication of the speed increase it promised). The 86C911 spawned a host of imitators: by 1995, all major PC graphics chip makers had added 2D acceleration support to their chips. By this time, fixed-function Windows accelerators had surpassed expensive general-purpose graphics coprocessors in Windows performance, and these coprocessors faded away from the PC market.
Throughout the 1990s, 2D GUI acceleration continued to evolve. As manufacturing capabilities improved, so did the level of integration of graphics chips. Video acceleration became popular as standards such as VCD and DVD arrived, and the Internet grew in popularity and speed. Additional APIs arrived for a variety of tasks, such as Microsoft's WinG graphics library for Windows 3.x, and their later DirectDraw interface for hardware acceleration of 2D games within Windows 95 and later.
In the mid-1990s, computer CPUs were becoming powerful enough to handle real-time 3D graphics. Graphics chip manufacturers scrambled to be the first to add hardware 3D acceleration to their product line-ups. Notable failed first attempts were the S3 ViRGE, ATI Rage, and Matrox Mystique. These chips were essentially previous-generation 2D accelerators with 3D features bolted on; many were even pin-compatible with the earlier-generation chips for ease of implementation and minimal cost. Initially, high-performance 3D graphics were possible only with separate add-on boards dedicated to accelerating 3D functions (and lacking 2D GUI acceleration entirely), such as the 3dfx Voodoo. However, as manufacturing technology progressed, video, 2D GUI acceleration, and 3D functionality were all integrated into one chip. Rendition's Verite chipsets were the first to do this well enough to be worthy of note.
As DirectX advanced steadily from a rudimentary (and perhaps tedious) API for game programming to become the leading 3D graphics programming interface, 3D accelerators evolved rapidly. Direct3D 5.0 was the first version of the API to dominate the market and displace many of the proprietary interfaces. Direct3D 7.0 introduced support for hardware-accelerated transform and lighting (T&L). With it, 3D accelerators moved beyond being simple rasterizers and added another significant hardware stage to the 3D rendering pipeline. The NVIDIA GeForce 256 (a.k.a. NV10) was the first card on the market with this capability. Hardware T&L set the precedent for the later, far more flexible and programmable pixel shader and vertex shader units.
With the advent of the DirectX 8.0 API and similar functionality in OpenGL, GPUs added programmable shading to their capabilities. Each pixel could now be processed by a short program that could include additional image textures as inputs, and each geometric vertex could likewise be processed by a short program before it was projected onto the screen. NVIDIA was again first to market with a chip capable of programmable shading, the GeForce 3 (a.k.a. NV20). By October 2002, with the introduction of the ATI Radeon 9700 (a.k.a. R300), the world's first Direct3D 9.0 accelerator, pixel and vertex shaders could implement looping and lengthy floating-point math, and in general were quickly becoming as flexible as CPUs, and orders of magnitude faster for image-array operations.
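The idea of a short per-pixel program can be sketched in ordinary software. The example below is only a CPU-side model, not real shader code for DirectX or OpenGL; the names `sample`, `pixel_shader`, and `render`, the nearest-neighbour lookup, and the checkerboard texture are all assumptions chosen for illustration.

```python
def sample(texture, u, v):
    """Nearest-neighbour texture lookup; u and v lie in [0, 1]."""
    height = len(texture)
    width = len(texture[0])
    x = min(int(u * width), width - 1)
    y = min(int(v * height), height - 1)
    return texture[y][x]


def pixel_shader(u, v, texture, tint):
    """Short per-pixel program: fetch a texel and modulate it by a tint."""
    texel = sample(texture, u, v)
    return tuple(t * c for t, c in zip(texel, tint))


def render(width, height, shader, *args):
    """Run the shader once for every output pixel (sampled at pixel centres)."""
    return [[shader((x + 0.5) / width, (y + 0.5) / height, *args)
             for x in range(width)]
            for y in range(height)]


# 2x2 checkerboard texture with RGB components in [0, 1].
checker = [[(1.0, 1.0, 1.0), (0.0, 0.0, 0.0)],
           [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]]

# Stretch the texture over a 4x4 image and tint it orange.
image = render(4, 4, pixel_shader, checker, (1.0, 0.5, 0.0))
```

On a GPU, the loop in `render` is replaced by hardware that runs the shader for many pixels at once; only the body of `pixel_shader` is written by the programmer.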
Today, parallel GPUs have begun making computational inroads against the CPU, and a subfield of research, dubbed GPGPU (General-Purpose Computing on GPU), has found its way into fields as diverse as oil exploration, scientific image processing, and even stock options pricing.
Current GPU capabilities
Modern GPUs use most of their transistors to do calculations related to 3D computer graphics. They were initially used to accelerate the memory-intensive work of texture mapping and rendering polygons, later adding units to accelerate geometric calculations such as translating vertices into different coordinate systems. Recent developments in GPUs include support for programmable shaders which can manipulate vertices and textures with many of the same operations supported by CPUs, oversampling and interpolation techniques to reduce aliasing, and very high-precision color spaces. Because most of these computations involve matrix and vector operations, engineers and scientists have increasingly studied the use of GPUs for non-graphical calculations.
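Translating vertices between coordinate systems, as described above, amounts to multiplying each vertex by a 4x4 matrix in homogeneous coordinates. The following is a minimal sketch of that calculation; the helper names `mat_vec` and `translation` and the sample triangle are assumptions for this example.

```python
def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-component vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def translation(tx, ty, tz):
    """Homogeneous matrix that moves a point by (tx, ty, tz)."""
    return [[1, 0, 0, tx],
            [0, 1, 0, ty],
            [0, 0, 1, tz],
            [0, 0, 0, 1]]


# Triangle in model space; the trailing 1 is the homogeneous coordinate.
triangle = [[0, 0, 0, 1],
            [1, 0, 0, 1],
            [0, 1, 0, 1]]

# Move the model to its position in world space.
model_to_world = translation(10, 0, -5)
world = [mat_vec(model_to_world, v) for v in triangle]
```

Each vertex is transformed independently of the others, which is why this work maps well onto parallel hardware, and why the same hardware is attractive for other matrix and vector calculations.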
Because all these applications go beyond a GPU's intended use, a new term, GPGPU, is usually employed to describe them. Although the chips used for GPGPU work are the same chips as ordinary GPUs, manufacturers face increasing pressure from "GPGPU users" to improve hardware design, usually by adding more flexibility to the programming model.
Although modern PC GPUs feature some programmability in the form of 3D shaders (as seen in tools such as the NVIDIA Cg toolkit and ATI RenderMonkey), this should not be confused with general software programmability. Instead, these units may operate as SIMD or sometimes MIMD parallel processors. Operations on rectangular arrays of colored pixels are a prime application for these, and many array algorithms, though not all, can be adapted to GPUs for extremely high throughput.
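The kind of array algorithm that adapts well is one in which every output element can be computed independently from the input. A 3x3 box blur is a common example; the sketch below runs sequentially on a CPU, and the function names and the small grayscale test image are assumptions for illustration.

```python
def box_blur_pixel(image, x, y):
    """Average of a pixel and its in-bounds neighbours in a 3x3 window."""
    height = len(image)
    width = len(image[0])
    total = 0
    count = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            nx, ny = x + dx, y + dy
            if 0 <= nx < width and 0 <= ny < height:
                total += image[ny][nx]
                count += 1
    return total / count


def box_blur(image):
    """Apply box_blur_pixel at every position.

    Each output value reads only the input image, so all positions
    could be evaluated at the same time on parallel hardware.
    """
    return [[box_blur_pixel(image, x, y) for x in range(len(image[0]))]
            for y in range(len(image))]


gray = [[0, 0, 0],
        [0, 9, 0],
        [0, 0, 0]]
blurred = box_blur(gray)
```

By contrast, an algorithm written so that each step needs the result of the previous step has no such independent per-element form, and must be restructured, if that is possible at all, before it benefits from a GPU.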
In addition to the 3D hardware, today's GPUs include basic 2D acceleration and frame buffer capabilities (usually with a VGA compatibility mode). Most GPUs made since 1995 also support the YUV color space and hardware overlays (important for digital video playback), and many GPUs made since 2000 support MPEG primitives such as motion compensation and iDCT. The newest graphics cards even decode high-definition video on the card, taking some load off the central processing unit.
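Hardware YUV support matters because decoded video must be converted to RGB for every pixel of every frame before display. The sketch below shows one common form of that conversion, assuming 8-bit full-range samples and the BT.601 coefficients used by JFIF; actual video streams often use a limited sample range or different coefficients, so this is an illustration rather than the conversion any given GPU performs.

```python
def clamp(value):
    """Round to the nearest integer and limit to the 8-bit range 0..255."""
    return max(0, min(255, int(round(value))))


def ycbcr_to_rgb(y, cb, cr):
    """Convert one full-range BT.601 YCbCr sample to 8-bit RGB.

    Assumes Cb and Cr are stored with an offset of 128 (JFIF style).
    """
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return clamp(r), clamp(g), clamp(b)


# With neutral chroma (Cb = Cr = 128) the result is a shade of gray.
mid_gray = ycbcr_to_rgb(128, 128, 128)
```

Performing this arithmetic in the overlay hardware spares the CPU several multiplications per pixel during playback.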
A typical modern GPU in a PC sits on a separate graphics card, connected to the CPU and main RAM via an expansion bus (such as AGP or PCI Express). A GPU will typically be able to address an amount of high-performance VRAM located on the expansion card. For example, many modern cards have 512 MiB of VRAM, with some having much more. Alternatively, many motherboards have a GPU integrated into the northbridge that uses main memory as a frame buffer and requires the CPU to aid in frame rendering. This is usually a cheaper solution than an independent GPU, but it has lower performance. Integrated motherboards may or may not have a slot for a stand-alone graphics card. The latest integrated motherboards from Intel have 128 MB of VRAM and also a slot for another graphics card. However, only one graphics processor can run at a time.
GPU manufacturers
- NVIDIA Corporation
- ATI Technologies
- 3Dlabs
- Matrox
- XGI Technology Inc.
- Intel
- 3Dfx (now part of NVIDIA)