R600 (ASIC)
Release date | 2006-2007 |
---|---|
Codename | Pele |
Cards | |
Entry-level | Radeon HD 2400 |
Mid-range | Radeon HD 2600 |
High-end | Radeon HD 2900 |
DirectX | 10.0, Shader Model 4.0 |
The graphics processing unit (GPU) codenamed R600 is the foundation of the Radeon HD 2000 series and the FireGL 2007 series video cards developed by ATI Technologies.
It features unified shaders and is compatible with Direct3D 10.0's Shader Model 4.0 along with OpenGL 2.0.[1] The first product of the line, the Radeon HD 2900 XT, was launched on May 14, 2007, with variants for other market segments following throughout 2007.
Architecture
Unified shaders
The "R600" is the first personal computer graphics processing unit (GPU) from ATI based on a unified shader architecture. It is ATI's second-generation unified shader design and is based on the "Xenos" GPU implemented in the Xbox 360 game console, which used the world's first such shader architecture. Previous GPU architectures implement separate processors for each type of graphics function. A unified architecture instead uses many flexible processors which can be scheduled to process a variety of shader types, thereby significantly increasing GPU throughput (dependent on application instruction mix as noted below). The R600 core processes vertex, geometry, and pixel shaders as outlined by the Direct3D 10.0 specification for Shader Model 4.0. The hardware is also described as fully OpenGL 2.1 compliant [2], although the product page lists only OpenGL 2.0 support [1].
The new unified shader functionality is based upon a very long instruction word (VLIW) architecture in which the core executes operations in parallel.[2] The R600 uses 64 superscalar unified shader clusters, each consisting of 5 stream processing units, for a total of 320 stream processing units [2]. The RV610 and RV630 variants have some of the shaders removed from the array, containing a total of 40 (5×8) and 120 (5×24) stream processors, respectively. Each of the first four stream processing units can retire one single-precision floating-point MAD (or ADD or MUL) instruction per clock, and can also handle dot product (DP, special-cased by combining ALUs) and integer ADD operations [3]. The fifth unit is more complex and can additionally handle special transcendental functions such as sine and cosine [3]. Each of the 64 shader clusters can execute 6 instructions per clock cycle (peak), consisting of 5 shading instructions plus 1 branch [3].
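The stream processor totals quoted above follow directly from the cluster counts. A minimal sketch of that arithmetic is given here; the dictionary and function names are illustrative only, and the peak instruction figure is stated in the text for the R600 only.

```python
# Illustrative arithmetic only; cluster counts are the ones quoted in the text.
UNITS_PER_CLUSTER = 5  # stream processing units per shader cluster

# Shader clusters per chip: 64 for R600, 24 for RV630 (5x24), 8 for RV610 (5x8).
CLUSTERS = {"R600": 64, "RV630": 24, "RV610": 8}


def stream_processors(chip: str) -> int:
    """Total stream processing units = clusters x units per cluster."""
    return CLUSTERS[chip] * UNITS_PER_CLUSTER


def r600_peak_instructions_per_clock() -> int:
    """Peak per clock for R600: 5 shading instructions + 1 branch per cluster."""
    return CLUSTERS["R600"] * (UNITS_PER_CLUSTER + 1)


for chip in CLUSTERS:
    print(chip, stream_processors(chip))
print("R600 peak instructions per clock:", r600_peak_instructions_per_clock())
```

These are peak figures; as the following paragraph notes, the rate actually achieved depends on the instruction mix.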
Notably, the VLIW architecture brings with it some classic challenges inherent to VLIW designs, namely that of maintaining optimal instruction flow [2]. Additionally, the chip cannot co-issue instructions when one is dependent on the results of the other. Performance of the GPU is highly dependent on the mixture of instructions being used by the application and how well the real-time compiler in the driver can organize said instructions [3].
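The effect of dependencies on co-issue can be illustrated with a toy model. The sketch below is not the driver's actual compiler: it is a simple in-order packer, written for illustration, that fills bundles of up to five slots and starts a new bundle whenever an instruction reads a result produced in the current bundle. The instruction format (a destination name plus a tuple of source names) is an assumption made for this example.

```python
from typing import List, Tuple

# Hypothetical instruction format: (destination register, source registers).
Instr = Tuple[str, Tuple[str, ...]]


def pack_bundles(instrs: List[Instr], width: int = 5) -> List[List[Instr]]:
    """Pack instructions in order into bundles of at most `width` slots.

    An instruction may not share a bundle with an earlier instruction whose
    result it reads (only true dependencies are modelled here).
    """
    bundles: List[List[Instr]] = []
    current: List[Instr] = []
    written = set()  # destinations produced in the current bundle
    for dest, srcs in instrs:
        depends = any(s in written for s in srcs)
        if current and (depends or len(current) == width):
            bundles.append(current)
            current, written = [], set()
        current.append((dest, srcs))
        written.add(dest)
    if current:
        bundles.append(current)
    return bundles


def utilization(bundles: List[List[Instr]], width: int = 5) -> float:
    """Fraction of issue slots actually filled."""
    if not bundles:
        return 0.0
    return sum(len(b) for b in bundles) / (len(bundles) * width)


independent = [("r0", ("a", "b")), ("r1", ("c", "d")), ("r2", ("e", "f")),
               ("r3", ("g", "h")), ("r4", ("i", "j"))]
chain = [("r0", ("a", "b")), ("r1", ("r0", "c")), ("r2", ("r1", "d"))]

print(len(pack_bundles(independent)), utilization(pack_bundles(independent)))
print(len(pack_bundles(chain)), utilization(pack_bundles(chain)))
```

In this model five independent instructions fill a single bundle, while a chain of three dependent instructions needs three bundles and leaves most slots empty.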
Hardware tessellation
The GPU is equipped with an extra feature which is not part of the current DirectX 10.0 specification. It contains programmable tessellation units, similar to those within the Xenos GPU (codenamed "C1") also developed by ATI. This unit allows a developer to take a simple polygon mesh and subdivide it based on a curved surface evaluation function. Supported tessellation forms include Bézier surfaces with N-patches, B-splines and NURBS, and some subdivision surface techniques, usually combined with a displacement map texture [4]. Essentially, this allows the polygon density of a simple, low-polygon model to be increased dramatically in real time with minimal performance loss. Scott Wasson of Tech Report noted during an AMD demo of the technology that the resulting model was so dense with millions of polygons that it appeared to be solid [2].
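The idea of subdividing coarse geometry by evaluating a curve function can be sketched in two dimensions. The example below is only an analogy and does not describe the R600 hardware: it evaluates a quadratic Bézier curve with de Casteljau's algorithm at evenly spaced parameter values, turning three control points into as many line segments as requested. All names are illustrative.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def lerp(a: Point, b: Point, t: float) -> Point:
    """Linear interpolation between two points."""
    return (a[0] + (b[0] - a[0]) * t, a[1] + (b[1] - a[1]) * t)


def bezier_point(p0: Point, p1: Point, p2: Point, t: float) -> Point:
    """Evaluate a quadratic Bezier curve at t using de Casteljau's algorithm."""
    return lerp(lerp(p0, p1, t), lerp(p1, p2, t), t)


def tessellate(p0: Point, p1: Point, p2: Point, segments: int) -> List[Point]:
    """Return segments + 1 points along the curve; more segments, smoother curve."""
    return [bezier_point(p0, p1, p2, i / segments) for i in range(segments + 1)]


coarse = tessellate((0.0, 0.0), (1.0, 2.0), (2.0, 0.0), 2)
fine = tessellate((0.0, 0.0), (1.0, 2.0), (2.0, 0.0), 16)
print(len(coarse), len(fine))
```

A surface tessellator applies the same principle in two parameters, and a displacement map can then offset the generated vertices.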
This unit is reminiscent of ATI's earlier "TruForm" technology, used initially in the Radeon 8500, which performed a similar function in hardware [5]. This tessellation hardware is not part of the current OpenGL or Direct3D requirements, and competitors such as the GeForce 8 series lack similar hardware, but Microsoft has included tessellation as part of its future D3D10.1 plans [6]. TruForm received little attention from software developers and was used in only a few game titles (such as Madden NFL 2004, Serious Sam, Unreal Tournament 2003 and 2004, and unofficially Morrowind), because it was not a feature shared with NVIDIA GPUs, which had a competing tessellation solution using Quintic-RT patches that met with even less support from developers [7]. Since the Xenos contains similar hardware, and Microsoft sees hardware surface tessellation as a major GPU feature, with hardware tessellation support proposed for future DirectX releases (presumably DirectX 11) [6][4], dedicated hardware tessellation units may receive increased developer attention in future titles. It remains to be seen whether ATI's implementation will be compatible with the eventual DirectX standard.
Ultra threaded dispatch processor
Although the R600 is a significant departure from previous designs, it still shares many features with its predecessors [2]. The "Ultra-Threaded Dispatch Processor" is a major architectural component of the R600 core, just as it was with the Radeon X1000 GPUs. This processor manages a large number of in-flight threads of three distinct types (vertex, geometry, and pixel shaders) and switches amongst them as needed [2]. With a large number of threads being managed simultaneously, it is possible to reorganize thread order to make optimal use of the shaders. In effect, the dispatch processor is the "boss" that decides what goes on in the other parts of the R600 and attempts to keep processing efficiency as high as possible. There are lower levels of "management" as well; each SIMD array of 80 stream processors has its own sequencer and arbiter. The arbiter decides which thread to process next, while the sequencer attempts to reorder instructions for the best possible performance within each thread [2].
Texturing, memory, and anti-aliasing
Texturing and final output on the R600 core are similar to, but also distinct from, those of the R580. R600 is equipped with 4 texture units that are decoupled (independent) from the shader core, as in the R520 and R580 GPUs [2]. The render output units (ROPs) of R600, however, function differently in many ways from those of the R580 core and its predecessors. A new addition is support for up to 8× multi-sample anti-aliasing (MSAA) using programmable sample grids. Also new is the capability to filter FP16 textures, popular with HDR lighting, at full speed. This totals 16 pixels per clock for FP16 textures, while higher-precision FP32 textures filter at half speed (8 pixels per clock) [2]. R600 can also perform trilinear and anisotropic filtering on all texture formats. The Radeon X1000 series performed this filtering within the pixel shader processors, which was dramatically more time consuming [2].
Anti-aliasing capabilities are more robust on R600 than on the R520 series. In addition to the ability to perform 8× MSAA, up from 6× MSAA on the R300 through R580, R600 has a new "custom filter anti-aliasing" (CFAA) mode. CFAA refers to an implementation of non-box filters that look at pixels around the particular pixel being processed in order to calculate the final color and anti-alias the image [3]. This feature is performed by shader processing, instead of entirely in the ROPs, as anti-aliasing has traditionally been implemented. This brings greatly enhanced programmability because the filters can be customized, but may also bring performance issues because of the use of shader resources. At the launch of R600, CFAA uses wide and narrow tent filters. With these, samples from outside the pixel being processed are weighted linearly based upon their distance from the centroid of that pixel, with the linear function adjusted according to whether the wide or narrow filter is chosen [3].
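A tent filter of the kind described can be sketched as a weight that falls off linearly with distance from the pixel centroid. The code below is an illustration under stated assumptions, not ATI's implementation: the filter radii used in the example (1.0 and 2.0 pixels) are hypothetical values chosen to contrast a narrow and a wide filter, and the function names are invented for this sketch.

```python
import math
from typing import List, Tuple

Sample = Tuple[Tuple[float, float], float]  # ((x, y) position, sample value)


def tent_weight(distance: float, radius: float) -> float:
    """Weight falls linearly from 1 at the centroid to 0 at `radius`."""
    return max(0.0, 1.0 - distance / radius)


def resolve(samples: List[Sample], center: Tuple[float, float], radius: float) -> float:
    """Weighted average of sample values around `center` using a tent filter."""
    total, total_weight = 0.0, 0.0
    for (x, y), value in samples:
        d = math.hypot(x - center[0], y - center[1])
        w = tent_weight(d, radius)
        total += w * value
        total_weight += w
    return total / total_weight if total_weight > 0 else 0.0


# One sample at the pixel centroid and one a full pixel away (in a neighbour).
samples = [((0.0, 0.0), 1.0), ((1.0, 0.0), 0.0)]
print(resolve(samples, (0.0, 0.0), 1.0))  # narrow: the neighbour gets zero weight
print(resolve(samples, (0.0, 0.0), 2.0))  # wide: the neighbour contributes
```

A wider radius pulls in more neighbouring samples, which smooths edges further at the cost of some sharpness.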
Internal functional units of the R600 core are connected by an internal 1024-bit bidirectional ring bus (512-bit read and 512-bit write) which wraps around the processor. The ring bus connects at various points to the external memory chips via eight 64-bit memory channels, for a total bus width of 512 bits on the 2900 XT [2]. The large bus width allows the 2900 XT to use lower-clocked memory while still providing a large amount of memory bandwidth.
Video processing and miscellaneous features
ATI has built in an HDMI interface with 5.1 audio playback support. The "Rage Theater" chip used on the Radeon X1000 series was replaced with the digital "Rage Theater 200" chip, providing VIVO capabilities. Among other details, the Radeon HD 2000 series graphics cards feature dual-link DVI output with HDCP and provide a specially designed DVI-to-HDMI dongle for HDMI output that carries both audio and video.
The most recent R600 members, RV610 and RV630, feature ATI's Unified Video Decoder (UVD) for hardware decoding of MPEG2, MPEG4 and VC-1 video streams, which is itself the major part of AVIVO HD technology. In terms of functionality, NVIDIA's PureVideo 2 offers similar hardware video acceleration, with UVD going one step further thanks to greater VC-1 offloading.
All Radeon HD 2000 series graphics cards support native CrossFire. CrossFire efficiency has improved with the R600 core and shows performance approaching the theoretical maximum of twice the performance of a single card [2][8].
While some of the architecture of R600 is similar to Xenos, R600 does not feature the embedded DRAM (eDRAM) frame buffer used with Xenos. Xenos' eDRAM is designed tightly around the limited resolutions at which the Xbox 360 operates. Personal computers operate at maximum efficiency at a much wider range of resolutions, which would require a significantly larger amount of eDRAM to be effective.
Lineup
Desktop products
The R600 family is called the Radeon HD 2000 series. The enthusiast segment is the "Radeon HD 2900 series", which currently comprises the Radeon HD 2900 XT with GDDR3 memory, released on May 14, and the higher-clocked GDDR4 version released in early July. The mainstream and value segment products are the "Radeon HD 2600" and the "Radeon HD 2400" series respectively, both launched June 28, 2007.[9] Initially no HD 2000 series products were offered in the performance segment, and AMD used models from the previous generation to address that market. This did not change until the release of two variants of the Radeon HD 2900 series, the Radeon HD 2900 Pro and GT, which filled the gap in the performance market for a short period of time.
The desktop product lineup will be refreshed with the arrival of the performance market-oriented Radeon HD 3800 series, based on a "die shrink" version of R600 on a 55 nm process. Two variants, the Radeon HD 3850 and the 3870, will be available in mid-November 2007. An enthusiast variant, the Radeon HD 3870 X2, with two RV670 cores on a single PCB, is to be launched in February 2008.
Mobile products
Both the Mobility HD 2400 and HD 2600 series share the same feature set as their desktop counterparts, with the addition of the battery-conserving PowerPlay 7.0 features, which are enhanced from the previous generation's PowerPlay 6.0.
The Mobility Radeon HD 2400 is offered in two model variants: the standard HD 2400 [10] and the HD 2400 XT [11]. The HD 2400 is currently shipping in laptops like the Toshiba P200 series [12].
The Mobility Radeon HD 2600 is also available in the same two flavours: the plain HD 2600 [13] and, at the top of the current mobility lineup, the HD 2600 XT [14]. The HD 2600 is currently shipping as an option for Toshiba A200 notebooks [15] and the HD 2600 XT is available in mobile solutions like the HP HDX Entertainment Notebook [16].
The Mobility Radeon HD 2300 is a value product which includes UVD in silicon but lacks the unified shader architecture and DirectX 10.0 / SM 4.0 support, limiting support to DirectX 9.0c / SM 3.0 using the more traditional architecture of the previous generation.
Variants
Radeon HD 2900
The Radeon HD 2900 series is ATI's high-end product with 320 stream processors on a 420 mm² die [17]. The Radeon HD 2900 XT is the first graphics card product to implement digital PWM onboard, specifically 7-phase PWM.
The R600 core used in the HD 2900 lacks the ATI Unified Video Decoder (UVD) required for hardware acceleration of certain types of HD video [18]. Nonetheless, the card is fully capable of playing any HD video format, although shaders are used for the decoding process. Initially there was much confusion as to whether or not the product included dedicated video processor hardware, due in part to statements that it supported AVIVO HD. Many reviewers, and subsequently readers and consumers, interpreted this as meaning the HD 2900 incorporated the same UVD hardware as found in the HD 2400 and HD 2600 series, despite some sites noting the difference at launch time [19], weeks before the issue first gained traction as a result of a TechReport article [20]. This confusion and the subsequent discussions prompted AMD to make a formal statement clarifying exactly which models include UVD [21][22]. The HD 2900 XT video playback capabilities are similar to those of the previous X1000 cards with AVIVO capabilities.
Starting in August 2007, some system builders including Falcon Northwest received the 1 GB GDDR4 version of the Radeon HD 2900 XT (with Samsung 0.9 ns (K4U52324QE-BC09) GDDR4). This was incorrectly referred to as the "Radeon HD 2900 XTX" [23].
Variants of the series include the Radeon HD 2900 Pro and the Radeon HD 2900 GT. The Radeon HD 2900 Pro uses the same R600 GPU, but is clocked lower at 600 MHz core and 800 MHz memory (1600 MHz effective). This variant is configured with 512 MB or 1 GB (GDDR3/GDDR4) of video memory and the same 512-bit memory controller as the Radeon HD 2900 XT instead of the previously rumoured 256-bit memory controller [24]. The Radeon HD 2900 GT is a 240 stream processor variant clocked the same as the HD 2900 Pro. It has 256 MB of video memory on a 256-bit interface.
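The relationship between bus width, memory clock and bandwidth can be worked through with the figures given above. This is a sketch of the standard theoretical-peak calculation (bytes per transfer multiplied by transfers per second), using decimal gigabytes; the function name is illustrative, and the HD 2900 GT figure assumes its memory clock matches the HD 2900 Pro as stated.

```python
def peak_bandwidth_gb_per_s(bus_width_bits: int, effective_clock_mhz: float) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * effective_clock_mhz / 1000


# Eight 64-bit channels make up the 512-bit interface.
assert 8 * 64 == 512

# HD 2900 Pro: 512-bit bus, 800 MHz memory (1600 MHz effective).
print(peak_bandwidth_gb_per_s(512, 1600))  # about 102.4 GB/s

# HD 2900 GT: 256-bit bus at the same memory clock.
print(peak_bandwidth_gb_per_s(256, 1600))  # about 51.2 GB/s
```

Halving the bus width at the same clock halves the peak bandwidth, which is why a wide bus lets a card use lower-clocked memory.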
Radeon HD 2600
The Radeon HD 2600 series is a line of mainstream products with 120 stream processors, GDDR4 support, AVIVO HD with UVD, a 128-bit memory ring bus [25] and 4-phase digital PWM, on a 153 mm² die [26]. Neither the GDDR3 nor the GDDR4 reference PCI-E design requires additional power connectors, whereas the HD 2600 XT AGP variants require additional power through either Molex or 6-pin power connectors.[27] Official claims state the Radeon HD 2600 series consumes as little as 45 W of power [citation needed].
Another variant incorporates two RV630 cores onto a single PCB with a PCI-E bridge splitting the PCI-E x16 bandwidth into two groups of PCI-E x8 lanes (each 2.0 Gb/s). This functionally provides a CrossFire configuration on one video card with a total of four DVI outputs (HDMI output via dongle) with HDCP. AMD calls this product the "Radeon HD 2600 X2", as seen from some vendors and as observed inside the INF file of Catalyst 7.9 version 8.411. Sapphire and other vendors including PowerColor and GeCube have either announced [28] or demonstrated their respective "CrossFire on a card" products. Catalyst 7.9 added support for this hardware in September 2007. However, AMD did little to promote it. A vendor may offer cards containing 512 MB or 1 GB of video memory. Although the memory technology used is at the vendor's discretion, most vendors have opted for GDDR3 and DDR2 because of lower manufacturing cost and the positioning of this product for the mainstream rather than the performance market segment.
Radeon HD 2400/2350
These are low-end products with 40 stream processors, AVIVO HD and UVD, and a 64-bit memory bus [25] without a ring bus memory interface, on an 85 mm² die [29]. The official PCB design uses only a passive heatsink instead of a fan, and official power consumption claims are as low as 35 W. The RV610 core used in the Radeon HD 2400 series has a 16 KB unified vertex/texture cache, in contrast to the dedicated vertex cache and L1/L2 texture cache used in HD 2600 and HD 2900 products.
Reports have it that the first batch of RV610 cores (silicon revision A12), released only to system builders, has a bug that prevents the UVD from working properly, while other parts of the die operate normally. These products were named the Radeon HD 2350 series and have been supported since the release of the Catalyst 7.10 driver [30].
Radeon HD 3800
The Radeon HD 3000 series is based on the RV670 graphics chip, manufactured on a 55 nm fabrication process with a 256-bit memory controller, a die size of 192 mm² and 666 million transistors [31]. The Radeon HD 3000 series supports DirectX 10.1 and Shader Model 4.1 [32] with support for double-precision floating-point operations. The UVD has also been implemented on-die, providing full hardware decoding of VC-1 and H.264 video streams.
The Radeon HD 3800 series will also see the implementation of a power state controller as well as PowerPlay technology for desktop graphics, allowing Catalyst Control Center to monitor GPU utilization and further reduce power draw by switching the GPU core between states with different performance settings for different usage scenarios [33].
The change of product branding was believed to reflect reluctance to continue the Radeon HD 2000 series branding [citation needed], as that series lagged more than half a year behind its original release schedule and had inferior benchmark results against its designated competitor, the GeForce 8800 series [34]. It was also believed that the change reflected the additional circuitry made possible by a "die shrink" (manufacturing the die on a smaller fabrication process), which allows more circuitry to be included on a smaller die and implies greater performance than the Radeon HD 2900 series.
Also notable is that the naming scheme changes with the Radeon HD 3800 series. The previous PRO, XT, GT, and XTX suffixes will be eliminated, and products will instead be differentiated by the last two digits of the model number (for instance, HD 3850 and HD 3870, indicating that the HD 3870 has higher performance than the HD 3850) [35]. Similar changes to IGP naming were spotted as well: the IGP of the AMD M690T chipset with side-port memory is named "Radeon X1270", the IGP of the AMD 690G chipset is named "Radeon X1250", and the IGP of the AMD 690V chipset, which is clocked lower and has fewer functions, is named "Radeon X1200". The changes to the naming scheme of video cards starting from the Radeon HD 3800 series are shown below:
Product category | Model number range (steps of 10) [note 1] | Price range (USD) | Shader amount (VS/PS/SPU) [note 2] | Memory type | Memory width (bit) | Memory size (MiB) | Outputs | Example products |
---|---|---|---|---|---|---|---|---|
Enthusiast/high-end | 800-990 | >$150 | 75-100% | GDDR3, GDDR4 | 256/512 | 512/1024 | Dual DVI with HDMI/DP (dongle) | HD 3870, HD 3850 |
Mainstream | 400-790 | $100-$150 | 37.5-75% | DDR2, GDDR3, GDDR4 | 128 | 128/256/512 | D-Sub, DVI/Dual DVI with HDMI/DP (dongle) | None as of today |
Budget/Value | 000-390 | <$99 | 25-50% | DDR2, GDDR3 | 64 | 64/128 (HM: 768/1024) | D-Sub, DVI with HDMI/DP (dongle) | Radeon X1270 (IGP), Radeon X1250 (IGP), Radeon X1200 (IGP) |
- 1 The last two digits denote the variant, similar to the previous suffixes: "70" is in essence the "XT" variant, while "50" is the "Pro" variant [31].
- 2 Stream processors are only applicable to Direct3D 10-class video components (Radeon HD 2000/3000 series).
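The numbering scheme in the table can be expressed as a small lookup. The sketch below is an illustration of the ranges and suffix equivalents given above, not an official AMD tool; the function name and the returned labels are invented for this example, and only the "70" and "50" endings are mapped because those are the only ones the text describes.

```python
from typing import Tuple


def classify_model(model_number: int) -> Tuple[str, str]:
    """Map a model number to (market segment, old-suffix equivalent)."""
    if model_number % 10 != 0:
        raise ValueError("model numbers in this scheme advance in steps of 10")
    last_three = model_number % 1000
    if last_three >= 800:
        segment = "Enthusiast/high-end"  # 800-990
    elif last_three >= 400:
        segment = "Mainstream"           # 400-790
    else:
        segment = "Budget/Value"         # 000-390
    # Only the two endings described in the footnote are mapped.
    variant = {70: "XT", 50: "Pro"}.get(model_number % 100, "unspecified")
    return segment, variant


print(classify_model(3870))  # ('Enthusiast/high-end', 'XT')
print(classify_model(3850))  # ('Enthusiast/high-end', 'Pro')
print(classify_model(1250))  # ('Budget/Value', 'Pro')
```

The last call corresponds to the Radeon X1250 IGP, which the table lists as a budget example.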
The Radeon HD 3800 series will also have one more variant, the Radeon HD 3870 X2 to be released in February 2008, featuring two RV670 cores, maximum 1024 MiB GDDR3 or GDDR4, targeting the enthusiast market, replacing the Radeon HD 2900 XT.
Driver support
The latest driver is Catalyst 7.10, package version 8.421 [36]. It introduces software CrossFire for HD 2600 and HD 2400 series video cards, as well as a list of improvements in gameplay experience for single-card and CrossFire setups on all Radeon HD 2000 series video cards.
The Purple Pill tool issue, which could allow unsigned drivers to be loaded into Windows Vista and tamper with the operating system kernel [37], was resolved in the Catalyst 7.8 release (version 8.401) [38].
Catalyst 7.9, package version 8.411, has added the AVIVO video converter for Windows Vista, and color temperature control in Catalyst Control Center.
References
- ^ Radeon HD 2900 product page, last line: "OpenGL 2.0 support"
- ^ a b c d e f g h i j k l m Wasson, Scott. AMD Radeon HD 2900 XT graphics processor: R600 revealed, Tech Report, May 14, 2007
- ^ a b c d e f Beyond3D review: AMD R600 Architecture and GPU Analysis, retrieved June 2, 2007.
- ^ a b ExtremeTech review
- ^ Witheiler, Matthew. ATi TRUFORM Technology - Powering the next generation Radeon, AnandTech, May 29, 2001.
- ^ a b The Future of DirectX presentation, slide 24-29
- ^ nVidia GeForce3 SDK WhitePaper
- ^ Wilson, Derek. ATI Radeon HD 2900 XT: Calling a Spade a Spade: Multi-GPU Performance - Prey, AnandTech, May 14, 2007.
- ^ HD2400 & HD2600 Press release
- ^ HD 2400 Specs
- ^ HD 2400XT spec
- ^ Toshiba P200 page
- ^ HD 2600 specs
- ^ HD 2600 XT specs
- ^ Toshiba's A200 specs
- ^ HP HDX spec page
- ^ Beyond3D R600 review, retrieved September 25, 2007
- ^ AnandTech image showing AVIVO HD consists of UVD and Advanced Video Processor (AVP)
- ^ EliteBastards' HD2000 preview, retrieved July 23, 2007.
- ^ TechReport UVD article
- ^ AMD press release, the third paragraph: "AMD also wishes to clarify any confusion that may exist regarding the presence of the Unified Video Decoder (UVD) in its ATI Radeon™ HD 2000 series graphics processors. UVD is present in the ATI Mobility Radeon™ HD 2300, the ATI Radeon™ HD 2400, and the ATI Radeon™ HD 2600 series products, but is not present in the ATI Radeon™ HD 2900 series products as it is not needed due to the usage model of this high end product."
- ^ Huynh, Anh T. & Kubicki, Kristopher. Whoops, ATI Radeon HD 2900 XT Lacks UVD, DailyTech, May 25, 2007.
- ^ Falcon Northwest President Blog on 1GB GDDR4 2900 XT
- ^ Kowaliski, Cyril (2007-09-25). "AMD launches the $249 Radeon HD 2900 Pro". The Tech Report. Retrieved 2007-09-26.
- ^ a b AMD official press release
- ^ Beyond3D RV630 chip reference, retrieved September 25, 2007
- ^ Sapphire HD2K Product Matrix
- ^ Beyond3D report, retrieved September 13, 2007
- ^ Beyond3D RV610 chip reference, retrieved September 25, 2007
- ^ Fudzilla report, retrieved October 31, 2007
- ^ a b (in Spanish) MadboxPC thread, retrieved November 10, 2007
- ^ "HD 2950Pro (RV670) Cards & Specs Revealed". VR-Zone. August 22, 2007.
- ^ (in Spanish) MadboxPC coverage, retrieved November 10, 2007
- ^ Derek Wilson (May 14, 2007). "ATI Radeon HD 2900 XT: Calling a Spade a Spade". AnandTech. Retrieved 2007-11-01.
- ^ "RV670 is Radeon HD 3800 Series". VR-Zone. October 17, 2007.
- ^ Catalyst 7.10 release notes
- ^ DailyTech report
- ^ The Inquirer report