Architecture and Specifications
AMD has been respinning some of its GPUs quite a bit; the oldest one in use is now actually four years old. I am happy to report that, starting with the 400 series, the products will all be based on new chips, all fabbed at 14nm FinFET+. For Polaris, AMD is utilizing the 14nm FinFET-based process technology of Samsung and Global Foundries, which is the densest foundry process available. FinFET transistors are crucial to reducing power consumption and enabling operating voltages that are 150mV lower than the previous generation, thereby cutting active power by 30% from a 1V baseline.
For the last five years, graphics processors have relied on 28nm high-k/metal gate nodes. We are now moving towards 14nm FinFET. Basically, the GPU is based on 36 shader clusters with 64 shader processors each, which makes a nice 2304 shader processors in total. AMD claims that this is the fully enabled chip, i.e. there are no hidden or deactivated shader cores in there. The shader clusters are tied to 32 ROPs. As for TMUs, AMD has always used a 4:1 ratio, meaning 36 CUs x 4 = 144 texture units, which is a good number. Memory sits on a 256-bit wide bus spread over 64-bit controllers. Polaris 10 (XT) runs its memory at an effective 7000 MHz on the 4 GB model and 8000 MHz on the 8 GB model.
|Radeon||R9 Fury X||R9 Nano||R9 390X||RX 480||GeForce GTX 970|
|GPU||Fiji||Fiji||Hawaii / Grenada||Polaris 10 / Ellesmere||Maxwell|
|Graphics memory||4 GB HBM||4 GB HBM||8 GB GDDR5||4GB / 8GB GDDR5||3.5 GB GDDR5|
|Memory Clock||up-to 500 MHz / 1.0 Gbps||up-to 500 MHz / 1.0 Gbps||6.0 Gbps||7.0/8.0 Gbps||7.0 Gbps|
|GPU Clock Max||1050 MHz||up-to 1000 MHz||1050 MHz||1266 MHz||1178 MHz|
|Memory Bandwidth||up-to 512 GB/s||up-to 512 GB/s||384 GB/s||224 GB/s (4GB) / 256 GB/s (8GB)|
|Power Connectors||2 x 8-pin||1 x 8-pin||1 x 6-pin + 1 x 8-pin||1 x 6-pin||2 x 6-pin|
|Form Factor||Dual slot||Dual slot||Dual slot||Dual slot||Dual slot|
|DirectX 12 Support||Yes||Yes||Yes||Yes||Yes|
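The unit counts and bandwidth numbers quoted above follow from simple arithmetic, so here is a minimal Python sketch to check them. The 64 shaders per cluster and the 4:1 TMU ratio come from the text; applying the same ratios to the RX 470 and RX 460 is an assumption, since those specs are not finalized. The function names are made up for illustration.

```python
# Sanity check of the unit counts and bandwidth figures quoted in the article.
# The per-CU ratios come from the text; applying them to the RX 470 / RX 460
# is an assumption, as those specs are not final.

SHADERS_PER_CU = 64  # shader processors per shader cluster (CU)
TMUS_PER_CU = 4      # AMD's 4:1 texture unit ratio


def compute_units(shader_processors: int) -> int:
    """Number of shader clusters implied by a shader processor count."""
    return shader_processors // SHADERS_PER_CU


def texture_units(shader_processors: int) -> int:
    """Texture units implied by the 4:1 TMU-per-CU ratio."""
    return compute_units(shader_processors) * TMUS_PER_CU


def memory_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak bandwidth in GB/s: bus width (bits) x per-pin rate (Gbps) / 8."""
    return bus_width_bits * data_rate_gbps / 8


if __name__ == "__main__":
    for name, shaders in [("RX 480", 2304), ("RX 470", 2048), ("RX 460", 896)]:
        print(name, compute_units(shaders), "CUs,", texture_units(shaders), "TMUs")
    print("RX 480 4GB:", memory_bandwidth_gb_s(256, 7.0), "GB/s")
    print("RX 480 8GB:", memory_bandwidth_gb_s(256, 8.0), "GB/s")
```

For the RX 480 this reproduces the 144 texture units and the 224 GB/s and 256 GB/s entries in the table.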
Though today's release is about the Radeon RX 480, there will be two more products added to the product stack soon as well. As such, AMD today is actually announcing three products:
- AMD Radeon RX 480 4GB and 8GB - High-end performance with 2304 shader processors running up-to 1266 MHz / 256-bit Memory bus and 7 Gbps for the 4GB model and 8 Gbps GDDR5 for the 8GB model.
- AMD Radeon RX 470 4GB - Mainstream performance with 2048 shader processors / 256-bit Memory bus and 7 Gbps GDDR5.
- AMD Radeon RX 460 2GB/4GB - Entry level performance with 896 shader processors / GDDR5.
Let me chart up a more detailed overview of the specs. Some of the RX 470/460 specs are not yet finalized, though the math brings me to the proper TMU count and so on.
Though GCN is now at its fourth iteration, we feel that the GPU architecture has remained very similar and is comparable to the last-generation products. Hence the die-shrink to 14nm is where the biggest advantages are to be found, along with support for more features in line with HDMI 2.0 and DP 1.4.
WattMan is AMD's new overclocking utility that controls GPU voltage, engine clocks, memory clocks, fan speed and temperature. Radeon WattMan is based on Radeon Software features but offers multiple new, precise overclocking controls. With the new control over voltage and the per-state frequency curve for GPU clocks, comprehensive tuning is now available. The extra benefit lies in the new ability to finely tune the exact experience for your games. Using the new histogram, which records and displays GPU activity, clock speeds, temperature and fan speed, you can visualize and understand how the game or application runs in a single interface, and configure based on that captured data. This complements the per-profile overclocking feature introduced at the initial Radeon Software Crimson Edition launch, where each detected game can have its own overclocked profile. On launch, that profile's overclocking settings take effect, and on close they revert to the global defaults, which the user can also set.

The simple GPU Clock control is enabled by default and allows adjustments to be made using the curve set by AMD engineers as the generic optimal set for each GPU state, for the average experience expected for the GPU. The adjustments allow this curve to be raised to a higher clock level in 0.5% intervals, or lowered to a lower level. Complemented by the per-game profile, this is a simple way to get extra performance or power savings with an easy-to-use control.

With the fan control, the minimum fan speed, target fan speed and min acoustic limit can now be set. Minimum is the absolute minimum speed the fan can run at. Target is the maximum speed the fan will run at as long as the temperature is not above target. Min Acoustic Limit is the clock limit/threshold where acoustics can be violated.

With the temperature control, the maximum and target temperature can now be set. Along with the power limit, this allows further customization than before. Max temperature is the absolute maximum temperature before the system clocks are reduced to cool down the GPU. Target is the temperature above which the fan speed is raised to cool down the GPU. Power limit boosts or reduces the power sent to the GPU; it can be increased or reduced by +/- 50% (this is specific to Polaris 10 XT). This control is a great place to start optimizing performance if you have a better thermal/board design, as it increases DPM residency without changing the clock curve.

Current peak and average GPU activity, temperature, fan, and engine/memory clock speeds can be captured and viewed using the Global WattMan and Profile WattMan pages. Data capture begins and ends when users navigate to and from the WattMan pages, respectively, and varies based on the current graphics workload.
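To get a feel for what the 0.5% curve steps and the +/- 50% power limit mean in numbers, here is a small Python sketch. It is not WattMan's actual logic or API. The per-state clocks (apart from the 1266 MHz top state) and the 100 W base power are made-up illustration values, and the function names are hypothetical.

```python
# Illustration only: how a 0.5% step and a +/- 50% power limit scale numbers.
# This does not model WattMan's internals; all state clocks except the
# 1266 MHz top state, and the base power figure, are hypothetical.

STEP_PERCENT = 0.5          # curve adjustment interval from the article
MAX_POWER_ADJUST = 50       # +/- 50% power limit range (Polaris 10 XT)


def scale_curve(state_clocks_mhz, steps):
    """Raise (steps > 0) or lower (steps < 0) every state by 0.5% per step."""
    factor = 1 + steps * STEP_PERCENT / 100
    return [round(clock * factor) for clock in state_clocks_mhz]


def adjust_power_limit(base_watts, percent):
    """Apply a power limit offset, rejecting values outside +/- 50%."""
    if abs(percent) > MAX_POWER_ADJUST:
        raise ValueError("power limit offset must be within +/- 50%")
    return base_watts * (1 + percent / 100)


if __name__ == "__main__":
    hypothetical_states = [300, 600, 900, 1100, 1200, 1266]
    print(scale_curve(hypothetical_states, 4))    # four steps = +2%
    print(scale_curve(hypothetical_states, -2))   # two steps down = -1%
    print(adjust_power_limit(100.0, 50))          # hypothetical 100 W base
```

Four steps up is a 2% raise across the whole curve, so the small interval gives fairly fine-grained control over the top clock.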
Improved Color Compression
The Radeon RX 480 uses GDDR5 memory. The 8GB model actually has some really good stuff at 8000 MHz (effective), the 4GB model runs at 7000 MHz (effective), and both sit on a 256-bit wide memory bus. Well, you can never have too much bandwidth, so AMD applied some more tricks, color compression being one of them. The GPU's compression pipeline has a number of different algorithms that intelligently determine the most efficient way to compress the data. One of the most important algorithms is delta color compression. With delta color compression, the GPU calculates the differences between pixels in a block and stores the block as a set of reference pixels plus the delta values from the reference. If the deltas are small, then only a few bits per pixel are needed. If the packed result of reference values plus delta values is less than half the uncompressed storage size, then delta color compression succeeds and the data is stored at half size (2:1 compression). The Polaris GPUs include a significantly enhanced delta color compression capability:
- 2:1 compression has been enhanced to be effective more often
- A new 4:1 delta color compression mode has been added to cover cases where the per pixel deltas are very small and are possible to pack into ¼ of the original storage
- A new 8:1 delta color compression mode combines 4:1 constant color compression of 2x2 pixel blocks with 2:1 compression of the deltas between those blocks
Combining that memory bandwidth with the new advancements in color compression, Polaris GPUs can claim even more effective bandwidth. Up to the 300 series, the GPU could handle 2:1 color compression ratios; newly added are 4:1 and 8:1 delta color compression.
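The delta idea described above can be sketched in a few lines of Python. This is a simplified illustration, not AMD's hardware algorithm: it assumes a single 8-bit channel, one reference pixel per block, a fixed bit width for all deltas, and no metadata overhead. It only models the 2:1 and 4:1 modes; the 8:1 mode, which combines constant color compression of 2x2 blocks with delta compression, is left out. All function names are made up.

```python
# Simplified delta color compression sketch (not AMD's actual algorithm).
# Assumptions: one 8-bit channel, the first pixel is the reference, every
# delta uses the same bit width, and metadata/header bits are ignored.


def signed_bits(value: int) -> int:
    """Bits needed to store value as a two's complement integer."""
    if value >= 0:
        return value.bit_length() + 1
    return (-value - 1).bit_length() + 1


def compress_block(pixels, bits_per_pixel=8):
    """Return (ratio, reference, deltas); ratio 1 means stored uncompressed."""
    reference = pixels[0]
    deltas = [p - reference for p in pixels[1:]]
    delta_bits = max((signed_bits(d) for d in deltas), default=0)
    raw_bits = len(pixels) * bits_per_pixel
    packed_bits = bits_per_pixel + len(deltas) * delta_bits
    # Try the strongest mode first: the packed data must fit the target size.
    for ratio in (4, 2):
        if packed_bits <= raw_bits // ratio:
            return ratio, reference, deltas
    return 1, reference, deltas


def decompress_block(reference, deltas):
    """Rebuild the original pixels from the reference and its deltas."""
    return [reference] + [reference + d for d in deltas]


if __name__ == "__main__":
    flat = [100] * 16                             # uniform color block
    gradient = [100 + (i % 3) for i in range(16)]  # small per-pixel deltas
    noisy = [0, 255] * 8                          # large deltas
    for name, block in [("flat", flat), ("gradient", gradient), ("noisy", noisy)]:
        ratio, ref, deltas = compress_block(block)
        assert decompress_block(ref, deltas) == block  # lossless round trip
        print(name, f"{ratio}:1")
```

In this toy model the uniform block packs into a quarter of its storage, the gentle gradient into half, and the noisy block is stored uncompressed, which mirrors why the technique pays off mostly on smooth areas of a frame.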
Polaris-generation and newer products will receive a nice upgrade in terms of monitor connectivity. First off, the cards will get three DisplayPort connectors, one HDMI connector and an optional DVI connector. The days of ultra-high-resolution displays are here and AMD is adapting to them. The HDMI connector is HDMI 2.0 revision b, which enables:
- Transmission of High Dynamic Range (HDR) video
- Bandwidth up to 18 Gbps
- 4K@50/60 (2160p), which is 4 times the clarity of 1080p/60 video resolution
- Up to 32 audio channels for a multi-dimensional immersive audio experience
On the DisplayPort side, compatibility has shifted upwards to DP 1.4, which provides 8.1 Gbps of bandwidth per lane and offers better color support using Display Stream Compression (DSC), a "visually lossless" form of compression that VESA says "enables up to 3:1 compression ratio." DisplayPort 1.4 can drive 60 Hz 8K displays and 120 Hz 4K displays with HDR "deep color." DP 1.4 also supports:
- Forward Error Correction: FEC, which overlays the DSC 1.2 transport, addresses the transport error resiliency needed for compressed video transport to external displays.
- HDR meta transport: HDR meta transport uses the “secondary data packet” transport inherent in the DisplayPort standard to provide support for the current CTA 861.3 standard, which is useful for DP to HDMI 2.0a protocol conversion, among other examples. It also offers a flexible metadata packet transport to support future dynamic HDR standards.
- Expanded audio transport: This spec extension covers capabilities such as 32 audio channels, 1536kHz sample rate, and inclusion of all known audio formats.
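A rough Python calculation shows why DSC matters for the 8K claim. It takes the 8.1 Gbps per-lane figure and the 3:1 DSC ratio quoted above, and assumes a four-lane link with 8b/10b line coding. Blanking intervals are ignored, so real-world margins are tighter than these numbers suggest. The function names are hypothetical.

```python
# Rough DisplayPort 1.4 bandwidth estimate; blanking overhead is ignored.
# Assumptions (not from the article): 4 lanes per link, 8b/10b line coding.

LANES = 4
LANE_RATE_GBPS = 8.1        # per-lane rate quoted in the article
CODING_EFFICIENCY = 8 / 10  # 8 payload bits per 10 bits on the wire


def link_payload_gbps() -> float:
    """Usable link bandwidth after line coding overhead."""
    return LANES * LANE_RATE_GBPS * CODING_EFFICIENCY


def video_rate_gbps(width, height, refresh_hz, bits_per_pixel) -> float:
    """Raw pixel data rate of a video mode, without blanking."""
    return width * height * refresh_hz * bits_per_pixel / 1e9


def fits(width, height, refresh_hz, bits_per_pixel, dsc_ratio=1.0) -> bool:
    """True if the (optionally DSC-compressed) stream fits on the link."""
    needed = video_rate_gbps(width, height, refresh_hz, bits_per_pixel) / dsc_ratio
    return needed <= link_payload_gbps()


if __name__ == "__main__":
    print(f"link payload: {link_payload_gbps():.2f} Gbps")
    print("4K 120 Hz, 24 bpp:", fits(3840, 2160, 120, 24))
    print("8K 60 Hz, 24 bpp:", fits(7680, 4320, 60, 24))
    print("8K 60 Hz, 24 bpp, 3:1 DSC:", fits(7680, 4320, 60, 24, dsc_ratio=3.0))
```

Under these assumptions, 4K at 120 Hz squeezes onto the link uncompressed at 24 bits per pixel, while 8K at 60 Hz only fits once DSC is applied.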
AMD Polaris can fully support HDR and deep color all the way. HDR is becoming a big thing, especially for movie aficionados. Think better pixels, a wider color space, more contrast and more interesting content on that screen of yours. We've seen some demos on HDR screens, and it is pretty darn impressive to be honest. This year you will see the first HDR-compatible Ultra HD TVs, and next year likely monitors and games supporting it properly. HDR is the buzzword for 2016. Ultra HD Blu-ray, released in Q1 2016, brings a much-welcomed feature: HDR. HDR will increase the strength of light in terms of brightness. High-dynamic-range rendering (HDRR or HDR rendering), also known as high-dynamic-range lighting, is the rendering of computer graphics scenes using lighting calculations done in a larger dynamic range. This allows preservation of details that may otherwise be lost due to limiting contrast ratios. Video games and computer-generated movies and special effects benefit from this, as it creates more realistic scenes than the more simplistic lighting models used previously.
With HDR you should remember three things: bright things can be really bright, dark things can be really dark, and details can be seen in both. High dynamic range will reproduce a greater range of luminosity than is possible with standard digital imaging. We measure this in nits, and the number of nits for UHD screens and monitors is going up. What's a nit? A candle's brightness measured over one square meter is 1 nit, the sun is 16,000,000,000 nits, typical objects have 1~250 nits, current PC displays have 1 to 250 nits, and excellent HDTVs have 350 to 400 nits. An HDR OLED screen is capable of 500 nits, and here is where it gets more important: new screens in 2016 will go to 1,000 nits. HDR allows high nit values to be used. We think HDR will be implemented in 2016 for PC gaming; Hollywood already has end-to-end content ready, of course. As consumers start to demand higher-quality monitors, HDR technology is emerging to set an excitingly high bar for overall display quality. HDR panels are characterized by:
- Brightness between 600-1200 cd/m2 of luminance, with an industry goal to reach 2,000
- Contrast ratios that closely mirror human visual sensitivity to contrast (SMPTE 2084)
- The Rec.2020 color gamut, which can produce over 1 billion colors at 10 bits per color
HDR can represent a greater range of luminance levels than can be achieved using more "traditional" methods, such as many real-world scenes ranging from very bright direct sunlight to extreme shade, or very faint nebulae. HDR displays can be designed with the deep black depth of OLED (black is zero, the pixel is disabled), or the vivid brightness of local dimming LCD. Meanwhile, if you cannot wait to play games in HDR and did purchase an HDR HDTV this year, you could stream it.
A selection of Ultra HD TVs is already available, and consumer monitors are expected to reach the market in late 2016 and 2017. The future will also bring HDR-based games. Such displays will offer unrivaled color accuracy, saturation, brightness, and black depth. In short, they will come very close to simulating the real world.
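The "over 1 billion colors at 10 bits per color" figure is easy to verify. The short Python sketch below assumes three color channels (red, green, blue) and compares 10-bit with the usual 8-bit output; the function name is made up for illustration.

```python
# Number of representable colors for a given bit depth per channel.
# Assumption: three channels (R, G, B), each with the same bit depth.


def color_count(bits_per_channel: int, channels: int = 3) -> int:
    """Distinct colors: 2 raised to the total number of bits per pixel."""
    return 2 ** (bits_per_channel * channels)


if __name__ == "__main__":
    print(f"8-bit:  {color_count(8):,} colors")
    print(f"10-bit: {color_count(10):,} colors")
    print(f"10-bit offers {color_count(10) // color_count(8)}x more colors")
```

Two extra bits per channel multiply the palette by 64, which is where the jump from roughly 16.7 million to just over 1 billion colors comes from.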