Pascal GPU Architecture
The Pascal-Based GPUs
The GeForce GTX 1070 and 1080 graphics cards are based on the latest iteration of NVIDIA's GPU architecture, called Pascal (named after the famous mathematician Blaise Pascal). Both cards use revision A1 of the GP104 GPU, albeit in slightly different configurations.
- Pascal Architecture - The GeForce GTX 1080's Pascal architecture is the most efficient GPU design ever built. Comprised of 7.2 billion transistors and including 2,560 single-precision CUDA Cores, the GeForce GTX 1080 is the world's fastest GPU. With an intense focus on craftsmanship in chip and board design, NVIDIA's engineering team achieved unprecedented results in frequency of operation and energy efficiency.
- 16 nm FinFET - The GeForce GTX 1080's GP104 GPU is fabricated using a new 16 nm FinFET manufacturing process that allows the chip to be built with more transistors, ultimately enabling new GPU features, higher performance, and improved power efficiency.
- GDDR5X Memory - GDDR5X provides a significant memory bandwidth improvement over the GDDR5 memory that was used previously in NVIDIA's flagship GeForce GTX GPUs. Running at a data rate of 10 Gbps, the GeForce GTX 1080's 256-bit memory interface provides 43% more memory bandwidth than NVIDIA's prior GeForce GTX 980 GPU. Combined with architectural improvements in memory compression, the total effective memory bandwidth increase compared to GTX 980 is 1.7x.
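As a quick sanity check on those bandwidth figures, peak memory bandwidth is simply the per-pin data rate times the bus width. A back-of-the-envelope sketch (the 7 Gbps GDDR5 rate is the GTX 980's stock memory speed):

```python
# Peak bandwidth (GB/s) = data rate (Gbps per pin) * bus width (bits) / 8
def peak_bandwidth_gbs(data_rate_gbps, bus_width_bits):
    """Peak memory bandwidth in GB/s."""
    return data_rate_gbps * bus_width_bits / 8

gtx_1080 = peak_bandwidth_gbs(10.0, 256)  # GDDR5X @ 10 Gbps
gtx_980 = peak_bandwidth_gbs(7.0, 256)    # GDDR5  @ 7 Gbps

print(gtx_1080)            # 320.0 GB/s
print(gtx_980)             # 224.0 GB/s
print(gtx_1080 / gtx_980)  # ~1.43 -> the quoted "43% more"
```

The remaining gap to the quoted 1.7x effective figure comes from the improved memory compression, which NVIDIA credits with roughly another 1.2x on top of the raw bandwidth gain.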
The rectangular GP104 die was measured at close to 15.35 mm x 19.18 mm (NVIDIA quotes a 314 mm² die area) and sits on a 37.5 × 37.5 mm BGA package, housing a transistor count of well over 7 billion. Pascal GPUs are fabbed by the Taiwan Semiconductor Manufacturing Company (TSMC).
GeForce GTX 1080
The GeForce GTX 1080: the rumors have been correct, including the shader processor count of 2,560 for the 1080. This product is pretty wicked, as it can manage really high clock frequencies while sticking to a TDP of just 180 Watts. The Nvidia GeForce GTX 1080 is based on a GP104-A1 GPU holding 7.2 billion transistors (16 nm FinFET). The GeForce GTX 1080 is the card that comes fitted with Micron's new GDDR5X memory, a proper 8 GB of it. The card has a base clock of 1.61 GHz and a boost clock of 1.73 GHz, and that's just the Founders/reference design. This edition is capable of roughly 9 TFLOPS of single-precision performance. For comparison, a reference design GeForce GTX 980 pushes 4.6 TFLOPS and a 980 Ti can push 6.4 TFLOPS. Overclocks beyond 2 GHz on the boost frequency are possible; in fact, even at default clocks (depending on load and title) we've seen clock frequencies of roughly 1,850 MHz. The memory base frequency is 2,500 MHz on a 256-bit wide memory bus, but GDDR5X reads that as 5,000 MHz, and doubling it up again (double data rate) yields an effective data rate of 10 Gbps, and thus an effective speed of 10 GHz. Nvidia will sell its reference card (named 'Founders Edition') at 699 USD, while board partner cards will start at 599 USD for the most basic models.
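The clock chain and the headline TFLOPS figure above can be reproduced with simple arithmetic (a sketch using the usual FMA-counts-as-two-ops convention, not an official NVIDIA formula):

```python
# Memory: 2,500 MHz base clock, doubled for GDDR5X's wider prefetch,
# doubled again for double-data-rate signaling -> 10 Gbps per pin.
base_mhz = 2500
effective_gbps = base_mhz * 2 * 2 / 1000
print(effective_gbps)  # 10.0

# Compute: single-precision FLOPS = cores * 2 (an FMA counts as 2 ops) * clock.
cuda_cores = 2560
boost_ghz = 1.733
tflops = cuda_cores * 2 * boost_ghz / 1000
print(round(tflops, 1))  # ~8.9, i.e. the "roughly 9 TFLOPS" headline figure
```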
The GeForce GTX 1070 is the more affordable product, based on a very similar GP104 Pascal GPU. This card comes with "regular" GDDR5 memory, so its peak memory data rate is lower (8 Gbps versus the 1080's 10 Gbps), and it has a smaller number of active shader processors. In the Founders Edition configuration (the reference design) it still offers 6.5 TFLOPS of single-precision performance.
The GeForce GTX 1080 comes with 2,560 shader (CUDA) cores, while its little brother, the GeForce GTX 1070, gets 1,920 shader processors at its disposal. The difference in shader count is among the biggest differentiators, together with ROP count, TMU count and the memory tied to each card.
- GeForce GTX 970 has 1,664 shader processors and 4 GB of GDDR5 graphics memory.
- GeForce GTX 980 has 2,048 shader processors and 4 GB of GDDR5 graphics memory.
- GeForce GTX 1070 has 1,920 shader processors and 8 GB of GDDR5 graphics memory.
- GeForce GTX 1080 has 2,560 shader processors and 8 GB of GDDR5X graphics memory.
The product is obviously PCI-Express 3.0 compatible and has a maximum TDP of around 180 Watts, with a typical idle power draw of 5 to 10 Watts. That TDP is an overall maximum; on average your GPU will not consume that amount of power, so during gaming the average draw will be lower. Both Founders Edition cards run cool and silent enough.
Using Both GDDR5 & GDDR5X
You will have noticed the two memory types already. Another interesting development: slowly but steadily, graphics card manufacturers want to move to HBM, stacked High Bandwidth Memory that they can place on-package (close to the GPU die). HBM revision 1, however, is limited to four stacks of 1 GB, so if it were used you would only see 4 GB graphics cards. HBM2 can go to 8 GB and 16 GB, but that production process is just not yet ready and/or affordable enough for volume production. With HBM2 expensive and supply-limited, it's simply not the right time to make the move; Big Pascal, whenever it releases to consumers in, say, some sort of Titan or Ti edition, will get HBM2 memory, 16 GB of it spread over 4 stacks. But we do not see Big Pascal (the Ti or Titan equivalent for Pascal) launching any sooner than Christmas or even Q1 of 2017.

So with HBM/HBM2 out of the running, basically two solutions are left: go with traditional GDDR5 memory, or make use of GDDR5X, let's call that turbo GDDR5. Nvidia in fact opted for both: the GeForce GTX 1070 is fitted with "regular" GDDR5 memory, but to give the GTX 1080 a little extra bite in bandwidth, it gets Micron's all-new GDDR5X memory. So yes, the GP104 GPU can be tied to both memory types. The 1080 tied to GDDR5X DRAM is rather interesting: you can look at GDDR5X memory chips as normal GDDR5, except that instead of delivering 32 bytes per access to the memory cells, this is doubled up to 64 bytes per access. That in theory can double graphics card memory bandwidth, and Pascal certainly likes large quantities of memory bandwidth to do its thing. Nvidia states it as 256-bit GDDR5X @ 10 Gbps (an effective data rate).
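The 32-byte versus 64-byte access figures above follow directly from the prefetch depth of each memory type: GDDR5 uses an 8n prefetch and GDDR5X a 16n prefetch, each over a 32-bit chip interface. A minimal sketch:

```python
# Access granularity per memory chip = prefetch depth * chip I/O width.
# GDDR5 uses an 8n prefetch, GDDR5X a 16n prefetch, both over 32-bit I/O.
def access_bytes(prefetch_n, io_bits=32):
    """Bytes fetched per memory access for a single DRAM chip."""
    return prefetch_n * io_bits // 8

print(access_bytes(8))   # 32 bytes per access (GDDR5)
print(access_bytes(16))  # 64 bytes per access (GDDR5X)
```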
Nvidia's Pascal generation products receive a nice upgrade in terms of monitor connectivity. First off, the cards get three DisplayPort connectors, one HDMI connector and a DVI connector. The days of Ultra High resolution displays are here, and Nvidia is adapting to them. The HDMI connector is revision HDMI 2.0b, which enables:
- Transmission of High Dynamic Range (HDR) video
- Bandwidth up to 18 Gbps
- 4K@50/60 (2160p), which is 4 times the clarity of 1080p/60 video resolution
- Up to 32 audio channels for a multi-dimensional immersive audio experience
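The 18 Gbps figure lines up with what 4K60 actually requires on the wire. A back-of-the-envelope sketch using the standard CTA-861 4K60 timing (total raster including blanking) and HDMI's TMDS signaling, which carries each 8-bit value as 10 bits:

```python
# CTA-861 4K60 timing: 4400 x 2250 total pixels (incl. blanking) at 60 Hz.
total_h, total_v, fps = 4400, 2250, 60
pixel_clock_mhz = total_h * total_v * fps / 1e6
print(pixel_clock_mhz)  # 594.0 MHz

# TMDS carries 8 bits per color as 10 bits, over 3 data channels.
link_gbps = pixel_clock_mhz * 3 * 10 / 1000
print(link_gbps)  # 17.82 Gbps -> just inside the 18 Gbps budget
```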
DisplayPort-wise, compatibility has shifted upwards to DP 1.4, which provides 8.1 Gbps of bandwidth per lane and offers better color support using Display Stream Compression (DSC), a "visually lossless" form of compression that VESA says "enables up to 3:1 compression ratio." DisplayPort 1.4 can drive 60 Hz 8K displays and 120 Hz 4K displays with HDR "deep color." DP 1.4 also supports:
- Forward Error Correction: FEC, which overlays the DSC 1.2 transport, addresses the transport error resiliency needed for compressed video transport to external displays.
- HDR meta transport: HDR meta transport uses the “secondary data packet” transport inherent in the DisplayPort standard to provide support for the current CTA 861.3 standard, which is useful for DP to HDMI 2.0a protocol conversion, among other examples. It also offers a flexible metadata packet transport to support future dynamic HDR standards.
- Expanded audio transport: This spec extension covers capabilities such as 32 audio channels, a 1,536 kHz sample rate, and inclusion of all known audio formats.
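The 8K60-with-HDR claim is worth checking against the link budget. A rough sketch (blanking intervals ignored for brevity; DP's HBR3 rate uses 8b/10b line coding, so payload is 80% of the raw rate):

```python
# DP 1.4 link budget: 4 lanes at 8.1 Gbps (HBR3), 8b/10b coded.
lanes, lane_gbps = 4, 8.1
payload_gbps = lanes * lane_gbps * 8 / 10
print(payload_gbps)  # 25.92 Gbps usable

# 8K at 60 Hz with 10 bits per color (RGB), blanking ignored:
pixels_per_s = 7680 * 4320 * 60
raw_gbps = pixels_per_s * 30 / 1e9
print(round(raw_gbps, 1))  # ~59.7 Gbps -> far too much uncompressed

print(round(raw_gbps / 3, 1))  # ~19.9 Gbps with 3:1 DSC -> it fits
```

This is why DSC is not optional garnish here: without the compression step, 8K60 HDR simply does not fit through four HBR3 lanes.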
High Dynamic Range (HDR) Display Compatibility
Nvidia can now obviously support HDR and deep color all the way. HDR is becoming a big thing, especially for the movie aficionados. Think better pixels, a wider color space, more contrast and more interesting content on that screen of yours. We've seen some demos on HDR screens, and it is pretty darn impressive to be honest. This year you will see the first HDR compatible Ultra HD TVs, and next year likely monitors and games supporting it properly. HDR is the buzzword for 2016; with Ultra HD Blu-ray released in Q1 2016, HDR will be a much welcomed feature.

HDR increases the range of light in terms of brightness. High-dynamic-range rendering (HDRR or HDR rendering), also known as high-dynamic-range lighting, is the rendering of computer graphics scenes using lighting calculations done in a larger dynamic range. This allows preservation of details that may be lost due to limited contrast ratios. Video games, computer-generated movies and special effects all benefit from this, as it creates more realistic scenes than the more simplistic lighting models used previously. With HDR you should remember three things: bright things can be really bright, dark things can be really dark, and details can be seen in both.

High dynamic range reproduces a greater range of luminosity than is possible with standard digital imaging. We measure this in nits, and the number of nits for UHD screens and monitors is going up. What's a nit? The brightness of a candle measured over one meter is roughly 1 nit, the sun is around 1,600,000,000 nits, while typical objects and current PC displays sit at 1 to 250 nits, and excellent HDTVs reach 350 to 400 nits. An HDR OLED screen is capable of 500 nits, and from here it gets more important: new screens in 2016 will go to 1,000 nits. HDR allows these high nit values to actually be used. We think HDR will be implemented for PC gaming in 2017; Hollywood, of course, already has end-to-end HDR content ready.
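Those nit figures translate into dynamic range, commonly expressed in photographic "stops" (each stop is a doubling of luminance). A small sketch; the black levels used here are illustrative assumptions, not measured values:

```python
import math

# Dynamic range in stops = log2(peak luminance / black level).
def stops(peak_nits, black_nits):
    """Display dynamic range in photographic stops."""
    return math.log2(peak_nits / black_nits)

# Assumed black levels: ~0.5 nit for a typical SDR LCD,
# ~0.05 nit for a local-dimming HDR LCD.
print(round(stops(250, 0.5), 1))    # ~9.0 stops (typical SDR display)
print(round(stops(1000, 0.05), 1))  # ~14.3 stops (1,000-nit HDR display)
```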
As consumers start to demand higher-quality monitors, HDR technology is emerging to set an excitingly high bar for overall display quality. HDR panels are characterized by:
- Brightness between 600 and 1,200 cd/m² of luminance, with an industry goal to reach 2,000
- Contrast ratios that closely mirror human visual sensitivity to contrast (SMPTE 2084)
- The Rec. 2020 color gamut, which can produce over 1 billion colors at 10 bits per color

HDR can represent a greater range of luminance levels than can be achieved using more "traditional" methods, covering real-world scenes that range from very bright, direct sunlight to extreme shade, or very faint nebulae. HDR displays can be designed with the deep black depth of OLED (black is zero, the pixel is disabled) or the vivid brightness of local-dimming LCD.

Meanwhile, if you cannot wait to play games in HDR and did purchase an HDR HDTV this year, you could stream it: an HDR game rendered on your PC with a Pascal GPU can be streamed to your Nvidia Shield Android TV, which then connects over HDMI to that HDR telly, as Pascal supports 10-bit HEVC HDR encoding and the Shield Android TV can decode it. Hey, just sayin'. A selection of Ultra HDTVs is already available, and consumer monitors are expected to reach the market in 2017. Such displays will offer unrivaled color accuracy, saturation, brightness and black depth; in short, they will come very close to simulating the real world.
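Since SMPTE 2084 keeps coming up: it defines the PQ (perceptual quantizer) transfer curve used by HDR10, which maps absolute luminance up to 10,000 nits onto a perceptually uniform signal. A minimal sketch of its inverse EOTF, with the constants taken from the standard:

```python
# SMPTE ST 2084 (PQ) inverse EOTF: absolute luminance -> 0..1 signal.
# Constants as defined in the standard.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32

def pq_encode(nits):
    """Encode absolute luminance (cd/m², max 10,000) to a PQ signal value."""
    y = nits / 10000.0
    return ((C1 + C2 * y**M1) / (1 + C3 * y**M1)) ** M2

print(round(pq_encode(10000), 2))  # 1.0 (top of the PQ range)
print(round(pq_encode(100), 2))    # ~0.51 (SDR reference white, ~100 nits)
```

Note how roughly half the signal range is spent below 100 nits: PQ allocates code values where human vision is most sensitive, which is exactly why HDR can show detail in both the dark and the bright ends of a scene.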