Pascal GPU Architecture
The Pascal-Based GPUs
The GeForce GTX 1070, 1070 Ti and 1080 graphics cards are based on the latest iteration of Nvidia's GPU architecture, called Pascal (named after the famous mathematician). All three cards use revision A1 of the GP104 GPU, each in a slightly different configuration. The rectangular GP104 die was measured at close to 15.35 mm x 19.18 mm; the 314 mm² die sits in a 37.5 × 37.5 mm BGA package and houses well over 7 billion transistors. Pascal GPUs are fabbed by the Taiwan Semiconductor Manufacturing Company (TSMC).
GeForce GTX 1070 Ti
The rumors about the GeForce GTX 1070 Ti were correct, including the shader processor count of 2,432. The GPU manages fairly high clock frequencies while sticking to a 180 Watt TDP. The Nvidia GeForce GTX 1070 Ti is based on a GP104-300-A1 GPU, which holds 7.2 billion FinFET transistors, and comes fitted with 8 GB of GDDR5 memory. The card has a base clock of 1.60 GHz and a boost clock of 1.68 GHz, which makes it capable of roughly 8 TFLOPS of single-precision performance. To compare with the previous generation, a reference design GeForce GTX 980 would push 4.6 TFLOPS and a 980 Ti 6.4 TFLOPS. Overclocks to roughly 2 GHz on the boost frequency are also possible. In fact, at default clocks (depending on load and title) we've seen clock frequencies of ~1,850 MHz.
The memory base frequency is 2,000 MHz on a 256-bit wide memory bus, but being GDDR5 (four data transfers per base clock cycle) that reads as 8,000 MHz. That means an effective data rate of 8 Gbps per pin, often quoted as an effective speed of 8 GHz.
Its little brother, the GeForce GTX 1070, has a very similar Pascal-based GP104 GPU and is the more affordable product. This card also comes with "regular" GDDR5 memory, so its memory bandwidth is lower than that of the GDDR5X-equipped GeForce GTX 1080. It has a smaller number of active shader processors, yet in the Founders Edition configuration (the reference design) it still offers 6.5 TFLOPS of single-precision performance. The GeForce GTX 1080 comes with 2,560 shader (CUDA) cores, while the GeForce GTX 1070 has 1,920 shader processors. For comparison:
- GeForce GTX 970 has 1,664 shader processors and 4 GB of GDDR5 graphics memory.
- GeForce GTX 980 has 2,048 shader processors and 4 GB of GDDR5 graphics memory.
- GeForce GTX 1070 has 1,920 shader processors and 8 GB of GDDR5 graphics memory.
- GeForce GTX 1070 Ti has 2,432 shader processors and 8 GB of GDDR5 graphics memory.
- GeForce GTX 1080 has 2,560 shader processors and 8 GB of GDDR5X graphics memory.
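As a rough sanity check on the TFLOPS figures quoted above, peak single-precision throughput can be estimated as shader processors × 2 × clock, assuming each shader processor retires one fused multiply-add (two floating-point operations) per clock. The sketch below is our own illustration rather than anything published by Nvidia: the function name `peak_tflops` is hypothetical, the GTX 1070 Ti clocks are the ones mentioned in the text, and the 1,683 MHz boost clock used for the GTX 1070 is an assumption that this article does not state.

```python
# Assumption: one fused multiply-add (2 floating-point ops) per shader per clock.
FLOPS_PER_SHADER_PER_CLOCK = 2


def peak_tflops(shader_processors: int, clock_mhz: float) -> float:
    """Theoretical peak single-precision throughput in TFLOPS."""
    flops = shader_processors * FLOPS_PER_SHADER_PER_CLOCK * clock_mhz * 1e6
    return flops / 1e12


# GTX 1070 Ti: 2,432 shaders at the clocks mentioned in the text.
for label, mhz in [("rated boost", 1680), ("observed", 1850), ("overclocked", 2000)]:
    print(f"GTX 1070 Ti @ {mhz} MHz ({label}): {peak_tflops(2432, mhz):.2f} TFLOPS")

# GTX 1070: 1,920 shaders; the 1,683 MHz boost clock is an assumption.
print(f"GTX 1070 @ 1683 MHz: {peak_tflops(1920, 1683):.2f} TFLOPS")
```

These are theoretical peaks only; the numbers say nothing about how well a given game keeps the shader processors busy.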
The product is obviously PCI-Express 3.0 compatible and has a maximum TDP of around 180 Watts, with a typical idle power draw of 5 to 10 Watts. That TDP is an overall maximum; on average your GPU will not consume that much power, so during gaming the average will be lower. Both Founders Edition cards run cool and quiet enough.
Using Both GDDR5 & GDDR5X
You will have noticed that two memory types are used. Another interesting development is that graphics card manufacturers slowly but steadily want to move to HBM: stacked High Bandwidth Memory that they can place on-package, close to the GPU die. HBM revision 1, however, is limited to four stacks of 1 GB, so if used you'd only see 4 GB graphics cards. HBM2 can go to 8 GB and 16 GB, but that production process is just not yet ready and/or affordable enough for volume production. With HBM2 being expensive and in limited supply, it’s simply not the right time to make the move. Big Pascal, whenever it releases to the consumer in, say, some sort of Titan or Ti edition, will get HBM2 memory, 16 GB of it spread over 4 stacks. But we do not see Big Pascal (the Ti or Titan equivalent for Pascal) launching any sooner than Christmas or even Q1 of 2017.
So with HBM/HBM2 out of the running, there are basically two solutions left: go with traditional GDDR5 memory or make use of GDDR5X, let’s call that turbo GDDR5. Nvidia, in fact, opted for both. The GeForce GTX 1070 is fitted with "regular" GDDR5 memory, but to give the GTX 1080 a little extra bite in bandwidth they fit it with Micron's all-new GDDR5X memory. So yes, the GP104 GPU can be tied to both memory types.
The 1080 tied to GDDR5X DRAM is rather interesting. You can look at GDDR5X memory chips as normal GDDR5 memory; however, as opposed to delivering 32 bytes per access to the memory cells, this is doubled to 64 bytes per access. That, in theory, could double graphics card memory bandwidth, and Pascal certainly likes large quantities of memory bandwidth to do its thing. Nvidia states it to be 256-bit GDDR5X @ 10 Gbps (which is an effective data rate).
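To put the two memory types side by side, peak memory bandwidth can be estimated as bus width × per-pin data rate ÷ 8 bits per byte. The following is a minimal sketch under that assumption; the function name `memory_bandwidth_gb_s` is hypothetical, the 10 Gbps figure is the one Nvidia states for GDDR5X, and the 8 Gbps figure is the GDDR5 data rate quoted for the GTX 1070 Ti.

```python
def memory_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: every pin moves data_rate_gbps gigabits per second."""
    return bus_width_bits * data_rate_gbps / 8  # 8 bits per byte


gddr5 = memory_bandwidth_gb_s(256, 8)    # 256-bit GDDR5 at 8 Gbps
gddr5x = memory_bandwidth_gb_s(256, 10)  # 256-bit GDDR5X at 10 Gbps

print(f"GDDR5  @ 8 Gbps:  {gddr5:.0f} GB/s")
print(f"GDDR5X @ 10 Gbps: {gddr5x:.0f} GB/s")
print(f"GDDR5X advantage: {(gddr5x / gddr5 - 1) * 100:.0f}%")
```

At these data rates the practical gain is well short of the theoretical doubling that the 64 bytes per access figure suggests, because the per-pin data rate only rises from 8 to 10 Gbps.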
Nvidia's Pascal generation products receive a nice upgrade in terms of monitor connectivity. First off, the cards get three DisplayPort connectors, one HDMI connector, and a DVI connector. The days of ultra-high-resolution displays are here, and Nvidia is adapting to them. The HDMI connector is HDMI 2.0 revision b, which enables:
- Transmission of High Dynamic Range (HDR) video
- Bandwidth up to 18 Gbps
- 4K@50/60 (2160p), which is 4 times the clarity of 1080p/60 video resolution
- Up to 32 audio channels for a multi-dimensional immersive audio experience
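The 18 Gbps figure and the "4 times 1080p" claim can both be checked with a little arithmetic. The sketch below is an illustration under stated assumptions, not part of the HDMI specification text: it assumes the commonly used 4K60 timing of 4,400 × 2,250 total pixels per frame (active picture plus blanking), 8 bits per color component, and TMDS encoding that sends 10 bits on the wire for every 8 bits of data across three data channels. The function names are hypothetical.

```python
def pixel_count_ratio(width_a: int, height_a: int, width_b: int, height_b: int) -> float:
    """How many times more pixels resolution A has than resolution B."""
    return (width_a * height_a) / (width_b * height_b)


def tmds_link_rate_gbps(h_total: int, v_total: int, refresh_hz: int,
                        channels: int = 3, wire_bits_per_symbol: int = 10) -> float:
    """Total TMDS bit rate in Gbps for 8-bit color, blanking included."""
    pixel_clock_hz = h_total * v_total * refresh_hz
    return pixel_clock_hz * channels * wire_bits_per_symbol / 1e9


print(f"2160p vs 1080p pixel count: {pixel_count_ratio(3840, 2160, 1920, 1080):.0f}x")
# Assumed 4K60 timing: 4400 x 2250 total pixels per frame.
print(f"4K60 TMDS link rate: {tmds_link_rate_gbps(4400, 2250, 60):.2f} Gbps")
```

Under these assumptions 4K at 60 Hz lands just below the 18 Gbps ceiling, which is why that resolution and refresh rate is the headline mode for HDMI 2.0.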
DisplayPort-wise, compatibility has shifted upwards to DP 1.4, which provides 8.1 Gbps of bandwidth per lane and offers better color support using Display Stream Compression (DSC), a "visually lossless" form of compression that VESA says "enables up to 3:1 compression ratio." DisplayPort 1.4 can drive 60 Hz 8K displays and 120 Hz 4K displays with HDR "deep color." DP 1.4 also supports:
- Forward Error Correction: FEC, which overlays the DSC 1.2 transport, addresses the transport error resiliency needed for compressed video transport to external displays.
- HDR meta transport: HDR meta transport uses the “secondary data packet” transport inherent in the DisplayPort standard to provide support for the current CTA 861.3 standard, which is useful for DP to HDMI 2.0a protocol conversion, among other examples. It also offers a flexible metadata packet transport to support future dynamic HDR standards.
- Expanded audio transport: This spec extension covers capabilities such as 32 audio channels, 1,536 kHz sample rate, and the inclusion of all known audio formats.
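To see why DSC matters for the 8K and high-refresh 4K modes mentioned above, the per-lane figure can be turned into a rough link budget. This is a simplified sketch, not a timing calculator: it assumes a four-lane link, 8b/10b channel coding (so 80% of the raw rate carries payload), and it ignores blanking intervals and audio, so real requirements are somewhat higher. All names in the code are hypothetical.

```python
LANES = 4                       # assumption: full four-lane DisplayPort link
LANE_RATE_GBPS = 8.1            # per-lane rate quoted in the text
ENCODING_EFFICIENCY = 8 / 10    # assumption: 8b/10b channel coding


def payload_gbps() -> float:
    """Usable link bandwidth after channel coding overhead."""
    return LANES * LANE_RATE_GBPS * ENCODING_EFFICIENCY


def video_rate_gbps(width: int, height: int, refresh_hz: int, bits_per_pixel: int) -> float:
    """Uncompressed active-pixel data rate (blanking ignored)."""
    return width * height * refresh_hz * bits_per_pixel / 1e9


def fits(width: int, height: int, refresh_hz: int, bits_per_pixel: int,
         dsc_ratio: float = 1.0) -> bool:
    """True if the mode fits the link, optionally with DSC at the given ratio."""
    return video_rate_gbps(width, height, refresh_hz, bits_per_pixel) / dsc_ratio <= payload_gbps()


print(f"Usable link bandwidth: {payload_gbps():.2f} Gbps")
for name, mode in [("8K60, 24 bpp", (7680, 4320, 60, 24)),
                   ("4K120, 30 bpp (HDR)", (3840, 2160, 120, 30))]:
    print(f"{name}: {video_rate_gbps(*mode):.2f} Gbps, "
          f"uncompressed fits: {fits(*mode)}, with 3:1 DSC fits: {fits(*mode, dsc_ratio=3.0)}")
```

Under these assumptions neither mode fits the link uncompressed, while both fit comfortably once the 3:1 compression ratio quoted by VESA is applied.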