Pascal GPU Architecture
The New Pascal Based GPUs
The GeForce GTX 1000 series graphics cards are based on the latest iteration of Nvidia's GPU architecture, called Pascal (named after the famous mathematician). The Titan Xp uses revision A1 of the GP102 GPU.
- Pascal Architecture - The Nvidia Titan Xp Pascal architecture is the most powerful GPU design ever built. Comprised of 12 billion transistors and including 3840 single-precision CUDA cores, the card is the world's fastest consumer GPU.
- 16nm FinFET - The GPU is fabricated using a 16nm FinFET manufacturing process that allows the chip to be built with more transistors, ultimately enabling new GPU features, higher performance, and improved power efficiency.
- GDDR5X Memory - GDDR5X provides a significant memory bandwidth improvement over the GDDR5 memory that was used previously in NVIDIA's flagship GeForce GTX GPUs. Running at a data rate of 11.4 Gbps, the Titan Xp's 384-bit memory interface provides considerably more memory bandwidth than NVIDIA's prior GeForce GTX 980 GPU. Combined with architectural improvements in memory compression, the total effective memory bandwidth increase compared to GTX 980 is 1.8x.
The rectangular die of the GP102 was measured at close to 471 mm² in a BGA package which houses a transistor-count of well over 12 billion. Pascal GPUs are fabbed by the Taiwan Semiconductor Manufacturing Company (TSMC) in the 16nm node.
NVIDIA Titan Xp
Alright, we are stepping back to reference material for a second here. The Nvidia Titan X (Pascal) has a shader processor count of 3,584; the Titan Xp raises that to 3,840. This product is pretty slick, as it can sustain really high clock frequencies whilst sticking to a 250 Watt TDP. The Titan Xp is the card that comes fitted with fast 11.4 Gbps GDDR5X memory, and a proper 12 GB of it as well. The reference cards have a base clock of 1.4 GHz with a boost clock of 1.6 GHz.
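As a rough sanity check on how shader count and clock speed translate into throughput, peak single-precision performance is commonly estimated as two floating-point operations (one fused multiply-add) per CUDA core per clock. The sketch below applies that rule of thumb to the 3840 cores and the rounded reference clocks quoted in this article; the function name and the factor of 2 are assumptions of this example, not figures from Nvidia's documentation, and real-world throughput will be lower.

```python
def peak_fp32_tflops(cuda_cores: int, clock_ghz: float,
                     flops_per_core_per_clock: int = 2) -> float:
    """Theoretical peak FP32 throughput in TFLOPS.

    Assumes each core retires one fused multiply-add (2 FLOPs) per clock,
    which is the usual rule of thumb and an upper bound, not a measurement.
    """
    gflops = cuda_cores * flops_per_core_per_clock * clock_ghz
    return gflops / 1000.0


# Values taken from the article: 3840 cores, 1.4 GHz base, 1.6 GHz boost.
base = peak_fp32_tflops(3840, 1.4)
boost = peak_fp32_tflops(3840, 1.6)
print(f"base: {base:.2f} TFLOPS, boost: {boost:.2f} TFLOPS")
```

At the rounded 1.6 GHz boost clock the estimate lands a little above 12 TFLOPS, which is in line with the headline figure for the card.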
The Titan Xp is capable of achieving roughly 12 TFLOPS of single-precision performance, whereas a GTX 1060 does 4.6 TFLOPS. To compare it a little, a reference design GeForce GTX 980 pushes 4.6 TFLOPS and a 980 Ti can push 6.4 TFLOPS. The shader count is among the biggest differences, together with the ROP count, the TMU count, and the memory tied to it.
The product is obviously PCI-Express 3.0 compatible. It has a max TDP of around 250 Watts with a typical idle power draw of 5 to 10 Watts. That TDP is an overall maximum; on average your GPU will not consume that amount of power, so during gaming the average will be lower. The Founders Edition cards run cool and silent enough.
You will have noticed the two memory types used in the 1060/1070/1080 range already, which can be a bit confusing. What was interesting to see was another development: slowly but steadily, graphics card manufacturers want to move to HBM, stacked High Bandwidth Memory that they can place on-package (close to the GPU die). HBM revision 1, however, is limited to four stacks of 1 GB, thus if used you'd only see 4 GB graphics cards. HBM2 can go to 8 GB and 16 GB, however that production process is just not yet ready and/or affordable enough for volume production.
So the Titan Xp is tied to GDDR5X DRAM memory. You can look at GDDR5X memory chips as your normal GDDR5 memory; however, as opposed to delivering 32 bytes per access to the memory cells, this is doubled to 64 bytes per access. That, in theory, could double graphics card memory bandwidth, and Pascal certainly likes large quantities of memory bandwidth to do its thing in. Nvidia states it to be 384-bit GDDR5X @ 11.4 Gbps (which is an effective data rate).
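The bus width and effective data rate quoted above are enough to work out the theoretical peak memory bandwidth: multiply the number of data pins by the per-pin rate and divide by eight to convert bits to bytes. The following is a minimal sketch of that arithmetic; the function name is made up for this example, and the result ignores memory compression and any real-world overhead.

```python
def memory_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    bus_width_bits: width of the memory interface (one bit per data pin).
    data_rate_gbps: effective per-pin data rate in Gbit/s.
    """
    return bus_width_bits * data_rate_gbps / 8.0


# Titan Xp figures from the article: 384-bit interface at 11.4 Gbps.
titan_xp = memory_bandwidth_gb_s(384, 11.4)
print(f"Titan Xp: {titan_xp:.1f} GB/s")
```

With the article's numbers this works out to roughly 547 GB/s.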
Display Connectivity
Nvidia's Pascal generation products will receive a nice upgrade in terms of monitor connectivity. First off, the cards will get three DisplayPort connectors, one HDMI connector and a DVI connector. The days of Ultra High-resolution displays are here, and Nvidia is adapting to them. The HDMI connector is HDMI 2.0 revision b, which enables:
- Transmission of High Dynamic Range (HDR) video
- Bandwidth up to 18 Gbps
- 4K@50/60 (2160p), which is 4 times the clarity of 1080p/60 video resolution
- Up to 32 audio channels for a multi-dimensional immersive audio experience
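To see why 18 Gbps is the figure that matters for 4K at 60 Hz, the sketch below estimates the TMDS link rate for that mode. It assumes the standard CTA-861 timing for 3840x2160 at 60 Hz (4400 x 2250 total pixels including blanking), 8-bit RGB, and HDMI's three TMDS data channels carrying 10 bits on the wire per 8-bit value. None of these timing details appear in the article, so treat them as assumptions of this example; the function names are hypothetical.

```python
def tmds_link_rate_gbps(h_total: int, v_total: int, refresh_hz: int,
                        channels: int = 3, wire_bits_per_symbol: int = 10) -> float:
    """Aggregate TMDS bit rate in Gbit/s for 8-bit-per-component video.

    h_total / v_total include blanking, so their product times the refresh
    rate is the pixel clock. Each of the three channels sends one 10-bit
    symbol per pixel clock.
    """
    pixel_clock_hz = h_total * v_total * refresh_hz
    return pixel_clock_hz * channels * wire_bits_per_symbol / 1e9


def pixel_ratio(w1: int, h1: int, w2: int, h2: int) -> float:
    """How many times more pixels a w1 x h1 frame has than a w2 x h2 frame."""
    return (w1 * h1) / (w2 * h2)


# Assumed CTA-861 totals for 3840x2160 @ 60 Hz: 4400 x 2250.
rate_4k60 = tmds_link_rate_gbps(4400, 2250, 60)
print(f"4K60 link rate: {rate_4k60:.2f} Gbps (limit: 18 Gbps)")
print(f"2160p vs 1080p pixel count: {pixel_ratio(3840, 2160, 1920, 1080):.0f}x")
```

Under these assumptions 4K60 with 8-bit color sits just below the 18 Gbps ceiling, which is why higher bit depths at that resolution typically rely on chroma subsampling.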
On the DisplayPort side, compatibility has moved up to DP 1.4, which provides 8.1 Gbps of bandwidth per lane and offers better color support using Display Stream Compression (DSC), a "visually lossless" form of compression that VESA says "enables up to 3:1 compression ratio." DisplayPort 1.4 can drive 60 Hz 8K displays and 120 Hz 4K displays with HDR "deep color." DP 1.4 also supports:
- Forward Error Correction: FEC, which overlays the DSC 1.2 transport, addresses the transport error resiliency needed for compressed video transport to external displays.
- HDR meta transport: HDR meta transport uses the “secondary data packet” transport inherent in the DisplayPort standard to provide support for the current CTA 861.3 standard, which is useful for DP to HDMI 2.0a protocol conversion, among other examples. It also offers a flexible metadata packet transport to support future dynamic HDR standards.
- Expanded audio transport: This spec extension covers capabilities such as 32 audio channels, 1536kHz sample rate, and the inclusion of all known audio formats.
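The 8K at 60 Hz and 4K at 120 Hz claims for DP 1.4 can be checked with some back-of-the-envelope arithmetic. The sketch below assumes a four-lane main link and 8b/10b line coding (so 80% of the raw rate carries payload), neither of which is stated in the article, and it counts active pixels only, ignoring blanking and protocol overhead. The helper names are made up for illustration, and the results are optimistic estimates rather than a compliance check.

```python
LANES = 4                     # assumed: full four-lane DisplayPort main link
GBPS_PER_LANE = 8.1           # per-lane rate quoted in the article
CODING_EFFICIENCY = 8 / 10    # assumed: 8b/10b line coding


def link_payload_gbps() -> float:
    """Usable video payload of the link in Gbit/s after line coding."""
    return LANES * GBPS_PER_LANE * CODING_EFFICIENCY


def video_rate_gbps(width: int, height: int, refresh_hz: int,
                    bits_per_pixel: int) -> float:
    """Uncompressed rate of the active pixels only (blanking ignored)."""
    return width * height * refresh_hz * bits_per_pixel / 1e9


def fits(width: int, height: int, refresh_hz: int, bits_per_pixel: int,
         dsc_ratio: float = 1.0) -> bool:
    """True if the mode fits the link, optionally with DSC at dsc_ratio:1."""
    needed = video_rate_gbps(width, height, refresh_hz, bits_per_pixel) / dsc_ratio
    return needed <= link_payload_gbps()


print(f"link payload: {link_payload_gbps():.2f} Gbps")
print("4K120, 24 bpp, no DSC :", fits(3840, 2160, 120, 24))
print("4K120, 30 bpp, no DSC :", fits(3840, 2160, 120, 30))
print("4K120, 30 bpp, DSC 3:1:", fits(3840, 2160, 120, 30, dsc_ratio=3.0))
print("8K60,  30 bpp, no DSC :", fits(7680, 4320, 60, 30))
print("8K60,  30 bpp, DSC 3:1:", fits(7680, 4320, 60, 30, dsc_ratio=3.0))
```

Under these assumptions, 8K at 60 Hz with 10-bit color only fits once DSC is applied, which matches the article's point that DSC is what makes those modes possible.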