NVIDIA talks about Pascal - Will be fast and have 3D Stacked Memory


Over at GTC, NVIDIA CEO Jen-Hsun Huang forecast that the future 16nm Pascal GPU will be up to ten times faster than Maxwell in terms of deep learning performance. Pascal will offer up to 32GB of 3D-stacked memory, mixed-precision computing support and NVIDIA's NVLink high-speed interconnect when it launches in 2016.



NVIDIA’s Pascal GPU architecture, set to debut next year, will accelerate deep learning applications 10X beyond the speed of its current-generation Maxwell processors. Read that carefully: not 10x faster across the board, but 10x faster specifically in deep learning applications. And then there is the memory: up to 32GB of 3D-stacked memory (HBM), with roughly 750GB/s of bandwidth.

NVIDIA CEO and co-founder Jen-Hsun Huang revealed details of Pascal and the company’s updated processor roadmap in front of a crowd of 4,000 during his keynote address at the GPU Technology Conference, in Silicon Valley.

“It will benefit from a billion dollars worth of refinement because of R&D done over the last three years,” he told the audience.

The rise of deep learning – the process by which computers use neural networks to teach themselves – led NVIDIA to evolve the design of Pascal, which was originally announced at last year’s GTC. Pascal GPUs will have three key design features that will result in dramatically faster, more accurate training of richer deep neural networks – the human cortex-like data structures that serve as the foundation of deep learning research. Along with up to 32GB of memory — 2.7X more than the newly launched NVIDIA flagship, the GeForce GTX TITAN X — Pascal will feature mixed-precision computing. It will have 3D memory, resulting in up to a 5X improvement in deep learning applications. And it will feature NVLink – NVIDIA’s high-speed interconnect, which links together two or more GPUs – leading to a total 10X improvement in deep learning.

Mixed-Precision Computing – for Greater Accuracy
Mixed-precision computing enables Pascal architecture-based GPUs to compute at 16-bit floating point precision at twice the rate of 32-bit floating point. The increased floating point throughput particularly benefits classification and convolution – two key operations in deep learning – while still achieving the needed accuracy.
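The storage side of that trade-off can be sketched in a few lines of NumPy. This is only an illustration of why FP16 doubles throughput at a fixed bandwidth and of the usual store-low/accumulate-high pattern, not NVIDIA's implementation:

```python
import numpy as np

# FP16 values take half the storage of FP32, so twice as many of them
# can be moved and processed per cycle at a given memory bandwidth.
weights_fp32 = np.ones(1_000_000, dtype=np.float32)
weights_fp16 = weights_fp32.astype(np.float16)

print(weights_fp32.nbytes)  # 4000000 bytes
print(weights_fp16.nbytes)  # 2000000 bytes

# A common mixed-precision pattern: store values in FP16 but
# accumulate in FP32 to keep rounding error under control.
total = weights_fp16.sum(dtype=np.float32)
print(total)  # 1000000.0
```

Keeping the accumulator in FP32 matters here: summing a million FP16 values in FP16 itself would overflow and lose precision long before the end.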

3D Memory – for Faster Communication Speed and Power Efficiency
Memory bandwidth constraints limit the speed at which data can be delivered to the GPU. The introduction of 3D memory will provide 3X the bandwidth and nearly 3X the frame buffer capacity of Maxwell. This will let developers build even larger neural networks and accelerate the bandwidth-intensive portions of deep learning training. Pascal will have its memory chips stacked on top of each other, and placed adjacent to the GPU, rather than further down the processor boards. This reduces from inches to millimeters the distance that bits need to travel as they traverse from memory to GPU and back. The result is dramatically accelerated communication and improved power efficiency.

NVLink – for Faster Data Movement
The addition of NVLink to Pascal will let data move between GPUs and CPUs five to 12 times faster than it can with today’s standard, PCI-Express. This greatly benefits applications, such as deep learning, that have high inter-GPU communication needs.

NVLink allows for double the number of GPUs in a system to work together in deep learning computations. In addition, CPUs and GPUs can connect in new ways to enable more flexibility and energy efficiency in server design compared to PCI-E.
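To put the quoted 5x-12x range in concrete terms, assume the commonly cited rule-of-thumb figure of roughly 16GB/s per direction for a PCI-Express 3.0 x16 link (that baseline is an assumption on our part, not a number from NVIDIA's keynote):

```python
# PCIe 3.0 x16 moves roughly 16 GB/s per direction (rule-of-thumb
# figure, assumed here; not stated in the article).
pcie3_x16_gbs = 16.0
low_factor, high_factor = 5, 12  # speedup range quoted for NVLink

print(pcie3_x16_gbs * low_factor)   # 80.0 GB/s at the low end
print(pcie3_x16_gbs * high_factor)  # 192.0 GB/s at the high end
```

Under that assumption, NVLink would land somewhere between roughly 80 and 192 GB/s per link pair, which is why it matters most for workloads that shuttle gradients between GPUs constantly.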
