Nvidia Announces PCI-Express version of Tesla V100 accelerator


Nvidia has announced a PCI Express 'card' version of its Tesla V100 accelerator, to be released later this year. The unit carries 16GB of HBM2 memory, and its Volta GPU is fitted with 5120 shader processors.



The PCI-Express version of the Tesla V100 is intended as a server part for workloads such as deep learning, research and analytics. The card is clocked slightly lower than the original SXM2 module and comes with a TDP of 250 watts. Nvidia says the cards will be released 'later this year'.

Specifications of the PCIe form factor include:

  • 7 teraflops of double-precision performance, 14 teraflops of single-precision performance and 112 teraflops of Tensor Core (deep learning) performance with NVIDIA GPU BOOST™ technology
  • 16GB of CoWoS HBM2 stacked memory, delivering 900GB/sec of memory bandwidth
  • Support for PCIe Gen 3 interconnect (up to 32GB/sec bi-directional bandwidth)
  • 250 watts of power
                                   | Tesla V100 (SXM2)     | Tesla V100 (PCIe)     | Tesla P100 (SXM2)     | Tesla P100 (PCIe)
Architecture                       | Volta                 | Volta                 | Pascal                | Pascal
GPU                                | GV100 (815mm2)        | GV100 (815mm2)        | GP100 (610mm2)        | GP100 (610mm2)
Shader cores                       | 5120                  | 5120                  | 3584                  | 3584
Tensor cores                       | 640                   | 640                   | N/A                   | N/A
Core speed                         | ?                     | ?                     | 1328MHz               | ?
Boost clock                        | 1455MHz               | ~1370MHz              | 1480MHz               | 1300MHz
Memory speed                       | 1.75Gbps HBM2         | 1.75Gbps HBM2         | 1.4Gbps HBM2          | 1.4Gbps HBM2
Memory bus                         | 4096-bit              | 4096-bit              | 4096-bit              | 4096-bit
Memory bandwidth                   | 900GB/sec             | 900GB/sec             | 720GB/sec             | 720GB/sec
VRAM                               | 16GB                  | 16GB                  | 16GB                  | 16GB
L2 cache                           | 6MB                   | 6MB                   | 4MB                   | 4MB
Half precision                     | 30 TFLOPS             | 28 TFLOPS             | 21.2 TFLOPS           | 18.7 TFLOPS
Single precision                   | 15 TFLOPS             | 14 TFLOPS             | 10.6 TFLOPS           | 9.3 TFLOPS
Double precision                   | 7.5 TFLOPS (1/2 rate) | 7 TFLOPS (1/2 rate)   | 5.3 TFLOPS (1/2 rate) | 4.7 TFLOPS (1/2 rate)
Tensor performance (deep learning) | 120 TFLOPS            | 112 TFLOPS            | N/A                   | N/A
Transistors                        | 21 billion            | 21 billion            | 15.3 billion          | 15.3 billion
TDP                                | 300W                  | 250W                  | 300W                  | 250W
Form factor                        | Mezzanine (SXM2)      | PCIe                  | Mezzanine (SXM2)      | PCIe
Process                            | TSMC 12nm FFN         | TSMC 12nm FFN         | TSMC 16nm FinFET      | TSMC 16nm FinFET
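
The headline throughput and bandwidth figures follow directly from the raw specifications. The snippet below is a back-of-the-envelope sketch (not Nvidia code; the clock, core-count and memory figures are simply plugged in from the table above): an FMA counts as two floating-point operations, Volta runs FP16 at twice and FP64 at half the FP32 rate, and memory bandwidth is bus width times per-pin data rate.

```cpp
// Back-of-the-envelope check of the table above (illustrative only).
// Peak FP32 = shader cores x 2 FLOPs per FMA x boost clock.
// Bandwidth = (bus width / 8) x per-pin data rate.
#include <cstdio>

int main() {
    // Tesla V100 PCIe figures taken from the table above.
    const double shader_cores    = 5120;   // 80 SMs x 64 FP32 cores
    const double boost_clock_ghz = 1.370;  // ~1370MHz boost clock
    const double bus_width_bits  = 4096;   // HBM2 interface width
    const double mem_rate_gbps   = 1.75;   // per-pin data rate

    const double fp32_tflops = shader_cores * 2.0 * boost_clock_ghz / 1000.0;
    const double fp16_tflops = fp32_tflops * 2.0;  // Volta: FP16 at 2x FP32 rate
    const double fp64_tflops = fp32_tflops / 2.0;  // Volta: FP64 at 1/2 FP32 rate
    const double bandwidth_gbs = bus_width_bits / 8.0 * mem_rate_gbps;

    printf("FP32 %.1f TFLOPS, FP16 %.1f TFLOPS, FP64 %.1f TFLOPS\n",
           fp32_tflops, fp16_tflops, fp64_tflops);
    printf("Memory bandwidth: %.0f GB/s\n", bandwidth_gbs);
    return 0;
}
```

That works out to roughly 14, 28 and 7 TFLOPS and about 896GB/sec, which lines up with the figures quoted for the PCIe card.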

  
One of the biggest innovations of the V100 over the P100 is the all-new Tensor Cores: the GV100 GPU has 640 of them, eight per SM. Nvidia claims huge performance gains for applications that can make use of them. In regular FP32 and FP64 calculations the GV100 is roughly 1.5 times as fast as the GP100. NVIDIA Tesla V100 GPU accelerators for PCIe-based systems are expected to be available later this year from NVIDIA reseller partners and system manufacturers, including Hewlett Packard Enterprise (HPE).
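
As an illustration of what 'applications that can make use of them' means in practice: the Tensor Cores are exposed to developers through libraries such as cuBLAS and cuDNN, and directly through the warp-level WMMA API introduced in CUDA 9. The kernel below is a minimal, generic sketch (not Nvidia sample code): one warp computes a single 16x16 tile of D = A x B + C with FP16 inputs and FP32 accumulation, which is the mixed-precision operation a Tensor Core executes.

```cuda
// Minimal WMMA sketch (compile with: nvcc -arch=sm_70 wmma_sketch.cu).
// One warp multiplies one 16x16 FP16 tile and accumulates in FP32 on the Tensor Cores.
#include <cuda_runtime.h>
#include <cuda_fp16.h>
#include <mma.h>
using namespace nvcuda;

__global__ void tile_mma(const half *a, const half *b, float *d) {
    // Per-warp fragments for a 16x16x16 matrix multiply-accumulate.
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a_frag;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> b_frag;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> acc_frag;

    wmma::fill_fragment(acc_frag, 0.0f);                  // start with C = 0
    wmma::load_matrix_sync(a_frag, a, 16);                // leading dimension 16
    wmma::load_matrix_sync(b_frag, b, 16);
    wmma::mma_sync(acc_frag, a_frag, b_frag, acc_frag);   // executes on the Tensor Cores
    wmma::store_matrix_sync(d, acc_frag, 16, wmma::mem_row_major);
}

int main() {
    half *a, *b;
    float *d;
    cudaMalloc((void **)&a, 16 * 16 * sizeof(half));
    cudaMalloc((void **)&b, 16 * 16 * sizeof(half));
    cudaMalloc((void **)&d, 16 * 16 * sizeof(float));
    cudaMemset(a, 0, 16 * 16 * sizeof(half));             // inputs left at zero for brevity
    cudaMemset(b, 0, 16 * 16 * sizeof(half));

    tile_mma<<<1, 32>>>(a, b, d);                         // one warp handles one tile
    cudaDeviceSynchronize();

    cudaFree(a); cudaFree(b); cudaFree(d);
    return 0;
}
```

Real workloads tile much larger matrices across many warps, but the fragment/load/mma/store pattern is the same.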


