Nvidia Announces PCI-Express version of Tesla V100 accelerator

I can't wait for a GeForce variant 😀
Nice!
• 7 teraflops double-precision performance, 14 teraflops single-precision performance and 112 teraflops half-precision performance with NVIDIA GPU BOOST™ technology
• 16GB of CoWoS HBM2 stacked memory, delivering 900GB/sec of memory bandwidth
• Support for PCIe Gen 3 interconnect (up to 32GB/sec bi-directional bandwidth)
• 250 watts of power
112 Tflops of 16bit performance? Really Nvidia? That can't be real!?!
Nvidia continues its mega streak after the beautifully managed Pascal lineup! Can't wait for Volta GeForce!
I can't wait for a GeForce variant 😀
There won't be one. There will be a Quadro V100, like the Quadro P100. It would play games too.
112 Tflops of 16bit performance? Really Nvidia? That can't be real!?!
Tensor cores.
There won't be one. There will be a Quadro V100, like the Quadro P100. It would play games too.
Okay.. GV102 then. Think we know what I meant 😉
I can't wait for a GeForce variant 😀
If it can clock to 2GHz like Pascal it will be ~21 TFLOPS 🤓
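For anyone checking that figure, the back-of-the-envelope math is just cores × 2 FLOPs per FMA × clock. The 5120 CUDA core count is from the published GV100 specs; the 2.0 GHz clock is purely the hypothetical from the post above, not anything Nvidia has announced:

```python
# Rough FP32 throughput estimate for a hypothetical 2 GHz GV100.
cuda_cores = 5120          # published GV100 CUDA core count
flops_per_fma = 2          # one fused multiply-add = 2 FLOPs
clock_hz = 2.0e9           # hypothetical Pascal-like boost clock

tflops = cuda_cores * flops_per_fma * clock_hz / 1e12
print(f"{tflops:.1f} TFLOPS")  # 20.5 TFLOPS, i.e. the ~21 figure
```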
112 Tflops of 16bit performance? Really Nvidia? That can't be real!?!
Yeah that's the ASIC "Tensor core" throughput - so that's for specific workloads to do with machine learning - it's a bit of a bullsh*tty marketing technique to make it sound way better than it really is, but then again, for the target audience, that's all that's needed. The actual FP16 throughput is 28 TFLOPS.
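To make the two FP16 numbers concrete: the 112 TFLOPS is the tensor-core matrix-FMA rate (640 tensor cores, each doing a 4×4×4 matrix FMA, i.e. 128 FLOPs, per clock), while plain FP16 on the CUDA cores runs at 2× the FP32 rate. Both work out from the ~1.37 GHz boost clock of the PCIe V100; the per-unit op counts below are from Nvidia's V100 material:

```python
# Where the 112 vs 28 TFLOPS FP16 figures come from.
clock_hz = 1.37e9  # PCIe Tesla V100 boost clock (~1370 MHz)

# Tensor cores: 640 units x 128 FLOPs (a 4x4x4 matrix FMA) per clock.
tensor_tflops = 640 * 128 * clock_hz / 1e12

# Packed FP16 on the CUDA cores: 5120 cores x 2 FLOPs/FMA x 2-wide.
fp16_tflops = 5120 * 2 * 2 * clock_hz / 1e12

print(f"tensor: {tensor_tflops:.0f} TFLOPS, fp16: {fp16_tflops:.0f} TFLOPS")
# tensor: 112 TFLOPS, fp16: 28 TFLOPS
```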
Yeah that's the ASIC "Tensor core" throughput - so that's for specific workloads to do with machine learning - it's a bit of a bullsh*tty marketing technique to make it sound way better than it really is, but then again, for the target audience, that's all that's needed. The actual FP16 throughput is 28 TFLOPS.
If it actually works then it's not a BS marketing ploy. Wonder how your post would have been worded had this been a Vega FE news article?
If it actually works then it's not a BS marketing ploy. Wonder how your post would have been worded had this been a Vega FE news article?
Are you trying to suggest I'm being biased? Come on Loophole, don't start anything like that.... It's just that the wording isn't really accurate: it doesn't do 112 TFLOPS of 16-bit for everything, just for application-specific workloads (Tensor), but then that's what it's built for: machine learning.
Are you trying to suggest I'm being biased? Come on Loophole, don't start anything like that.... It's just that the wording isn't really accurate: it doesn't do 112 TFLOPS of 16-bit for everything, just for application-specific workloads (Tensor), but then that's what it's built for: machine learning.
Just read the article a bit closer; the bullet point is wrong anyway, the 112 TFLOPS is for the tensor cores only. Still, calling it marketing BS is salty AF. Isn't AMD trying to break into deep learning now too? I wonder if, with their hopefully increased revenue from Ryzen and the fact that Vega seems to be a hit with OEMs, they will do something similar with their own version of a tensor core?
If it can clock to 2ghz like pascal it will be ~21 tflops 🤓
I'm more interested in seeing the power draw and die size of the GeForce version.
Just read the article a bit closer; the bullet point is wrong anyway, the 112 TFLOPS is for the tensor cores only. Still, calling it marketing BS is salty AF. Isn't AMD trying to break into deep learning now too? I wonder if, with their hopefully increased revenue from Ryzen and the fact that Vega seems to be a hit with OEMs, they will do something similar with their own version of a tensor core? I'm more interested in seeing the power draw and die size of the GeForce version.
I just don't like any form of misrepresentation. The bullet point should say that the tensor core throughput is 112 TFLOPS; as written it looks like it's describing general compute performance. Kinda like broadband ISPs advertising "up to" (for the sake of argument) 100Mbit when the actual speed is only 10% of that because of limitations detailed in the small print. (Unless it's the article/news source that got it wrong?) The big businesses investing in this technology know exactly what they're getting though, so it's not really an issue; this technology isn't for a general consumer pleb like me and others around here.
If it can clock to 2GHz like Pascal it will be ~21 TFLOPS 🤓
Hilarious! I did the exact same thing you must have done after reading this article and came up with the same 21 TFLOPS figure you did... The cool thing, though, is that at 12nm the GeForce variant (GV102) may clock even higher than 2GHz... especially considering that the GeForce card won't have to waste power on tensor cores...