AMD Greenland Vega10 Silicon To Have 4096 Stream Processors?
Fox2232
This year we do not need more SPs/ROPs/TMUs, as 14/16nm should bring higher clocks.
The question is how much higher. For AMD, sitting at around 950~1050MHz stock and 1100~1200MHz OC, it may be quite a jump.
For NVIDIA, which likes to print a low MHz number such as 1040~1070MHz on the box while its GPUs boost to 1350~1450MHz without any user OC, this jump may be just a small step up.
Noisiv
Fox2232
Fox2232
Noisiv
Fox2232
Noisiv
Fox2232
Ieldra
A few things need pointing out here: perf/W != perf. It doesn't scale linearly. A 250mm^2 GPU can be the perf/W king, but that same GPU scaled up to 600mm^2 could be terribly inefficient.
As for the clocks, I'm frankly confused. AMD has historically gone for larger designs with more cores, while NV has gone with higher clocks.
1600 SPs at 1500MHz == 2400 SPs at 1000MHz, simple arithmetic, guys.
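A quick sketch of that arithmetic, assuming theoretical peak throughput is simply stream processors x clock x 2 FLOPs per clock (one fused multiply-add per SP per cycle). The function name `peak_gflops` is only illustrative, and real-world performance depends on much more than this product:

```python
def peak_gflops(stream_processors: int, clock_mhz: float, flops_per_clock: int = 2) -> float:
    """Theoretical peak FP32 throughput in GFLOPS.

    Assumption: every stream processor retires one FMA (2 FLOPs) per clock.
    """
    return stream_processors * clock_mhz * flops_per_clock / 1000.0


# The two configurations from the post: narrow and fast vs. wide and slow.
narrow_fast = peak_gflops(1600, 1500)
wide_slow = peak_gflops(2400, 1000)

print(f"1600 SPs @ 1500MHz: {narrow_fast:.0f} GFLOPS")
print(f"2400 SPs @ 1000MHz: {wide_slow:.0f} GFLOPS")
```

On paper both configurations come out identical, which is the whole point: the width-versus-clock trade only shows up in die area and power, not in the peak number.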
My favorite performance metric would have to be TFLOPS/mm^2.
The relationship between power consumption and clock speed is most certainly not linear.
I'm struggling to understand what you're saying. The Nano is marketed as a power-efficient part SPECIFICALLY because power draw does not scale linearly with clock speed, and it throttles heavily to remain within its TDP.
Why would you ever level the clock playing field to compare two processors? That makes no sense at all. I buy a GPU that runs at 50% higher clocks than its competition, so you compare it to its competitor at 66% of its max? That's just silly. Very, very silly.
Imagine we did that with the FX-9590 and compared it to Haswell clock for clock - that'll be good for a laugh.
Fox2232
PrMinisterGR
AMD is probably investing in smaller chips this time around. My guess is that they are going to take the Maxwell approach with GCN. I hope the good parts of the architecture are not lost in translation.
In my uneducated opinion, GCN on the hardware side is ROP/TMU limited. I wouldn't mind seeing a Fiji-like configuration with 128 ROPs.
As for people whining about performance/watt and saying it only matters for mobile etc.: if a chip is "cold" as a design, you can clock it much higher than a "hot" chip, cram more hardware in, or do a combination of both. Hawaii was a hot chip, and that was a problem for AMD until games started shipping with more GCN-tuned engines and AMD themselves improved the drivers.
What I find curious is that the lower-end parts will supposedly use GDDR5X. I might be wrong, but I have the feeling that ALL the high-end AMD parts will use HBM, and the lower end of the scale will go from that to GDDR5. Combined with the recent Micron news, and with NVIDIA being the only company talking about GDDR5X, I can't see AMD using it.
Also, if the memory bus is indeed 256-bit for a 2,300+ shader product, I would be:
[spoiler]http://i.imgur.com/dcHVdb7.gif[/spoiler]
Fox2232
Ieldra
Noisiv
Fox2232
Here you have part of it.
Unfortunately not. They fit the same number of transistors into a 3.7 times smaller area, and that same number of transistors will draw 1.96 times more power at maximum clock than those 28nm transistors ticking at their maximum official clock.
Let's take your values (ignoring any errors in transistor density, power consumption per transistor per clock, or voltage):
28nm: 1B transistors in 100mm^2 (@1GHz) consume 10W
14nm: 1B transistors in 27mm^2 (@2.4GHz) consume 19.6W
Now that is power efficient: 2.4x higher clock and only 1.96x higher power consumption. (22% more power efficient at peak clock; at any lower clock it is more efficient still.)
If you wanted those 4B transistors:
14nm: 4B transistors in 108mm^2 (@2.4GHz) consume 78.4W
The bad thing about this die shrink: more heat in a smaller area.
The good thing about this die shrink is that they promise high clocks (I add: "As long as you can cool it.").
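For anyone who wants to check the numbers, here is a small sketch using only the figures quoted above. The `Chip` class and its method names are made up for illustration, and "efficiency" here just means clock per watt for the same transistor count, which is a big simplification:

```python
from dataclasses import dataclass


@dataclass
class Chip:
    """Hypothetical 1B-transistor block, using the values from the post."""
    name: str
    area_mm2: float
    clock_ghz: float
    power_w: float

    def power_density(self) -> float:
        """Watts dissipated per square millimetre of die."""
        return self.power_w / self.area_mm2

    def clock_per_watt(self) -> float:
        """GHz delivered per watt; a crude stand-in for efficiency."""
        return self.clock_ghz / self.power_w


old = Chip("28nm", area_mm2=100.0, clock_ghz=1.0, power_w=10.0)
new = Chip("14nm", area_mm2=27.0, clock_ghz=2.4, power_w=19.6)

# 2.4x the clock for 1.96x the power.
efficiency_gain = new.clock_per_watt() / old.clock_per_watt()
# Same transistors, far less area to pull the heat out of.
density_ratio = new.power_density() / old.power_density()

print(f"Efficiency gain at peak clock: {(efficiency_gain - 1) * 100:.0f}%")
print(f"Power density increase: {density_ratio:.2f}x")

# Scaling to 4B transistors multiplies area and power by four.
print(f"4B transistors: {4 * new.area_mm2:.0f}mm^2, {4 * new.power_w:.1f}W")
```

The efficiency gain comes out at roughly 22%, but the power density is about seven times higher, which is the "more heat in a smaller area" problem expressed as a number.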
Ieldra
PrMinisterGR
Ieldra
waltc3
One "stream processor" does not equal another ;) Most likely the stream processors in Vega will be more capable than those in Fiji, so we are talking oranges and apples here (not to mention talking about unreleased products).
The real challenge, to me, is how they are going to keep the processing pipes full enough to take advantage of the incredible bandwidth afforded by HBM. It doesn't matter how much memory bandwidth is available: if the GPU's processing pipes cannot be kept constantly fed, that bandwidth will go unused.
Fox2232