AMD Greenland Vega10 Silicon To Have 4096 Stream Processors?
Some interesting info surfaced on the web over the past day or so. As you guys know, AMD is to release GPUs based on Polaris; however, a recent AMD roadmap also shows the Vega and Navi architectures. Vega is the successor to Polaris, with HBM2 memory, to be launched in the 2017 time-frame. An AMD employee noted on their LinkedIn page that Vega10 will have 4096 shader processors.
It seems that Vega will make an appearance in 2017, after Polaris. The name is tagged with HBM2, meaning that HBM2 likely will not make it onto Polaris. Vega is the brightest star in the constellation Lyra, the fifth brightest star in the night sky and the second brightest star in the northern celestial hemisphere, after Arcturus. Next in line, in the 2018 timeframe, we see Navi, with the two keywords being scalability and next-gen memory.
Now the juicy part: the LinkedIn profile of Yu Zheng, an R&D manager at AMD, listed the "shader processor" (stream processor) count of Vega10 as 4,096. The info has since been removed, by the way.
Interestingly enough, that's the same shader processor count as Fiji (Radeon R9 Fury X). Since the info does not state which class of GPU (mid-range/high-end/enthusiast) it will be used for, performance remains guesswork. We do know that perf per watt will increase significantly over Polaris and that these GPUs will be fitted with HBM2 graphics memory.
Since we are on the topic, "Polaris" architecture based names are oozing out. A high-end chip will be called "Ellesmere" or Polaris10. There will be a mid-range GPU called "Baffin" or Polaris11. "Ellesmere" is rumored to get 36 GCN 4.0 compute units, which works out to 2,304 stream processors, and a 256-bit wide bus, indicative of GDDR5/GDDR5X with 8 GB of memory.
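For anyone who wants to check the arithmetic, here is a minimal sketch of how compute units map to stream processors. It assumes 64 stream processors per GCN compute unit, which holds for existing GCN parts such as Fiji but is not confirmed for Vega10; the function names are made up for illustration.

```python
# Assumption: each GCN compute unit (CU) holds 64 stream processors.
# This is true for existing GCN chips; for Vega10 it is only a guess.
SP_PER_CU = 64


def sp_from_cu(compute_units: int) -> int:
    """Stream processor count for a given number of compute units."""
    return compute_units * SP_PER_CU


def cu_from_sp(stream_processors: int) -> int:
    """Compute unit count implied by a stream processor count."""
    if stream_processors % SP_PER_CU != 0:
        raise ValueError("count is not a whole number of compute units")
    return stream_processors // SP_PER_CU


if __name__ == "__main__":
    # Rumored "Ellesmere" configuration: 36 compute units.
    print("Ellesmere stream processors:", sp_from_cu(36))
    # Vega10 / Fiji figure from the article: 4,096 stream processors.
    print("CUs implied by 4,096 stream processors:", cu_from_sp(4096))
```

If the 64-per-CU assumption carries over, the 4,096 figure would imply the same compute unit count as Fiji.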
It's going to be an interesting year.
Senior Member
Posts: 11808
Joined: 2012-07-20
A transistor ending up in an uncertain state is a consequence of Vdrop (inability to deliver enough power to that particular transistor/too low circuit resistance while the power supply has too high internal resistance).
The 980 Ti GPU draws less current than Fiji, because the Fiji GPU gets all the power that HBM1 saved compared to having 16 GDDR5 chips (512-bit bus like the R9 290).
Maybe having Vcc delivered at a few more places in the GPU would allow for higher stability, but on the other hand maybe internal PSU resistance and internal VRM resistance would need to improve a bit too (like close to 0 ohms).
Senior Member
Posts: 3490
Joined: 2007-01-27
One of the big advantages of scaling transistors down is that the voltage required for switching is lower; at 14nm you can pretty much start counting individual electrons. The time taken for the transistor to switch also decreases due to similar effects. Anyway, my point about Fiji is... well, you basically repeated it: the Fiji core is getting more power than GM200 by quite a big margin, with GDDR5 and the IMC taking up a big chunk of the 980 Ti's board power, yet the 980 Ti (even overclocked) consumes less and performs better while running on air. I also wanted to ask: do those 8.9 billion transistors on Fiji account for HBM?
Senior Member
Posts: 11808
Joined: 2012-07-20
No, HBM is separate. And that Vdrop which does not allow Fiji to clock that high is as you wrote... More transistors draw more power for normal operation. Then higher density worsens leakage.
That lower transistor count and lower density is a nice advantage for GM200. And you are absolutely right, I now feel like a broken record.

Senior Member
Posts: 17851
Joined: 2012-05-18
So this topic went all south... who cares how much power it will consume? All I'm "interested" in is performance, and that looks like it won't be anything special since HBM2 is now reserved for Vega.
Posts: 3490
Joined: 2007-01-27
GPU         Transistors (billions)   Die area (mm²)   Density (M transistors/mm²)
GTX 680     3.54                     294              12.041
HD 7970     4.313                    352              12.253
GTX 780 Ti  7.08                     561              12.62
R9 290X     6.2                      438              14.155
GTX 980     5.2                      398              13.065
Fury X      8.9                      596              14.933
GTX 980 Ti  8.0                      601              13.311
Pretty much sums it up. Both companies increased transistor density. Not because they invented a magic trick, but because TSMC improved their process and reduced leakage at higher density.
But at any given time, a transistor made for AMD was exactly the same transistor as TSMC made for nVidia. With the same leakage to density ratio. With the same operational voltage. And with the same Vdrop caused by leakage. And with the same range at which transistor voltage is considered as 1 or 0 (simplified). And therefore with the same voltage range at which the transistor is in an undetermined state. And the same time required to get from the 0 to the 1 state with given leakage and voltage.
And it is this time which decides the minimum period required for stable operation.
I hope that this helps in understanding how transistor density affects maximum clock.
But maybe someone is right in thinking that Fury X transistor density could have been used in December 2011 for the HD 7970. And therefore the HD 7970 could have had a whopping 22% higher transistor density and lower cost.
And maybe someone can come up with the brilliant idea that TSMC makes a different kind of transistor for nVidia which simply clocks higher than those they have for AMD, regardless of physics.
(But I am happy we do not have people like that here.)
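The density figures in the table above, and the 22% number, can be rechecked with a few lines. This is only a sketch: it assumes the three columns are transistor count in billions, die area in mm², and density in millions of transistors per mm², which is a reading of the numbers rather than something stated in the post. The names are made up for illustration.

```python
# Values copied from the table in the post above.
# Assumed units: (transistors in billions, die area in mm^2).
GPUS = {
    "GTX 680": (3.54, 294),
    "HD 7970": (4.313, 352),
    "GTX 780 Ti": (7.08, 561),
    "R9 290X": (6.2, 438),
    "GTX 980": (5.2, 398),
    "Fury X": (8.9, 596),
    "GTX 980 Ti": (8.0, 601),
}


def density(transistors_billion: float, die_mm2: float) -> float:
    """Millions of transistors per square millimetre."""
    return transistors_billion * 1000.0 / die_mm2


def density_ratio(gpu_a: str, gpu_b: str) -> float:
    """How many times denser gpu_a is than gpu_b."""
    return density(*GPUS[gpu_a]) / density(*GPUS[gpu_b])


if __name__ == "__main__":
    for name, (transistors, area) in GPUS.items():
        print(f"{name:<11} {density(transistors, area):7.3f} M/mm^2")
    gain = (density_ratio("Fury X", "HD 7970") - 1.0) * 100.0
    print(f"Fury X vs HD 7970: {gain:.1f}% higher density")
```

Under that reading, the third column of the table is simply the first divided by the second, and the Fury X versus HD 7970 comparison lands at roughly the 22% quoted.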
This is all fine and nobody was contesting it. You're limited by the switching frequency of the transistors once you're past all the other considerations. Still, the increased density in Fiji vs Maxwell doesn't justify the lower clock, especially considering the power overhead Maxwell carries from using GDDR instead of HBM.