AMD Greenland Vega10 Silicon To Have 4096 Stream Processors?
Some interesting info surfaced on the web over the past day or so. As you know, AMD is about to release GPUs based on Polaris, but a recent AMD roadmap also shows the Vega and Navi architectures. Vega is the successor to Polaris, comes with HBM2 memory, and is to launch in the 2017 time frame. An AMD employee noted on their LinkedIn page that Vega will have 4096 shader processors.
It seems that Vega will make an appearance after Polaris, in 2017. The name is tagged with HBM2, meaning that HBM2 likely will not make it onto Polaris. Vega is the brightest star in the constellation Lyra, the fifth brightest star in the night sky and the second brightest star in the northern celestial hemisphere, after Arcturus. Next in line, in the 2018 time frame, we see Navi, with the two keywords being scalability and next-gen memory.
Now the juicy part: the LinkedIn profile of Yu Zheng, an R&D manager at AMD, listed the "shader processor" (stream processor) count of Vega10 as 4,096 (the info has since been removed). Have a peek:
Interestingly enough, that's the same shader processor count as Fiji (Radeon R9 Fury X). Since the info does not state what class of GPU (mid-range/high-end/enthusiast) it will be used for, performance remains guesswork. We do know that perf per watt will increase significantly over Polaris and that these GPUs will be fitted with HBM2 graphics memory.
Since we are on the topic, names for chips based on the "Polaris" architecture are oozing out. A high-end chip will be called "Ellesmere" or Polaris10. There will be a mid-range GPU called "Baffin" or Polaris11. "Ellesmere" is rumored to get 36 GCN 4.0 compute units, which works out to 2,304 stream processors, and a 256-bit wide bus, indicative of GDDR5/GDDR5X with 8 GB of memory.
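For readers who want to check the arithmetic, here is a minimal Python sketch of the compute-unit math. It assumes 64 stream processors per GCN compute unit, as on existing GCN parts; the function names are made up for illustration, and whether Vega10 keeps the same per-unit layout is not confirmed by the leak.

```python
# Assumption: every GCN compute unit (CU) holds 64 stream processors,
# as on existing GCN chips. Not confirmed for Vega10.
STREAM_PROCESSORS_PER_CU = 64


def stream_processors(compute_units: int) -> int:
    """Stream processor count for a given number of compute units."""
    return compute_units * STREAM_PROCESSORS_PER_CU


def compute_units_for(stream_processor_count: int) -> int:
    """Compute units needed for a given stream processor count."""
    units, remainder = divmod(stream_processor_count, STREAM_PROCESSORS_PER_CU)
    if remainder:
        raise ValueError("count is not a whole number of compute units")
    return units


if __name__ == "__main__":
    print("Ellesmere, 36 CUs:", stream_processors(36), "stream processors")
    print("Vega10, 4096 stream processors:", compute_units_for(4096), "CUs")
```

Under that assumption the rumored 36 compute units give the 2,304 stream processors quoted above, and 4,096 stream processors would correspond to 64 compute units, the same as Fiji.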
It's going to be an interesting year.
Senior Member
Posts: 8199
Joined: 2010-11-16
I have never excluded clock. And I do not intend to. nVidia smartly used lower transistor density on 28nm and reached higher clocks.
But this trick will help only a little on 14/16nm, because once over a certain clock, the cooling solution will not cope with the heat made by the GPU. And therefore no 25% higher clock for nVidia (or AMD) on 14/16nm.
You're saying that the company with a decent advantage in perf/W will not be able to cope with the heat
(apparently there is some imaginary hard MHz wall; what does TSMC know about 16FF+ anyway),
yet the company whose products have rejuvenated the Fermi-era "African village monthly food/electrical bill" jokes,
the company that had to reach for water and HBM to have some resemblance of parity, will be peachy heat-wise?

Furthermore...
Using lower density <- that is a trick used by Nvidia to reach higher clocks?
Silly me, I always thought that traditionally higher transistor density on AMD products has been one of their advantages.
When in fact all they had to do is use lower trans. density and clock sky-high like Nvidia.
BTW Maxwell has higher, not lower, transistor density than Kepler.
No hard feelings, but I'm done trying to talk sense into you. Maybe some other time. Cheers.
Senior Member
Posts: 3490
Joined: 2007-01-27
Fiji has a higher transistor density than Maxwell.
I also don't think watercooling on Fiji is a boon. I hate AIOs; I'd much rather have air cooling or go all the way and invest in a proper liquid cooling kit.
Senior Member
Posts: 8199
Joined: 2010-11-16
Fiji has a higher transistor density than Maxwell.
Point being that even with higher transistor density, Maxwell has better perf/W than Kepler.
But Tahiti, Cypress and RV770 also have higher transistor density compared to GK104, GF100 and GT200
yet they don't get smashed in perf/W or "clocks". In fact they even win.
So you see, higher transistor density is a conscious design choice, with obvious benefit of packing more *** in the same area.
Not something that automatically makes your GPU spew la-va.
Senior Member
Posts: 11808
Joined: 2012-07-20
GPU          Transistors (B)   Die area (mm^2)   Density (M/mm^2)
GTX 680      3.54              294               12.041
HD 7970      4.313             352               12.253
GTX 780 Ti   7.08              561               12.62
R9 290X      6.2               438               14.155
GTX 980      5.2               398               13.065
Fury X       8.9               596               14.933
GTX 980 Ti   8.0               601               13.311
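The three numbers per card above are transistor count in billions, die area in mm^2, and the resulting density in millions of transistors per mm^2. If anyone wants to check the density figures, here is a quick Python sketch; the function name and the dictionary are just for illustration, and the inputs are the numbers listed above.

```python
# Transistor count (billions) and die area (mm^2), taken from the list above.
GPUS = {
    "GTX 680": (3.54, 294),
    "HD 7970": (4.313, 352),
    "GTX 780 Ti": (7.08, 561),
    "R9 290X": (6.2, 438),
    "GTX 980": (5.2, 398),
    "Fury X": (8.9, 596),
    "GTX 980 Ti": (8.0, 601),
}


def density_m_per_mm2(transistors_billion: float, area_mm2: float) -> float:
    """Transistor density in millions of transistors per mm^2."""
    return transistors_billion * 1000.0 / area_mm2


if __name__ == "__main__":
    for name, (transistors, area) in GPUS.items():
        print(f"{name:<12} {density_m_per_mm2(transistors, area):.3f} M/mm^2")
```

The same function puts Fury X at roughly 22% more transistors per mm^2 than HD 7970.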
Pretty much sums it up. Both companies increased transistor density. Not because they invented a magic trick, but because TSMC improved its process and reduced leakage at higher density.
But at any given time, a transistor made for AMD was exactly the same transistor as TSMC made for nVidia. With the same leakage-to-density ratio. With the same operating voltage. And with the same Vdrop caused by leakage. And with the same range in which the transistor voltage is considered a 1 or a 0 (simplified). And therefore with the same voltage range in which the transistor is in an undetermined state. And the same time required to get from the 0 to the 1 state at a given leakage and voltage.
And it is this time that decides the minimum period required for stable operation.
I hope this helps in understanding how transistor density affects maximum clock.
But maybe someone is right in thinking that Fury X transistor density could have been used in December 2011 for HD 7970. And therefore HD 7970 could have had a whopping 22% higher transistor density and lower cost.
And maybe someone can come up with the brilliant idea that TSMC makes a different kind of transistor for nVidia which simply clocks higher than the ones they have for AMD, regardless of physics.
(But I am happy we do not have people like that here.)
Senior Member
Posts: 11808
Joined: 2012-07-20
8.9B Transistors on a 596mm2 die for Fury X
8.0B Transistors on a 601mm2 die for Titan X.
Idk, I get where everyone is coming from in general -- but I don't think heat density will be an issue. There is quite a large gap when going from WC to Air cooling on a GPU. My 980 went from like 78c loaded to 35c loaded. I'm sure they can improve air coolers further to match the heat dissipation necessary. And if not, just go with water cooling. Gamers will get over it. Fury X works fine with it, I haven't heard of any major issues and it was the first iteration.
The biggest problem, and what will prevent larger GPUs, is going to be manufacturing cost. Both companies are going to want profitable chips, and they are going to want them in price ranges similar to what we see now. I think what the Polaris demos make pretty clear is that we are going to end up with Fury X/980 Ti levels of performance from ~120W, 300mm2 chips. Hopefully architecture, small clock increases and more bandwidth can help the newer chips edge out the older ones.
I personally don't even care if something bigger is coming. Tired of playing Division at Medium settings on my 980. I'm upgrading to whatever is the fastest single card that's out by the end of August. If Nvidia can't get Pascal out by then I'm not even considering them.
If better stuff comes out next year that's way faster I'll just upgrade again.
I agree, except for one thing. Those 600mm^2 28nm chips will be around 162mm^2 if made on GloFo 14nm, and around 175mm^2 if made on TSMC 16nm.
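To make that shrink estimate easy to replay for the actual dies, here is a small Python sketch. The scale factors are simply back-calculated from the 600 -> 162 and 600 -> 175 figures above, so they are this post's estimates and not official foundry numbers; the names are made up for the example.

```python
# Area scale factors implied by the estimate above (600 mm^2 at 28nm
# becoming 162 mm^2 or 175 mm^2). Assumption: forum estimate, not foundry data.
AREA_SCALE = {
    "GloFo 14nm": 162 / 600,
    "TSMC 16nm": 175 / 600,
}

# 28nm die sizes in mm^2 quoted earlier in the thread.
DIES_28NM = {
    "Fiji (Fury X)": 596,
    "GM200 (Titan X)": 601,
}


def shrunk_area(area_28nm_mm2: float, process: str) -> float:
    """Estimated die area after moving a 28nm design to the given process."""
    return area_28nm_mm2 * AREA_SCALE[process]


if __name__ == "__main__":
    for die, area in DIES_28NM.items():
        for process in AREA_SCALE:
            print(f"{die} on {process}: {shrunk_area(area, process):.0f} mm^2")
```

With these factors both dies land between roughly 160 and 176 mm^2, which is why one representative size is enough for a rough cooling estimate.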
And their power consumption will be as high as the cooling solution allows (within the 300W standard), because those technologies can clock damn high.
If Fiji were 14nm and 2.4GHz, its power consumption would be around 515W. That's the upper limit (without increasing voltage) and it would be pretty hard to cool.
For an imaginary cooling-capacity scenario, take 12mm * 14mm = 168mm^2, which can be representative of both the Fiji and GM200 die shrinks.
- What GPUs/CPUs/ASICs are there of that size (or a bit larger, to give the cooling the benefit of the doubt)?
- And what are their rated TDPs? Do we know any?
FX-9590 - 220W - 315mm^2: that's 220W spread over roughly double the area, and it still poses a challenge to many coolers. To have the same thermal density as 220W on a 168mm^2 die, the FX-9590 would have to eat about 410W. (Noctua rates the NH-D14 at 220W TDP, so it is slightly sub-optimal here.)
i7-3770K - 77W - 160mm^2: thanks to the little mistake of using TIM instead of soldering the heat spreader, this small 77W chip can be quite a challenge to keep cool. Overclocked to 4.6GHz it eats around 155W, and the temperature is a kind-of-manageable 80~90°C under prolonged load. But definitely not a Sandy kept below 55°C. (I found a person testing an i7-3770K @4.6GHz with a D14 vs. a custom loop; both hit 90°C over the course of 12 hours of prime95.)
Since we are used to seeing GPUs run at 70~80°C with air cooling, I'd say 150W power consumption is coolable for a GPU with an 8~9B transistor count, as I do not expect an Ivy-like TIM failure.
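The comparison above boils down to watts per mm^2 of die. Here is a rough Python sketch of it, using only the wattages and areas from this post. It assumes heat is spread evenly over the die, which real chips do not do, and the function names are made up for the example.

```python
def power_density(watts: float, area_mm2: float) -> float:
    """Average heat flux through the die in W/mm^2 (assumes uniform heating)."""
    return watts / area_mm2


def equivalent_power(watts: float, from_area_mm2: float, to_area_mm2: float) -> float:
    """Power a die of to_area_mm2 must draw to match the W/mm^2
    of `watts` dissipated on a die of from_area_mm2."""
    return watts * to_area_mm2 / from_area_mm2


# (power in W, die area in mm^2), all taken from the post above.
CHIPS = {
    "FX-9590": (220, 315),
    "i7-3770K stock": (77, 160),
    "i7-3770K @4.6GHz": (155, 160),
    "shrunk 8~9B GPU at 150W": (150, 168),
}

if __name__ == "__main__":
    for name, (watts, area) in CHIPS.items():
        print(f"{name:<26} {power_density(watts, area):.2f} W/mm^2")
    # 220W on a 168mm^2 die, expressed as FX-9590-sized (315mm^2) power.
    print(f"{equivalent_power(220, 168, 315):.1f} W")
```

By this measure the 150W shrunk GPU sits at about 0.89 W/mm^2, a little under the overclocked i7-3770K at about 0.97 W/mm^2, and 220W on 168mm^2 is equivalent to about 412W on an FX-9590-sized die.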
What matters is the clock the GPU reaches by the time it hits this (or another) cooling limit.
Considering that Polaris at 14nm has between 2 and 3 times higher power efficiency than Maxwell at 28nm, an 8.9B-transistor Fiji at 14nm could eat around 90~100W (a rough estimate in the middle) @1050MHz, leaving room for a higher clock. But not much: only up to 150W (~1450MHz +-100MHz).