Guru3D.com

AMD Greenland Vega10 Silicon To Have 4096 Stream Processors?

by Hilbert Hagedoorn on: 03/28/2016 09:39 AM | source: | 72 comment(s)

Some interesting info surfaced on the web over the past day or so. As you know, AMD is about to release GPUs based on Polaris, but a recent AMD roadmap also shows the Vega and Navi architectures. Vega is the successor to Polaris, is paired with HBM2 memory, and is slated to launch in the 2017 time frame. An AMD employee's LinkedIn page noted that Vega will have 4096 shader processors.

It seems that Vega will make an appearance in 2017, after Polaris. On the roadmap the name is tagged with HBM2, meaning that HBM2 likely will not make it onto Polaris. Vega is the brightest star in the constellation Lyra, the fifth brightest star in the night sky and the second brightest star in the northern celestial hemisphere, after Arcturus. Next in line, in the 2018 time frame, we see Navi, with the two keywords being scalability and next-gen memory.
  

 
Now the juicy part: the LinkedIn profile of Yu Zheng, an R&D manager at AMD, listed the "shader processor" (stream processor) count of Vega10 as 4,096 (the info has since been removed). Have a peek:
  


  

Interestingly enough, that's the same shader processor count as Fiji (Radeon R9 Fury X). Since the info does not state which class of GPU (mid-range/high-end/enthusiast) it will be used for, performance remains guesswork. We do know that perf per watt will increase significantly over Polaris and that these GPUs will be fitted with HBM2 graphics memory.

Since we are on the topic, "Polaris" architecture based names are oozing out. A high-end chip will be called "Ellesmere" or Polaris10, and there will be a mid-range GPU called "Baffin" or Polaris11. "Ellesmere" is rumored to get 36 GCN 4.0 compute units, which works out to 2,304 stream processors, and a 256-bit wide bus, indicative of GDDR5/GDDR5X with 8 GB of memory.
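
As a quick sanity check on the rumored numbers, the sketch below converts between compute units and stream processors. It assumes 64 stream processors per compute unit, as in existing GCN parts; whether Polaris and Vega keep that ratio is an assumption, and the function names are purely illustrative.

```python
# Assumption: each GCN compute unit (CU) holds 64 stream processors,
# as in existing GCN parts; this is unconfirmed for Polaris and Vega.
SP_PER_CU = 64


def stream_processors(compute_units: int) -> int:
    """Stream processor count for a given number of compute units."""
    return compute_units * SP_PER_CU


def compute_units(stream_procs: int) -> int:
    """Compute unit count implied by a stream processor count."""
    if stream_procs % SP_PER_CU:
        raise ValueError("not a whole number of compute units")
    return stream_procs // SP_PER_CU


print(stream_processors(36))  # rumored "Ellesmere" configuration: 2304
print(compute_units(4096))    # Vega10 figure from the profile: 64
```

Under that assumption, 4,096 stream processors would correspond to 64 compute units, the same layout as Fiji.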

It's going to be an interesting year.









Fox2232
Senior Member



Posts: 11808
Joined: 2012-07-20

#5250927 Posted on: 03/28/2016 06:08 PM
I understand what you mean, but clocks are still important, not alone, but in the context of the design.

I'm not saying Nvidia will clock higher than AMD or vice versa, I'm saying I care about the throughput; whether it's achieved through a big slow design or a fast small design doesn't affect me in the slightest.

The only way I can justify the 8x thermal density is if 14nm transistors consume double the 28nm ones, which would be pretty funny. 4x transistor density, 2x power = 8x thermal density. How I actually think it is : 4x density 0.5x power
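
The two scenarios in the paragraph above come down to one multiplication. A minimal sketch, assuming heat per unit area scales as transistor density times power per transistor (a simplification that ignores clocks, voltage and leakage); the function name is illustrative:

```python
def relative_thermal_density(density_ratio: float, power_per_transistor_ratio: float) -> float:
    """Heat per unit area relative to the old node, under the simple
    assumption: thermal density = transistor density * power per transistor."""
    return density_ratio * power_per_transistor_ratio


# 4x density with transistors drawing 2x the power each
print(relative_thermal_density(4, 2))    # 8
# 4x density with transistors drawing half the power each
print(relative_thermal_density(4, 0.5))  # 2.0
```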

Just saw your comment Fox, okay yeah if you're talking 150% higher clocks then it makes sense, it's just that... That's not gonna happen
That clock can't happen because we do not have cooling for it. And it was meant to support the idea that both 14nm and 16nm will hit a certain clock at a given point in time and technological advancement, and that OC will be very difficult to achieve.
And that meant both AMD and nVidia are going to end up with much more similar GPU clocks than they did on 28nm. A 25% clock difference is very unlikely to happen again. (The 28nm limitation for both AMD and nVidia came from transistor density, which limited stability; the 14nm clock limitation will come from cooling.)

Now what I meant by nVidia having to do more work on die shrink.
980Ti: 8B transistors, 1350MHz clock
Fiji: 8.9B transistors: 1050MHz clock
We can say that 980Ti performs 5% better on average.
Therefore: 980Ti perf = 1.05 * Fiji perf
8 * 1350 * (980Ti perf per transistor per clock) = 1.05 * 8.9 * 1050 * (Fiji perf per transistor per clock)
1.1 = (Fiji perf per transistor per clock) / (980Ti perf per transistor per clock)
In other words, if GM200 and Fiji had the same clock and were limited to the same transistor count, Fiji would deliver 10% higher performance. And that's while Fiji is a well-rounded GPU with no significant weak points like low compute performance or a pixel shader length limitation.

Given that Fiji has higher performance per clock per transistor and more transistors than GM200, just doing a die shrink and hitting a similar clock for both chips would make 14nm Fiji much better. And as I stated before, that's why both companies continue to improve. AMD knows they have a certain inherited advantage which they can play. You can see their confidence at presentations. But if they sit on their hands, nVidia will take that advantage from them, as it is not that big.
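
The arithmetic in this post can be reproduced with a short sketch. It takes the post's own figures at face value (8B transistors at 1350MHz for the 980Ti, 8.9B at 1050MHz for Fiji, a 5% performance lead for the 980Ti) and assumes performance scales linearly with transistor count times clock, which is the post's simplification rather than an established model. The function name is illustrative.

```python
def perf_per_transistor_clock(relative_perf: float,
                              transistors_billion: float,
                              clock_mhz: float) -> float:
    """Relative performance divided by (transistor count * clock)."""
    return relative_perf / (transistors_billion * clock_mhz)


# Figures as quoted in the post; Fiji performance is the 1.00 baseline.
fiji = perf_per_transistor_clock(1.00, 8.9, 1050)
gm200 = perf_per_transistor_clock(1.05, 8.0, 1350)

ratio = fiji / gm200
print(f"Fiji vs GM200, per transistor per clock: {ratio:.2f}x")
```

The result is sensitive to the clock assumed for the 980Ti, so a different clock figure changes the ratio accordingly.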

Ieldra
Senior Member



Posts: 3490
Joined: 2007-01-27

#5250935 Posted on: 03/28/2016 06:27 PM
Pretty sure 980Ti is 1200MHz stock, and again, it makes no sense to me that you compare at clock parity when a Fiji core caps out at 1150 and Maxwell can do 1500. Everything about next-gen clocks is conjecture; we just assume they will be higher than 28nm, and I'm agreeing that thermals will be an issue. I just have no idea how to follow the logic that leads you to making statements about Fiji vs Maxwell. Yeah, Fiji is faster per clock; it also has 1300 more SPs... and runs 400MHz slower.

Fox2232
Senior Member



Posts: 11808
Joined: 2012-07-20

#5250945 Posted on: 03/28/2016 06:50 PM
SPs are just one way AMD/nV invested transistors (the same could be said of ROPs: GM200 has 32 more, which is 50% more than Fiji). The comparison is performance per transistor per clock, because those are the limiting factors of manufacturing technologies.
And I can be wrong, maybe GPU frequency will not be limited by heat. Because Heat can be adjusted partly by transistor density. But that equals cost.

Edit: as for clocks, a lot of 980Ti owners claimed here that their card boosts to 1350~1450 without them doing anything. But that probably varies per manufacturer. Some shops list the official nVidia value, others list those higher boost clocks. Maybe you can disable OC and see how your card boosts.

Noisiv
Senior Member



Posts: 8192
Joined: 2010-11-16

#5250947 Posted on: 03/28/2016 06:57 PM

Given that Fiji has higher performance per clock per transistor and more transistors than GM200, just doing a die shrink and hitting a similar clock for both chips would make 14nm Fiji much better.

Hitting similar clocks! Just like that...

Does that mean that Nvidia gets ****load of shaders, bazillion TFLOPS, async compute, True Audio and whatnot... because clean slate!?
Or is clean slate reserved only for clocks, and everything else remains relatively the same?

If hitting same clocks with GCN was that easy, AMD would do it on 28HPM ya know...




Now what I meant by nVidia having to do more work on die shrink.


Nvidia having to do more work on die shrink? This is obviously wrong.
Considering that not even using high-end HBM was enough to catch up with 980Ti, obviously AMD had much bigger homework to do.
Kind of what NV has already done Kepler -> Maxwell.


980Ti: 8B transistors, 1350MHz clock
Fiji: 8.9B transistors: 1050MHz clock
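We can say that 980Ti performs 5% better on average.
Therefore: 980Ti perf = 1.05 * Fiji perf
8 * 1350 * (980Ti perf per transistor per clock) = 1.05 * 8.9 * 1050 * (Fiji perf per transistor per clock)
1.1 = (Fiji perf per transistor per clock) / (980Ti perf per transistor per clock)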
In other words, if GM200 and Fiji had same clock and were limited to same transistor count, Fiji would deliver 10% higher performance.


Exactly. There is only that little "if".

Taking a design that clocks poorly, relatively speaking, and saying that
all they need to do is clock it the same as that high-clock design, is beyond ridiculous.

Ieldra
Senior Member



Posts: 3490
Joined: 2007-01-27

#5250948 Posted on: 03/28/2016 06:59 PM
SPs are just one way AMD/nV invested transistors (the same could be said of ROPs: GM200 has 32 more, which is 50% more than Fiji). The comparison is performance per transistor per clock, because those are the limiting factors of manufacturing technologies.
And I can be wrong, maybe GPU frequency will not be limited by heat. Because Heat can be adjusted partly by transistor density. But that equals cost.


I see what you mean, but it still makes no sense not to account for clocks when talking about relative performance. By that token a 390X is miles faster than a 980, yet it's consistently outperformed by it, especially at a moderate 1500mhz

Clocks matter.


Hitting similar clocks! Just like that...


Taking a design that clocks poorly, relatively speaking, and saying that
all they need to do is clock it the same as that high-clock design, is beyond ridiculous.

Something else I want to point out regarding the popular view that AMD somehow has better support for future APIs: even in Ashes of the Singularity, AMD's best-case scenario, a 980Ti outperforms a Fury X when overclocked, despite 'async' being detrimental to performance in this game. Clocks matter.

AMD has gotten a lot better in many respects recently, but that doesn't say anything at all about Nvidia.
