AMD's Future Developments: Ryzen 8000 and Navi 3.5

AMD has usually had a process node advantage, or at least parity. For a good while now they don't, which is the case with the 7900 series. Most likely AMD also wasn't where they wanted to be versus their own previous cards either. But tbh, if we set aside the okayish RT performance, both 7900 cards are really good compared to the new Nvidia ones too. It's just that we expected more (at least I did, except on the RT side). The only big jump from last gen is the ultra-massive 4090. Zen 5 has a good platform and it's a good release, especially the big CPUs, compared to Intel. Also less power hungry. In the current climate one won't go wrong building an all-AMD rig either, as long as they don't buy the 7600, that monolithic pile of garbage.
Krizby:

RDNA3 uarch is a major failure, kinda like Vega. No idea how RTG can tackle that many problems (both uarch improvements and chiplet design) with way less R&D budget than Nvidia; perhaps it was just wishful thinking from the RTG team.
Tbh I wouldn't say it's even remotely as bad as Vega. The 7600 maybe, but that isn't even taking advantage of any of the RDNA3 stuff and is on a bigger node even.
barbacot:

Maybe AMD's idea of chiplets is the wrong one - maybe another architecture is needed and this one is not for the future but for the garbage bin... Nvidia will go monolithic for at least another generation, and who knows, maybe after that they will drop the monolithic design for something else - not chiplets, or not AMD's idea of chiplets - maybe... Lots of speculation, and I remembered that the root of evil for AMD GPUs goes back to ATI times - they genuinely believed the GeForce GTX 200 would be Nvidia's last monolithic GPU - their big mistake: https://www.techpowerup.com/63216/ati-believes-geforce-gtx-200-will-be-nvidias-last-monolithic-gpu
Also something to laugh at - a blast from the past - someone seeing Intel's chiplet GPUs as a "huge" threat to Nvidia: https://seekingalpha.com/article/4321159-nvidia-faces-huge-threat-from-intels-chiplet-gpu-approach
I especially liked this one: 😀 That proves how wrong predictions can be in this industry.
Yeah, maybe AMD's chiplet approach is not the better one, but I don't think they can back down on it right now. As for Nvidia, they have so much money right now that they can afford to continue with the monolithic approach and wait for the right time to implement a chiplet design.
Krizby:

RDNA3 uarch is a major failure, kinda like Vega. No idea how RTG can tackle that many problems (both uarch improvements and chiplet design) with way less R&D budget than Nvidia; perhaps it was just wishful thinking from the RTG team.
I don't think RDNA3 can be considered a failure, at least not so soon, but it's off to a slow start and I don't know if they can recover from it...
Given the untapped clock headroom on the 7900 series GPUs, if they can get the power consumption under control, a significant uplift is on the table without much work, so a respin makes a lot of sense. Also, chiplets are inferior to monolithic in all domains except price; that is the point, it significantly reduces cost, and that is why they do it. Moving the memory controllers and the cache to a separate die is a clever move, as it allows them to share silicon allocation between many different products and reallocate it as needed much more easily and much faster. It also leaves the door open to other kinds of products; for instance AMD could build an HBM-based MCD package, which could be a "drop-in" affair, without creating a new die like with Navi 12 (HBM) vs Navi 10 (GDDR6).
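To put rough numbers on the Navi 12 vs Navi 10 comparison, here is a quick sketch of the peak-bandwidth arithmetic. The bus widths and per-pin data rates are approximate launch specs, used purely for illustration:

```python
# Rough peak-bandwidth arithmetic behind the Navi 10 (GDDR6) vs Navi 12 (HBM2) comparison.
# Specs below are approximate launch figures, used only for illustration.

def bandwidth_gb_s(bus_width_bits: int, data_rate_gbps_per_pin: float) -> float:
    """Peak bandwidth in GB/s = (bus width in bits / 8) * per-pin data rate in Gbps."""
    return bus_width_bits / 8 * data_rate_gbps_per_pin

navi10 = bandwidth_gb_s(256, 14.0)    # RX 5700 XT style: 256-bit GDDR6 @ ~14 Gbps -> ~448 GB/s
navi12 = bandwidth_gb_s(2048, 1.54)   # Radeon Pro 5600M style: 2048-bit HBM2 @ ~1.54 Gbps -> ~394 GB/s

print(f"Navi 10 (GDDR6): {navi10:.0f} GB/s")
print(f"Navi 12 (HBM2):  {navi12:.0f} GB/s")
```

The takeaway is that a wide, slow HBM interface lands in the same bandwidth ballpark as a narrow, fast GDDR6 one, so being able to swap an MCD instead of spinning a whole new die would be attractive.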
Undying:

AMD will eventually fix all issues
♫ A tale as old as time ♫
user1:

Given the untapped clock headroom on the 7900 series GPUs, if they can get the power consumption under control, a significant uplift is on the table without much work, so a respin makes a lot of sense. Also, chiplets are inferior to monolithic in all domains except price; that is the point, it significantly reduces cost, and that is why they do it. Moving the memory controllers and the cache to a separate die is a clever move, as it allows them to share silicon allocation between many different products and reallocate it as needed much more easily and much faster. It also leaves the door open to other kinds of products; for instance AMD could build an HBM-based MCD package, which could be a "drop-in" affair, without creating a new die like with Navi 12 (HBM) vs Navi 10 (GDDR6).
I assume you meant for GPUs, as chiplets are vastly superior for CPUs. If you don't believe me, look at Intel.
cucaulay malkin:

Except back then 4 cores was a luxury high-end item, while now 8/16 is pretty much the standard, and a competing 13700KF costs less, performs the same in games and almost 30% better in applications. Actually, I'm wrong here. The competition is not the 13700K, but a $330 13700, since the 7800X3D is locked too and tops out at DDR5-6000.
[attached chart: 13700.jpg]
No, the Q6600 wasn't a luxury item. It was a productivity beast for its day and a mainstay of OEM builds. In fact, it was only available in OEM builds for the first six months (in the US) after release. The Q6600 is the CPU that made DIY builds "a thing" instead of a tiny niche hobby.
tunejunky:

I assume you meant for GPUs, as chiplets are vastly superior for CPUs. If you don't believe me, look at Intel.
Even for CPUs monolithic is better, but again you run into prohibitive cost. A monolithic Epyc chip would use less power, for instance, and have lower latency, but the cost is way higher, even on mature nodes. https://occlub.ru/wp-content/uploads/2017/08/images_posts_news_2017_8_24_MCM-vs-monolithic-1.jpg
That slide is based on Zen 1; if you were to do a chip like Genoa, the disparity would be even greater. >1000 mm^2 monolithic chips are possible, but the cost increase is massive, definitely not worth single-digit percentage gains in performance.
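To make the cost argument concrete, here is a back-of-the-envelope sketch using a simple Poisson yield model. The defect density is an assumed figure, and packaging, binning and reticle-limit effects are ignored, so treat the ratio as illustrative only:

```python
import math

def poisson_yield(area_mm2: float, defects_per_cm2: float) -> float:
    """Fraction of dies with zero defects under a simple Poisson yield model."""
    return math.exp(-(area_mm2 / 100.0) * defects_per_cm2)

def silicon_cost_per_good_die(area_mm2: float, defects_per_cm2: float) -> float:
    """Relative silicon cost per *good* die (arbitrary units, proportional to wafer area burned)."""
    return area_mm2 / poisson_yield(area_mm2, defects_per_cm2)

D0 = 0.1  # assumed defects per cm^2 on a mature node

mono = silicon_cost_per_good_die(1000, D0)            # one ~1000 mm^2 monolithic die
chiplets = 8 * silicon_cost_per_good_die(125, D0)     # eight ~125 mm^2 chiplets, same total area

print(f"Monolithic yield: {poisson_yield(1000, D0):.1%}")   # ~36.8%
print(f"Chiplet yield:    {poisson_yield(125, D0):.1%}")    # ~88.2%
print(f"Silicon cost ratio (mono / chiplets): {mono / chiplets:.2f}x")  # ~2.40x
```

Under those assumptions, the single big die burns roughly 2.4x more wafer area per good unit than eight small chiplets covering the same total area, which is the kind of disparity the slide above is getting at.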
tunejunky:

No, the Q6600 wasn't a luxury item. It was a productivity beast for its day and a mainstay of OEM builds. In fact, it was only available in OEM builds for the first six months (in the US) after release. The Q6600 is the CPU that made DIY builds "a thing" instead of a tiny niche hobby.
It was the highest core count you could get. Look where the 7800X3D sits on that application performance index compared to the 7950X; for productivity it's like a last-gen 5900X and slower than the new i5, which is also unlocked, by the way.
cucaulay malkin:

It was the highest core count you could get. Look where the 7800X3D sits on that application performance index compared to the 7950X; for productivity it's like a last-gen 5900X and slower than the new i5, which is also unlocked, by the way.
Though gotta admit, from the Q6600 we have come a long way too, even just in terms of inflation. The 7800X3D is funny though, since it's slower than the 7700X in productivity. I guess the 3D stack kind of sucks for clocks. https://tpucdn.com/review/amd-ryzen-7-7800x3d/images/efficiency-gaming.png
Where AMD currently does shine is efficiency, though, if one cares about it.
user1:

a monolithic Epyc chip would use less power, for instance
Most likely it would be unproducible. The defect rate would be so high with such a big and complex chip that it's quite likely none of them would make it out of the lithography machines intact, and even the highest model would be cut down. But the main reason for chiplets was explained above... cache and I/O have already stopped scaling. I wonder how many node shrinks core transistors still have left... I fear we are very, very close to the "end of silicon".
wavetrex:

Most likely it would be unproducible. The defect rate would be so high with such a big and complex chip that it's quite likely none of them would make it out of the lithography machines intact, and even the highest model would be cut down. But the main reason for chiplets was explained above... cache and I/O have already stopped scaling. I wonder how many node shrinks core transistors still have left... I fear we are very, very close to the "end of silicon".
It's possible to do; such chips exist, they are just ridiculously expensive, the kind of thing you find in a multimillion-dollar mainframe. There are also "wafer scale" chips, which allow the entire wafer to be used; defects can be bypassed or areas disabled, as mentioned. We still have ways to go on silicon. EUV is kind of a reset: all of the tricks used to get the feature size smaller with DUV can be done again, sub-1nm is doable, then you have stacking, etc. The real question is how heat density is going to be dealt with. The end of silicon isn't something I would worry about for at least 10 years, and it will probably take much longer than that for a transition to occur.
tunejunky:

No, the Q6600 wasn't a luxury item. It was a productivity beast for its day and a mainstay of OEM builds. In fact, it was only available in OEM builds for the first six months (in the US) after release. The Q6600 is the CPU that made DIY builds "a thing" instead of a tiny niche hobby.
Loved this chip. Going straight from 1 core + HT to 4 cores was a dream, and it was easy to OC to 3 GHz as well.
Horus-Anhur:

The goal of the L3 cache on the big RDNA3 is not performance, but to save cost. The dies holding the L3 cache are still on N7, and unlike on RDNA2 they are separate chips. And there is a latency spike when an access hits the L3, something that does not happen on RDNA2 or Navi 33. It still does what it's supposed to do: as a memory-attached last-level cache, it reduces fetches to VRAM. But there are other things it won't do. For example, that cache latency would compromise performance across the whole GPU if they used chiplets for an L2. Even for WGPs it might be problematic, as there would need to be a general front end to issue work waves without contention.
This is why NVIDIA's approach was to increase the amount of L2 cache in the RTX 4000 series. A larger L2 cache has more benefits than adding a larger L3 cache, as it sits closer to the actual compute cores and has much lower latency. With AMD's approach you gain some performance compared to reading from VRAM, but the latency of an L3 is typically at least 4 to 5x higher than that of an L2.
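One way to reason about this trade-off is the standard average-memory-access-time calculation. The hit rates and latencies below are made-up round numbers chosen only to show the mechanics; they are not measured values for Ada or RDNA3:

```python
# Illustrative average-memory-access-time (AMAT) comparison of the two cache layouts.
# Each level is (hit rate for requests that reach it, total latency in ns when serviced there).
# All numbers are invented round figures, not measurements.

def amat(levels):
    """Average latency: sum over levels of P(serviced at that level) * that level's latency."""
    total, reach_prob = 0.0, 1.0
    for hit_rate, latency_ns in levels:
        total += reach_prob * hit_rate * latency_ns
        reach_prob *= (1.0 - hit_rate)
    return total

# A single big, low-latency L2 in front of VRAM
big_l2 = amat([(0.70, 30), (1.00, 300)])

# A smaller L2 plus a large, higher-latency L3 (Infinity Cache style) in front of VRAM
l2_plus_l3 = amat([(0.50, 25), (0.45, 120), (1.00, 300)])

print(f"Big L2 only: {big_l2:.0f} ns average")     # 111 ns with these numbers
print(f"L2 + L3:     {l2_plus_l3:.0f} ns average") # 122 ns with these numbers
```

Which layout comes out ahead depends entirely on the real hit rates and latencies you plug in, which is exactly what the measurements in the following posts argue about.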
Crazy Joe:

This is why NVIDIA's approach was to increase the amount of L2 cache in the RTX 4000 series. A larger L2 cache has more benefits than adding a larger L3 cache, as it sits closer to the actual compute cores and has much lower latency. With AMD's approach you gain some performance compared to reading from VRAM, but the latency of an L3 is typically at least 4 to 5x higher than that of an L2.
No it's not. The L3 cache on RDNA2 and small RDNA3 is almost as fast as the L2 on Ampere. Ada Lovelace's L2 latency is a bit higher than Ampere's because of its bigger size. And AMD's L2 cache has almost half the latency of the L2 on Nvidia's GPUs. I don't think most people understand how good the cache system on AMD's GPUs really is.
[attached graph: 7600_scalar_revised_1-1.jpg]
user1:

It's possible to do; such chips exist, they are just ridiculously expensive, the kind of thing you find in a multimillion-dollar mainframe. There are also "wafer scale" chips, which allow the entire wafer to be used; defects can be bypassed or areas disabled, as mentioned. We still have ways to go on silicon. EUV is kind of a reset: all of the tricks used to get the feature size smaller with DUV can be done again, sub-1nm is doable, then you have stacking, etc. The real question is how heat density is going to be dealt with. The end of silicon isn't something I would worry about for at least 10 years, and it will probably take much longer than that for a transition to occur.
I get what you're saying, but the realities of manufacturing contradict you. Not all defects can be routed around, and there are far too many variables to have any sort of production run - consistency and precision are requirements. Also, fwiw, the most powerful mainframes in the world are made with AMD Genoa/Bergamo chiplet-based CPUs. Seven of the top 10 supercomputers in the world, including the top three, are AMD-based.
tunejunky:

I get what you're saying, but the realities of manufacturing contradict you. Not all defects can be routed around, and there are far too many variables to have any sort of production run - consistency and precision are requirements. Also, fwiw, the most powerful mainframes in the world are made with AMD Genoa/Bergamo chiplet-based CPUs. Seven of the top 10 supercomputers in the world, including the top three, are AMD-based.
I don't really contest any of that. Things like "wafer scale" have numerous problems, but it can be done, as there are chips in serial production that use it; it's just not cost-effective unless you can charge big money. Servers compete in the sub-$10k-per-chip arena, and there, doing such a thing doesn't make sense, but this isn't true for all markets.
tunejunky:

Also, fwiw, the most powerful mainframes in the world are made with AMD Genoa/Bergamo chiplet-based CPUs. Seven of the top 10 supercomputers in the world, including the top three, are AMD-based.
I think your listings might be incorrect. Here is the latest supercomputer listing, as of May 2023.
tunejunky:

I get what you're saying, but the realities of manufacturing contradict you. Not all defects can be routed around, and there are far too many variables to have any sort of production run - consistency and precision are requirements. Also, fwiw, the most powerful mainframes in the world are made with AMD Genoa/Bergamo chiplet-based CPUs. Seven of the top 10 supercomputers in the world, including the top three, are AMD-based.
More like AMD has started to gain ground; they do have 4 out of the top 10. Intel has 2, IBM 2, Sunway 1 and Fujitsu 1. And Nvidia has GPUs in at least 5 of them, and AMD in 2.
Horus-Anhur:

No it's not. The L3 cache on RDNA2 and small RDNA3 is almost as fast as the L2 on Ampere. Ada Lovelace's L2 latency is a bit higher than Ampere's because of its bigger size. And AMD's L2 cache has almost half the latency of the L2 on Nvidia's GPUs. I don't think most people understand how good the cache system on AMD's GPUs really is.
[attached graph: 7600_scalar_revised_1-1.jpg]
Your graphs show Ampere vs RDNA3, not Ada Lovelace vs RDNA3. This is the graph you should have used: https://i0.wp.com/chipsandcheese.com/wp-content/uploads/2022/12/rdna3_cl_scalar_latency-2.png?ssl=1
As you can see, Ada Lovelace only loses the latency race between about 80 KB and 8000 KB; for the rest of the sizes it is either equally fast or faster.