Ryzen 7 8700G AI Performance Enhances with DDR5-9000, 15% Improvement in Integrated Graphics

I used to run DDR3 sticks at those voltages. That aside, it's quite impressive. I wonder if they could make a HEDT variant with quad-channel memory and some 3D cache available to the GPU to increase speeds even further.
Since the integrated GPU uses system memory for its needs, it's not surprising that higher-bandwidth memory boosts its performance. With the memory clock raised by around 15%, the bandwidth should also scale up by about 15%, so the results are not that surprising. Of course, the memory will not last very long at the voltages required to reach these clock speeds, so this is a nice exercise in what might be achievable in the future. Whether any memory manufacturer can produce stable RAM at those clock speeds remains to be seen.
Crazy Joe:

Since the integrated GPU uses system memory for its needs, it's not surprising that higher-bandwidth memory boosts its performance. With the memory clock raised by around 15%, the bandwidth should also scale up by about 15%, so the results are not that surprising. Of course, the memory will not last very long at the voltages required to reach these clock speeds, so this is a nice exercise in what might be achievable in the future. Whether any memory manufacturer can produce stable RAM at those clock speeds remains to be seen.
Yeah, it is a little concerning how even with all the improvements to the architecture and the swap to DDR5, the GPU is still entirely bottlenecked by RAM. AMD desperately needs to either add more memory channels (which ain't happening with AM5) or introduce a V-cache for the iGPU. I'm quite positive once we get close to the release of DDR6, we'll see DDR5 modules at 9000MT/s with sustainable voltages.
physx:

That aside, it's quite impressive. I wonder if they could make a HEDT variant with quad-channel memory and some 3D cache available to the GPU to increase speeds even further.
An HEDT variant like that would be faster, but it might not beat an RX 7600 anyway, so you would need the big socket, a bigger cooler, double the memory and a more expensive mainboard just to maybe match a $300 GPU in one task. If someone really wants HEDT speed like that, AMD already has the Instinct series.
TLD LARS:

An HEDT variant like that would be faster, but it might not beat an RX 7600 anyway, so you would need the big socket, a bigger cooler, double the memory and a more expensive mainboard just to maybe match a $300 GPU in one task. If someone really wants HEDT speed like that, AMD already has the Instinct series.
I second that. Not to mention the iGPU cannot beat even the 6500 XT. While 4-channel memory might let it get within a breath of the 6500 XT, or even beat it, the platform cost to get that kind of performance is not worth it at all.
Venix:

I second that. Not to mention the iGPU cannot beat even the 6500 XT. While 4-channel memory might let it get within a breath of the 6500 XT, or even beat it, the platform cost to get that kind of performance is not worth it at all.
It would never beat the 6500 XT, simply because the 6500 XT has 33% more shaders and, most importantly, 16 MB of L3 cache. The increased clock speeds aren't enough to close that gap, nor would a quad-channel interface be. IMO the DDR5 speeds are high enough already, but there needs to be an L3 cache for the GPU; the small 2 MB L2 cache on the IGP simply isn't enough. You can buy 9600 Mbps LPDDR5 ICs (which would equate to 153 GB/s in a 2-DIMM configuration), with faster chips coming down the pipe as well, so quad channel isn't really necessary anymore.
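For reference, the 153 GB/s figure follows directly from the transfer rate and the bus width. This is only a minimal sketch of that arithmetic, assuming a 128-bit total bus (two 64-bit channels) for the 2-DIMM case and theoretical peak numbers rather than measured ones; the function name is made up for illustration.

```python
def peak_bandwidth_gbs(transfer_rate_mts: float, bus_width_bits: int = 128) -> float:
    """Theoretical peak bandwidth in GB/s.

    transfer_rate_mts: memory speed in megatransfers per second (e.g. 9600).
    bus_width_bits: total bus width; 128 bits is an assumption for a
    dual-channel (2-DIMM) desktop configuration.
    """
    bytes_per_transfer = bus_width_bits / 8
    # MT/s * bytes per transfer = MB/s; divide by 1000 to get GB/s.
    return transfer_rate_mts * bytes_per_transfer / 1000


# The 9600 MT/s memory mentioned in the post above.
print(peak_bandwidth_gbs(9600))
```

Real-world throughput will be lower than this peak, and on an APU it is shared between the CPU and the GPU.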
user1:

It would never beat the 6500 XT, simply because the 6500 XT has 33% more shaders and, most importantly, 16 MB of L3 cache. The increased clock speeds aren't enough to close that gap, nor would a quad-channel interface be. IMO the DDR5 speeds are high enough already, but there needs to be an L3 cache for the GPU; the small 2 MB L2 cache on the IGP simply isn't enough. You can buy 9600 Mbps LPDDR5 ICs (which would equate to 153 GB/s in a 2-DIMM configuration), with faster chips coming down the pipe as well, so quad channel isn't really necessary anymore.
The price of that memory in a 48 or 64 GB config (because it is shared memory) plus a very good 2-DIMM mainboard could still be higher than that of a 6500 XT or 7600. Memory faster than 153 GB/s might be needed because it is shared between the CPU and GPU at the same time, and there might be higher latency from the GPU to system memory. Something like the 7600, with an MSRP below $300, is already at a memory bandwidth of 288 GB/s; matching that with DDR6 is a couple of years away.
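To put the 288 GB/s comparison in perspective, the bandwidth arithmetic can be inverted to ask what transfer rate system memory would need. A rough sketch, again assuming a 128-bit dual-channel bus and theoretical peak figures; the function name is hypothetical.

```python
def required_transfer_rate_mts(target_gbs: float, bus_width_bits: int = 128) -> float:
    """Transfer rate (MT/s) needed to reach a target peak bandwidth.

    bus_width_bits defaults to 128, an assumed dual-channel desktop bus.
    """
    bytes_per_transfer = bus_width_bits / 8
    # GB/s * 1000 = MB/s; dividing by bytes per transfer gives MT/s.
    return target_gbs * 1000 / bytes_per_transfer


# Matching the RX 7600's 288 GB/s on a 128-bit bus.
print(required_transfer_rate_mts(288))
```

On those assumptions, matching 288 GB/s needs 18000 MT/s, twice the DDR5-9000 overclock in the article.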
schmidtbag:

Yeah, it is a little concerning how even with all the improvements to the architecture and the swap to DDR5, the GPU is still entirely bottlenecked by RAM. AMD desperately needs to either add more memory channels (which ain't happening with AM5) or introduce a V-cache for the iGPU. I'm quite positive once we get close to the release of DDR6, we'll see DDR5 modules at 9000MT/s with sustainable voltages.
I don't think we will ever see V-Cache or RAM on the die. They seem to only ever want to make these chips as a "low end" part; otherwise, in theory, we could get APUs like those in the PS5 and Xbox. The swap to DDR5 also doesn't matter: the memory performance required for good graphics has well exceeded that of system RAM. DDR5 is great, but it's still far worse than GDDR6X, and especially GDDR7, which is set to come soon. If we ever got one of these with HBM on the die, maybe then? But the shader count is so low I half expect the gains to be quite small.
It's unfortunate that these APUs have the same Infinity Fabric limitation as other Zen4 CPUs with only one CCD. This is why memory read speeds are at 78 GB/s, while write speeds are much higher, at 126 GB/s. These APUs do manage to get higher IF clock speeds. That is where that extra 15% performance comes from.
TLD LARS:

The price of that memory in a 48 or 64 GB config (because it is shared memory) plus a very good 2-DIMM mainboard could still be higher than that of a 6500 XT or 7600. Memory faster than 153 GB/s might be needed because it is shared between the CPU and GPU at the same time, and there might be higher latency from the GPU to system memory. Something like the 7600, with an MSRP below $300, is already at a memory bandwidth of 288 GB/s; matching that with DDR6 is a couple of years away.
I'm gonna say the difference between a 7600 and a 6500 XT is a lot; something like a 7600 is probably 2 or 3 generations of APUs away. Most AM5 boards today are already capable of pushing 8000 MT/s thanks to AMD's memory controller wizardry, and even cheap B650 boards can POST at such speeds; it's just a matter of optimization and standardization. I'll remind you that these are first-generation products and the current support for high speeds is basically an afterthought. This year we will probably see faster chips come around, so 200 GB/s+ from 2 DIMMs within the next 2 years is very likely. Also, there is no latency hit from the shared memory controller. In fact, DDR tends to be lower latency than GDDR in general: system memory has for a long time favored latency over bandwidth, because that is more important for CPU performance, whereas GDDR favors bandwidth instead, sacrificing latency. Things are changing a bit by way of large caches, but it still remains true for the most part. DDR punches a bit above its weight class vs GDDR with comparable bandwidth. E.g., the 780M performs similarly to an RX 470, and the Cezanne Vega IGP to the RX 460, despite having similar-sized caches, similar shader performance and about half the bandwidth at standard DDR5/DDR4 speeds. They tend to perform a lot better than you would expect given the meager bandwidth. 100 GB/s with DDR is going to perform better than 100 GB/s with GDDR, generally speaking. So looking at what is achievable with the new APUs, the bandwidth is definitely enough for 6500 XT-tier performance, but without the cache it's not going to catch up, because the 6500 XT is already a bandwidth-starved product itself.
Horus-Anhur:

It's unfortunate that these APUs have the same Infinity Fabric limitation as other Zen4 CPUs with only one CCD. This is why memory read speeds are at 78 GB/s, while write speeds are much higher, at 126 GB/s. These APUs do manage to get higher IF clock speeds. That is where that extra 15% performance comes from.
They actually probably don't have that problem. Speaking from my experience with a Cezanne APU, the GPU side of the memory controller has its own separate data fabric, i.e. if you set an FCLK of 2000 and increase the memory speed beyond 4000 MT/s, you will still see GPU bandwidth increase while CPU bandwidth remains the same. I discovered this when memory overclocking and using a benchmark called poclmembench (which runs on the GPU), and accidentally desyncing the fabric.
user1:

They actually probably don't have that problem. Speaking from my experience with a Cezanne APU, the GPU side of the memory controller has its own separate data fabric, i.e. if you set an FCLK of 2000 and increase the memory speed beyond 4000 MT/s, you will still see GPU bandwidth increase while CPU bandwidth remains the same. I discovered this when memory overclocking and using a benchmark called poclmembench (which runs on the GPU), and accidentally desyncing the fabric.
That's because you are using a Zen3 APU. This issue is with Zen CPUs. Just look at the memory speeds HH posted. And Zen4 does not have the same penalty as Zen3, with de-synced IF and memory.
Horus-Anhur:

That's because you are using a Zen3 APU. This issue is with Zen CPUs. Just look at the memory speeds HH posted. And Zen4 does not have the same penalty as Zen3, with de-synced IF and memory.
You misunderstand: the GPU portion of the APU has its own data fabric, so no matter what the FCLK is, the bandwidth to the GPU side is unaffected. I sincerely doubt AMD would drop this design feature, given that it is also present on the Rembrandt APUs, which use DDR5 as well. This is intentional, as it helps them save power on laptops. In summary, AIDA bandwidth numbers are CPU only; the GPU bandwidth numbers will be different.
user1:

You misunderstand: the GPU portion of the APU has its own data fabric, so no matter what the FCLK is, the bandwidth to the GPU side is unaffected. I sincerely doubt AMD would drop this design feature, given that it is also present on the Rembrandt APUs, which use DDR5 as well. This is intentional, as it helps them save power on laptops. In summary, AIDA bandwidth numbers are CPU only; the GPU bandwidth numbers will be different.
Where did you see that?
user1:

I'm gonna say the difference between a 7600 and a 6500 XT is a lot; something like a 7600 is probably 2 or 3 generations of APUs away. Most AM5 boards today are already capable of pushing 8000 MT/s thanks to AMD's memory controller wizardry, and even cheap B650 boards can POST at such speeds; it's just a matter of optimization and standardization. I'll remind you that these are first-generation products and the current support for high speeds is basically an afterthought. This year we will probably see faster chips come around, so 200 GB/s+ from 2 DIMMs within the next 2 years is very likely. Also, there is no latency hit from the shared memory controller. In fact, DDR tends to be lower latency than GDDR in general: system memory has for a long time favored latency over bandwidth, because that is more important for CPU performance, whereas GDDR favors bandwidth instead, sacrificing latency.
I am mentioning the 7600 because it is the slowest current-gen AMD GPU at regular, non-clearance prices. I do want to go back to the question: why would someone want to run an APU like this? What benefit does it have?
1. The power needed to run a 6500 XT / 7600-like GPU moves to the CPU socket, so the socket is now at something like 200-250 W. That makes it impossible to build the PC any more compact than, for example, a normal 7800X3D and 7600 combo, because a tower or AIO cooler is a requirement.
2. 9000-10000 memory costs more than twice as much as a 6000 kit. It will be different in 2 years, but you still end up with 6500 XT / 7600 performance 2 years from now: impressive for an APU, but not impressive for a gaming PC in general.
3. Building in a 6500 XT / 7600 GPU would double the die area compared to a normal 7800X3D, so it might not even physically fit in an AM5 socket, even if the socket and VRM allow for the doubling of power usage.
4. The latency I am mentioning is because the GPU needs to talk to the CPU memory controller. It could maybe use the data link for a CCD to keep the speed up, but then the CPU part is locked to 1 CCD unless the IO die is expanded to work with 3 data connections. There will still be a "waiting line" if the CPU and GPU are accessing memory at the same time.
Horus-Anhur:

Where did you see that?
Apart from direct observation of this, if you look at benchmarks of the 6000 series APUs, you will find that AIDA also shows a max bandwidth of 55 GB/s despite using LPDDR5-6400, which is very low and does not reflect the performance uplift provided by the faster memory. I also recall seeing a screenshot on overclock.net of someone running a memory benchmark on the IGP of a 7000 series chip, which showed the higher memory bandwidth, i.e. reads and writes much higher than on the CPU side. This makes a lot of sense in laptops: AMD can keep the FCLK low to save power and run the UMC in 1:2 mode while getting the benefit of the faster memory for the GPU portion. From my own testing with the iGPU, keeping the FCLK at 2000 and running DDR4-4800 in desynced mode was slightly faster than running 2300:4600. If I ran a memory benchmark, it would show the CPU side at around 60 GB/s and the GPU side at 69 GB/s. There is a CPU latency hit from doing this, but it runs much cooler and requires much less voltage. The thing that really confirmed this for me was that the data fabric clock for the iGPU can be read on Linux, and it runs at a different speed than the FCLK.
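For anyone wanting to check fabric clocks on Linux themselves, one place they may show up is the amdgpu driver's sysfs `pp_dpm_*` files. The sketch below is an assumption about how that could be done, not the method used in the post above: the path `/sys/class/drm/card0/device/pp_dpm_fclk`, the card index, and whether that file exists or reflects the GPU-side fabric on a given APU are all unverified here, and the helper names are made up.

```python
import re
from pathlib import Path

# Lines in amdgpu pp_dpm_* files look like "1: 1200Mhz *",
# where "*" marks the currently active level.
_LEVEL = re.compile(r"^\s*(\d+):\s*(\d+)\s*MHz\s*(\*)?\s*$", re.IGNORECASE)


def parse_dpm_levels(text):
    """Return a list of (level index, clock in MHz, is_active) tuples."""
    levels = []
    for line in text.splitlines():
        match = _LEVEL.match(line)
        if match:
            levels.append(
                (int(match.group(1)), int(match.group(2)), match.group(3) is not None)
            )
    return levels


def active_clock_mhz(text):
    """Return the clock of the level marked active, or None if none is marked."""
    for _, mhz, active in parse_dpm_levels(text):
        if active:
            return mhz
    return None


if __name__ == "__main__":
    # Assumed path; the card index and file availability vary by system.
    path = Path("/sys/class/drm/card0/device/pp_dpm_fclk")
    if path.exists():
        print("active clock (MHz):", active_clock_mhz(path.read_text()))
    else:
        print("pp_dpm_fclk not found on this system")
```

Comparing the value reported here against the FCLK set in the BIOS while changing memory speed would be one way to test the claim.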
user1:

Apart from direct observation of this, if you look at benchmarks of the 6000 series APUs, you will find that AIDA also shows a max bandwidth of 55 GB/s despite using LPDDR5-6400, which is very low and does not reflect the performance uplift provided by the faster memory. I also recall seeing a screenshot on overclock.net of someone running a memory benchmark on the IGP of a 7000 series chip, which showed the higher memory bandwidth, i.e. reads and writes much higher than on the CPU side. This makes a lot of sense in laptops: AMD can keep the FCLK low to save power and run the UMC in 1:2 mode while getting the benefit of the faster memory for the GPU portion. From my own testing with the iGPU, keeping the FCLK at 2000 and running DDR4-4800 in desynced mode was slightly faster than running 2300:4600. If I ran a memory benchmark, it would show the CPU side at around 60 GB/s and the GPU side at 69 GB/s. There is a CPU latency hit from doing this, but it runs much cooler and requires much less voltage. The thing that really confirmed this for me was that the data fabric clock for the iGPU can be read on Linux, and it runs at a different speed than the FCLK.
Again. Zen3 APUs are not Zen4 APUs. And what you suggest, would imply 2 IFs, one for the CPU part, and another for the iGPU. That does not seem reasonable.
Ricepudding:

I don't think we will ever see V-Cache or RAM on the die. They seem to only ever want to make these chips as a "low end" part; otherwise, in theory, we could get APUs like those in the PS5 and Xbox. The swap to DDR5 also doesn't matter: the memory performance required for good graphics has well exceeded that of system RAM. DDR5 is great, but it's still far worse than GDDR6X, and especially GDDR7, which is set to come soon. If we ever got one of these with HBM on the die, maybe then? But the shader count is so low I half expect the gains to be quite small.
At 64MB of cache, it will still remain a very low-end part, but it'll be seen as more worthwhile to anyone who wants an entry level system. Despite the huge improvements, this still isn't good enough. HBM would be a nice alternative but too expensive for the use case.
i think the accumulated cpu temps are hurting performance
Horus-Anhur:

Again. Zen3 APUs are not Zen4 APUs. And what you suggest, would imply 2 IFs, one for the CPU part, and another for the iGPU. That does not seem reasonable.
You keep saying that Zen3 APUs are not Zen4 APUs, which, while true, completely ignores the fact that they are extremely similar, especially Rembrandt. The main difference between Zen4 and Zen3 is the faster datapath and AVX-512 support, and that's about it. We have at least 3 AMD products with RDNA2 + DDR5 and different core architectures: Mendocino uses Zen2 + RDNA2/DDR5, Rembrandt uses Zen3 + RDNA2/DDR5, and then there is the IOD + Zen4 chiplet. Now we have the Phoenix APUs, which have Zen4 + RDNA3/DDR5. It's mix and match. If Rembrandt has this quirk, you can almost guarantee that the rest of them do too. Obviously anything short of actually running a memory benchmark on the new IGP is a guess, so if you're unsatisfied with an educated guess based on AMD's existing, very recent products, then wait for those benchmarks. All I can say is that I wouldn't assume that the IGP is affected by the fabric clock set in the BIOS, since the last 3 generations of APUs (Renoir, Cezanne, Rembrandt, etc.) aren't set up this way. (Another way to think about this is that the core complex has its own Infinity Fabric links to the UMC, as does the GPU, and that the data rate differs between them. You might expect that, given that halving your reads or writes would have a pretty negative effect on GPU performance, unlike the core complex. Note that in this situation Infinity Fabric coherency is maintained. Multiple Infinity Fabric links are used regularly in AMD's products, so this isn't that much of a stretch.)