Rumor: AMD Epyc2 processors could get 64 cores over 8+1 dies

tunejunky, replying to azraei97, who wrote: "For now, AMD still can't make a multi-chip GPU work the way it does in their CPUs; AMD has already talked about this, and Navi is also unlikely to get a multi-chip design: https://www.pcgamesn.com/amd-navi-monolithic-gpu-design"

lol, not accurate at this time. Right now there are engineering samples in Sunnyvale. Navi isn't this, Vega isn't this; we're talking Arcturus, and while it's nowhere near ready for the market, it's getting the bugs worked out (i.e. module finalization, controller ops, and memory matching). And btw... there's going to be a super sweet Ryzen/Navi SoC for laptops that will bring serious gaming down in price by several hundred dollars compared to the i9/2070 (Alienware) laptops about to drop.

Fox2232, replying to tunejunky:

I have to agree. AMD had MCM GPUs a long time ago, but they are not feasible for the whole market. It works for compute, but there is no sufficient benefit outside of compute, and compute itself does not need MCM, as it does not overcome any particular problem. If AMD had very power-efficient GPUs and even the biggest chip they made could not reach the 300W PCIe limit, then MCM would allow them to saturate the performance and power budget per card.

tunejunky, replying to Fox2232:

Power delivery and regulation are more difficult than you would normally think at 7nm. Nothing insurmountable, but the initial run will have a higher cost, associated more with beefier boards than with the 7nm process itself. The expense of MCM is negligible at 7nm given the higher associated yields and lower costs. The freight of the new process is being carried by the iPhone/iPad Bionic A12 processor. Right now TSMC is hitting economies of scale... six months early... which is why there are already engineering samples of Navi, early samples of Arcturus, and Ryzen 2 in Sunnyvale today.

waltc3, replying to Picolete, who wrote: "The controller is rumored to be 14nm because it's bigger than the other chips, and the 7nm process is not yet mature enough to get decent yields on such a large chip."

Well, my point was only that the controller at 7nm would be nowhere near as large as it is at 14nm, and at that point the physical bus connections to the smaller 7nm CPU dies would be problematic, per the diagram. As far as yields go, I'm fairly certain that each 7nm CPU die (8 cores/16 threads) as pictured is more complex circuit-wise than the system controller, so if the CPU dies come in at a good yield at 7nm, then the system controller, being less complex, would also yield well at 7nm, maybe even better than the CPU dies. However, as I mentioned, using the 14nm system controller would be less expensive than one at 7nm, and if the diagram is true to scale, then a 14nm system controller would better facilitate the physical connection of the CPU dies, again per the pictured diagram. Just some idle speculation about the physical bus layout of the CPU dies connecting to the system controller, meh... ;) I'm probably all wet!

Evildead666:

One thing is that you would probably have to use edge interposers, as a full interposer would probably be too big. Then again, as per the diagrams, it would be possible to do it on an organic substrate. Not sure what the interconnect speed impact would be compared to an interposer, even if it's only small. Edit: one thing I just thought of is that interposers allow very large buses, i.e. 1024-4096 bits wide, usually for HBM, whereas an organic substrate would allow a lot less? Like the 512-bit bus of GDDR-type cards?
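To make the bus-width trade-off above concrete, here is a small back-of-the-envelope sketch (not from the thread): peak bandwidth is roughly bus width times per-pin data rate, so a very wide interposer-style bus can run each pin slowly, while a narrower organic-substrate bus has to clock each pin faster. The per-pin rates below are illustrative assumptions in the ballpark of first-generation HBM and 512-bit GDDR5 cards, not figures anyone in the thread gave.

```python
# Peak DRAM bandwidth, GB/s = (bus width in bits / 8) * per-pin data rate in Gbps
def peak_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s for a bus of the given width and per-pin rate."""
    return bus_width_bits / 8 * data_rate_gbps

# Wide-but-slow bus on a silicon interposer (roughly Fury-like HBM1 numbers)
print(peak_bandwidth_gbs(4096, 1.0))   # 512.0 GB/s from a 4096-bit bus at 1 Gbps/pin
# Narrow-but-fast bus on an organic substrate (roughly 512-bit GDDR5 numbers)
print(peak_bandwidth_gbs(512, 6.0))    # 384.0 GB/s from a 512-bit bus at 6 Gbps/pin
```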

xrodney, replying to waltc3:

You correctly mention that the die could end up too small to accommodate all the external connections, but there is more to it. Not all chip parts scale the same: compute cores and cache scale very well, while memory controllers and buses are known to scale badly and are also more sensitive to defects. Getting a defect in a compute core or its local cache is actually the best case here, as you can simply turn that part off and sell the chip as a lower SKU; but if you get a defect anywhere in the uncore (controller), it's pretty much game over for that chip, unless it hits a part that is doubled (doubling certain parts, like long pathways, is a common way of increasing yields).
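The yield argument above can be illustrated with the simple Poisson yield model Y = exp(-A*D): the probability of a defect-free die falls off quickly with area, and a defect is only fatal if it lands in a part that cannot be fused off. The defect density, die areas and uncore share below are made-up illustrative numbers, not anything from AMD or the article.

```python
import math

def poisson_yield(area_mm2: float, defects_per_mm2: float) -> float:
    """Poisson yield model: probability that a die of this area has zero defects."""
    return math.exp(-area_mm2 * defects_per_mm2)

D0 = 0.002            # assumed defect density, defects per mm^2 (illustrative)
monolithic = 400.0    # hypothetical large monolithic die, mm^2
chiplet = 75.0        # hypothetical 8-core chiplet, mm^2
uncore_share = 0.2    # assumed fraction of the chiplet that cannot be fused off

print(f"monolithic {monolithic:.0f} mm^2 die, defect-free: {poisson_yield(monolithic, D0):.1%}")
print(f"chiplet {chiplet:.0f} mm^2, defect-free:           {poisson_yield(chiplet, D0):.1%}")

# A defect in a core or its cache can usually be fused off and the part sold as a
# lower SKU, so only a defect in the uncore portion actually kills the chiplet.
print(f"chiplet with a defect-free uncore (sellable):      "
      f"{poisson_yield(chiplet * uncore_share, D0):.1%}")
```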

Luc, replying to xrodney:

Maybe the solution will be an uncore controller with doubled parts for Epyc, which can be disabled (or faulty) and reused for Threadripper and Ryzen to save costs. Anyway, the interposer is still expensive on the final Vega product; they need to make it cheaper first. Maybe connecting one CCX to a controller chiplet is easy enough, while leaving the more complex substrate for servers that are willing to pay for it. But Zen2 is already done, and we won't see these 8+1 solutions until Zen3 cores and a new AM5 socket in 2020... I don't know, but it's fun to speculate 😛 Edit: Intel has already talked about binding together chiplets built on different lithographies.

xrodney, replying to Luc:

That's funny, because nobody except AMD knows if and how they will do a Zen2-based 64-core Epyc. Unless you are a telepath or a pythoness who can see the future, I would hold back on definitive statements about CPUs nobody knows the details of. A Zen2-based Epyc may or may not use an uncore die with core chiplets; we simply do not know yet, but it's something AMD has in its plans, and it would be the best way to increase yields, reduce latencies and get rid of NUMA. As for the interposer, an Epyc-sized interposer can cost maybe $30, and for a 1+1 chip it might be less than $10. The interposer was not the reason for Vega's higher cost; that's the GPU itself and the HBM. The only reason for AM5 would be if AMD wants to bring DDR5 or PCIe gen 4, but... PCIe gen 4 was mentioned to be short-lived (likely used just in the enterprise market) and soon to be replaced by gen 5, and for DDR5 it's way too early. They also promised that AM4 will stay for at least 5 years.

tunejunky, replying to Evildead666:

We're talking "Infinity Fabric" (tm) as the interposer. Nothing currently on the market comes close, especially in latency. Organic substrates are not proven at this scale of production and offer too many variables. "Infinity Fabric" is going to change all of our lives... for real, no hyperbole, over the course of the next five years.

Evildead666, replying to xrodney:

PCIe 4 will be arriving, and hanging around; there is no reason for it to be short-lived. PCIe 4 can be done on AM4(+?): the number of pins doesn't change, and it will be backward compatible with at least PCIe 3, if not all the way back. For gen 5 there is no info at all about even engineering samples being ready or tested; it will probably be "ready" and hanging around waiting for PCIe 4 to give up its seat 😉

In Epyc, the CPU-to-CPU Infinity Fabric runs over 64 PCIe 3 lanes. Pop that to PCIe 4 and you can do the same with 32 lanes, or double the bandwidth. Just for that it's worth it. (I agree the same could be said for going to PCIe 5, but it's not even close.)

I'm pretty sure the interposer isn't that cheap 😉 and it's a pain to create at that size, since even with Vega it was getting to the maximum limits of the single-interposer design. Edge interposers would be sweet, but Intel owns the IP. I expect AMD could get hold of the IP at some cost, or with an exchange of IP with Intel. That's a big maybe. It all adds to the final cost though, and we all know that an extra $1 on the cost side is going to cost the consumer waaaaaay more than that $1. DDR5 is quite a way away, I agree.
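A quick sketch of the lane arithmetic in the post above: per-lane throughput roughly doubles each PCIe generation (8, 16 and 32 GT/s for gen 3, 4 and 5, all using 128b/130b encoding), so the same aggregate link needs half the lanes, or the same lane count carries twice the bandwidth. The script below is just that arithmetic; the 64-lane figure is the one quoted in the post, and the GB/s values are per direction and approximate.

```python
# Per-lane PCIe throughput in GB/s (one direction), 128b/130b line code for gen 3/4/5.
GT_PER_S = {"gen3": 8, "gen4": 16, "gen5": 32}

def lane_gbs(gen: str) -> float:
    """Usable per-lane bandwidth in GB/s after 128b/130b encoding overhead."""
    return GT_PER_S[gen] * (128 / 130) / 8

def link_gbs(gen: str, lanes: int) -> float:
    return lane_gbs(gen) * lanes

print(f"64 gen3 lanes: {link_gbs('gen3', 64):.0f} GB/s")  # the link width quoted above
print(f"32 gen4 lanes: {link_gbs('gen4', 32):.0f} GB/s")  # same bandwidth, half the lanes
print(f"64 gen4 lanes: {link_gbs('gen4', 64):.0f} GB/s")  # or double the bandwidth
print(f"64 gen5 lanes: {link_gbs('gen5', 64):.0f} GB/s")  # gen 5 would double it again
```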

tunejunky:

Again, Infinity Fabric is the best interposer in existence for manufacturing. Intel isn't pursuing edge interposers the way they are running after Infinity Fabric IP; their single biggest CPU research effort ATM is how to duplicate Infinity Fabric without stepping on AMD patents (especially given past history). And guess what... there's an "Infinity Fabric 2" being worked on as we speak, though it's more like Infinity Fabric+.

Luc, replying to Evildead666:

I read an interview where someone from AMD was asked about there being so few PCIe lanes between the socket and the X370 chipset, and they answered that they will double the bandwidth in the near future with an upgrade to PCIe 4.0 using the same number of lanes. So it would happen in the next release(s)...

xrodney, replying to Evildead666:

Sure, but the point is that except for IBM POWER9 nobody uses PCIe gen 4, and gen 5 is more or less complete and will be released next year; that's before we see any PCIe gen 4 motherboards outside the enterprise segment. As for the interposer, it was said that the one used on Fury cost $25 to make. Plus, unless you want to add something like HBM to Epyc, it might not even need a silicon interposer, as you probably don't need that many connections between core and uncore.

Aura89, replying to xrodney:

Being "standardized" and actually released to products are different things. PCI-E Gen 5 should be standardized in 2019. PCI-E Gen 4 was standardized in 2017. So if you expect companies to start making products in 2019 with PCI-E Gen 5, even though we have yet to get PCI-E Gen 4 items, i'm not sure where the logic is there. Now i do expect it to be short lived relatively thinking, i expect to there to be 2-3 years of PCI-E Gen 4 products before Gen 5, which is much less then the 6-8 years or so that PCI-E Gen 3 has lasted.

xrodney, replying to Aura89:

All previous PCIe standards were used in products within a year of release, and companies actually start working on them once the spec reaches version 0.7, which is where gen 5 already stands right now. There are unlikely to be any functional changes left; just small tweaks and ensuring backward compatibility are where it stands atm. AMD itself said that they will bring the new PCIe in 2020, so there will be no gen 4 in 2019, at least not on desktop. So unless gen 5 is extremely hard or expensive to implement, it does not make sense to use gen 4 when gen 5 is already out. It might make sense to make new motherboards that last one year before they bring something else (they do it all the time), but I don't think AMD would want to go this way, and they wouldn't want to wait several years to replace gen 4 with gen 5 either. So it all depends, but AMD could gain a lot from implementing gen 5 for their IF (internal/external core-to-core bandwidth increase and latency reduction), as they are using the same links for either PCIe or IF functionality.

Aura89, replying to xrodney:

Why? There's nothing stopping AMD from having CPUs that work with both PCI-Express 3.0 and 4.0. The only reason I can think of that you would say it wouldn't be on desktop is the idea that a new CPU would require a new socket so soon. There's nothing stopping AMD from creating Zen2 CPUs with an x500 chipset and PCI-Express 4.0 support while also allowing the CPU to work in x300 and x400 chipsets as well.

xrodney, replying to Aura89:

I am not talking about Gen 3. I am talking about Gen 4 and Gen 5, and why it would make more sense to skip Gen 4 and go for Gen 5 directly (if the cost of implementation is not prohibitive) instead of doing both soon after each other. And as I mentioned, AMD stated that they are bringing the new PCIe in 2020, so... unless something has changed, it's not coming in 2019 with Zen2.

Aura89, replying to xrodney:

I don't know where you keep getting this information. Everything states that AMD has targeted PCI-Express 4.0 for 2020. Again, your logic that PCI-Express 5.0 will be here so quickly, when it has taken this long for PCI-Express 4.0 to come out, doesn't make sense. Do you have some sort of inside knowledge as to why PCI-Express 4.0 was released almost 2 years ago but has not come out for public use? Do you have some sort of inside knowledge that proves PCI-Express 4.0 is harder to implement than 5.0? Where is your logic coming from? What could possibly be so hard about implementing 4.0, yet so easy and fast about implementing 5.0?

Realistically, if we want to go on speculation, it looks like AMD, from the pieces of information here and there, may release a GPU in 2020 with PCI-Express 4.0 support. Now, if you release a GPU, you'd likely want to have boards already available to use it, right? So it's not unlikely that PCI-Express 4.0 will be in the next AMD chipset with Zen2, but since there won't be any products until 2020 to actually make use of it, the "whole package" will not be here until 2020. Or they could wait on both until 2020, and Zen2+/3 and their respective chipsets will have 4.0 support. But all that's speculation. The only thing we have been told is that AMD will bring PCI-Express 4.0 in 2020, nothing about 5.0.