AMD is looking into chiplets to connect future CPUs and GPUs

And so the cycle repeats...
1) The first computers were made out of individual components (vacuum tubes and relays, then transistors).
2) Many of these transistors eventually got integrated into "integrated circuits", which live on to this day (a CPU or GPU is still an IC).
3) Eventually we learned to put so much on a single IC that it became the computer itself (System-on-a-Chip, or SoC).
4) But we're reaching the limit of how much we can put on a single chip, so it's getting split into these "chiplets".
Future:
5) Eventually, with new technology (beyond silicon), it will be possible to put so many "chiplets" on one interposer that it becomes an "integrated chiplet array" or some weird name like that, no different from one IC of today.
6) Even further into the future*, we'll be able to stack chiplets and cooling inside the same "array", resulting in gigantic computing power in the shape of a cube of sorts, cooled by water flowing through the chip itself.
* IBM has already experimented with this, but it was never produced on a large scale (too expensive).
Exciting times ahead, for sure 😉
rav555:

This is actually rather old news. Last year AMD published a White Paper along with GH Loh and others titled: "Design and Analysis of an APU for Exascale Computing". It goes on in detail to describe the Multi Chip Module for the Exascale Node Architecture. A pdf can be found here: http://www.computermachines.org/joe/publications/pdfs/hpca2017_exascale_apu.pdf
Nice way to make your first post.
Future looks bright. [youtube=u5s0I_SXDpQ]
rav555:

This is actually rather old news. Last year AMD published a White Paper along with GH Loh and others titled: "Design and Analysis of an APU for Exascale Computing". It goes on in detail to describe the Multi Chip Module for the Exascale Node Architecture. A pdf can be found here: http://www.computermachines.org/joe/publications/pdfs/hpca2017_exascale_apu.pdf
The news refers to resolving the problem of deadlocks. Anyway, a very interesting article. It becomes clear why AMD invested heavily in HSA, HBM and interposers. All these (and several other) AMD technologies converge in the chiplets.
AMD seems to be playing the long game, while Intel wallows in the mess they have created by being complacent due to no competition.
I had a feeling this would happen as soon as AMD started using an interposer on Fury.
I was talking about this years ago. And then again and again. Finally the road is open. A computer the size of a 2.5'' HDD, just a bit thicker due to cooling. Powerful computers the size of today's graphics cards. There is just a need for new form factor standards.
This sounds very interesting, and the good thing is the small overall size with all that performance. However, I'm very curious to see how on earth they will cool this big slice of silicon... Vega is already showing how hard it is to cool a GPU with HBM... Imagine a CPU on the same slice of silicon... hmmmm....
RooiKreef:

This sounds very interesting, and the good thing is the small overall size with all that performance. However, I'm very curious to see how on earth they will cool this big slice of silicon... Vega is already showing how hard it is to cool a GPU with HBM... Imagine a CPU on the same slice of silicon... hmmmm....
I think the issue here is stacking, but honestly, they should just come up with something new. Maybe AIO cooling will become standard down the road, or maybe they'd have better chances with phase-change cooling, but yeah, putting stuff together in a tighter space never helps cooling.
fantaskarsef:

I think the issue here is stacking, but honestly, they should just come up with something new. Maybe AIO cooling will become standard down the road, or maybe they'd have better chances with phase-change cooling, but yeah, putting stuff together in a tighter space never helps cooling.
Yea, like you say AIO is definitely the way to go with something like this. Especially in the high performance class.
RooiKreef:

This sounds very interesting, and the good thing is the small overall size with all that performance. However, I'm very curious to see how on earth they will cool this big slice of silicon... Vega is already showing how hard it is to cool a GPU with HBM... Imagine a CPU on the same slice of silicon... hmmmm....
Guys, please learn a bit of electrical engineering. You can have a Vega 64 drawing 450W or 120W. It is all a question of power targets, achievable clocks and voltages. (If you have a hard time imagining it, look at the Raven Ridge APUs: the 2400G and 2700U are the same chip, just configured differently, and that thing ranges from 15 to 95W under full load just by changing clock and voltage targets.) MCM designs were always about getting away from large single dies, because those always hit the wall of achievable die size and are then forced to run higher clocks => higher power draw. MCM is about getting more performance at the cost of using more total silicon area than a monolithic chip, and clocking lower to hit a similar power target. This costs more as a result, but you can get much more performance per watt. Imagine a 250W 1080 Ti. Now imagine 2x 1080 Ti, each at 125W. While in gaming the improvement may be questionable (it depends on SLI/CF support), in compute you easily get 50% more performance per watt. On the other hand, if a large die has an unreasonable defect rate, building 4 times as many dies of 1/4 the size makes the final product cheaper. On AMD's GPU side, I liked their building blocks because it would not be that hard to run them as separate dies on an interposer while behaving as one single GPU. Unfortunately, AMD had no reason to go that route with GCN, because they managed to make a single-die GPU with 4096 SPs, which was the original GCN limit. (I think it is a bit of a sad limit, since the first GCN GPU already came with 2048 SPs.)
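To put rough numbers on the defect-rate point in the post above, here is a minimal sketch using the simple Poisson yield model, Y = exp(-area x defect density). The defect density (0.2 per cm²) and total die area (6 cm²) are made-up illustrative values, not figures from the thread or from any real process, and the sketch ignores packaging, interposer and wafer-edge costs.

```python
import math

def poisson_yield(die_area_cm2, defects_per_cm2):
    """Fraction of dies with zero defects under the simple Poisson model."""
    return math.exp(-die_area_cm2 * defects_per_cm2)

def silicon_per_good_product(total_area_cm2, n_dies, defects_per_cm2):
    """Wafer area (cm^2) consumed, on average, per working product.

    The product is built from n_dies equal dies that together add up to
    total_area_cm2. Bad dies are discarded individually, so each good die
    costs die_area / yield of silicon.
    """
    die_area = total_area_cm2 / n_dies
    return n_dies * die_area / poisson_yield(die_area, defects_per_cm2)

# Hypothetical inputs: 6 cm^2 of logic, 0.2 defects per cm^2.
monolithic = silicon_per_good_product(6.0, 1, 0.2)
chiplets = silicon_per_good_product(6.0, 4, 0.2)
print(f"monolithic: {monolithic:.1f} cm^2 per good product")
print(f"4 chiplets: {chiplets:.1f} cm^2 per good product")
print(f"ratio: {monolithic / chiplets:.2f}x")
```

With these assumed inputs the four-chiplet product needs roughly 2.5 times less silicon per working unit. The advantage shrinks as defect density falls, which is why the argument only applies when the large die has a poor defect rate.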
Fox2232:

Guys, please learn a bit of electrical engineering. You can have a Vega 64 drawing 450W or 120W. It is all a question of power targets, achievable clocks and voltages. (If you have a hard time imagining it, look at the Raven Ridge APUs: the 2400G and 2700U are the same chip, just configured differently, and that thing ranges from 15 to 95W under full load just by changing clock and voltage targets.) MCM designs were always about getting away from large single dies, because those always hit the wall of achievable die size and are then forced to run higher clocks => higher power draw. MCM is about getting more performance at the cost of using more total silicon area than a monolithic chip, and clocking lower to hit a similar power target. This costs more as a result, but you can get much more performance per watt. Imagine a 250W 1080 Ti. Now imagine 2x 1080 Ti, each at 125W. While in gaming the improvement may be questionable (it depends on SLI/CF support), in compute you easily get 50% more performance per watt. On the other hand, if a large die has an unreasonable defect rate, building 4 times as many dies of 1/4 the size makes the final product cheaper. On AMD's GPU side, I liked their building blocks because it would not be that hard to run them as separate dies on an interposer while behaving as one single GPU. Unfortunately, AMD had no reason to go that route with GCN, because they managed to make a single-die GPU with 4096 SPs, which was the original GCN limit. (I think it is a bit of a sad limit, since the first GCN GPU already came with 2048 SPs.)
Process nodes are getting smaller, with more transistors per mm² and stacked chips. Even with the reduction in TDP and wattage, at some point you can't just put a fan next to it to blow on it. Maybe not in our consumer PCs, but at some point they'll have to come up with something that works better than standard air cooling. I can't imagine a car's engine compartment has lower ambient temps than my living room, for instance, when you do automated driving in the summer months.
fantaskarsef:

Process nodes are getting smaller, with more transistors per mm² and stacked chips. Even with the reduction in TDP and wattage, at some point you can't just put a fan next to it to blow on it. Maybe not in our consumer PCs, but at some point they'll have to come up with something that works better than standard air cooling. I can't imagine a car's engine compartment has lower ambient temps than my living room, for instance, when you do automated driving in the summer months.
That's the thing. When you build a 5B-transistor chip on 28nm and it draws 300W, you can cool it easily. Then you shrink the same 5B chip to 14nm and clock it higher, to the point where it again draws 300W. You still manage to cool it somehow. But going to 7nm, you can no longer have the same 5B chip draw 300W, as the cost of cooling would be unacceptable. That's why you build four of those 5B 7nm chips and clock them so that together they draw 300W. Then you extract much more performance from them as a whole. It costs more to make, but that's how things are. At some point the way forward is power efficiency.
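The "four chips sharing one power budget" argument above can be sketched with the usual dynamic-power approximation, P ≈ C·V²·f. The sketch below additionally assumes that voltage has to scale roughly linearly with clock, which gives P ∝ f³, and that the workload splits perfectly across chips. Both are simplifying assumptions, not something stated in the thread, and real chips have a voltage floor and static leakage that this ignores.

```python
def relative_throughput(n_chips, exponent=3.0):
    """Throughput of n_chips sharing one fixed power budget.

    The result is relative to a single chip using the whole budget.
    Assumes per-chip power scales as clock ** exponent (exponent=3 follows
    from P ~ C * V^2 * f with V proportional to f) and that work divides
    perfectly across chips.
    """
    per_chip_power = 1.0 / n_chips                         # share of the budget
    per_chip_clock = per_chip_power ** (1.0 / exponent)    # clock it can afford
    return n_chips * per_chip_clock

for n in (1, 2, 4):
    print(f"{n} chip(s): {relative_throughput(n):.2f}x throughput at the same total power")
# 1 chip(s): 1.00x, 2 chip(s): 1.59x, 4 chip(s): 2.52x
```

Under these assumptions, four chips at a quarter of the power each deliver about 2.5 times the throughput of one chip at the full budget, and two chips at half power each deliver about 1.6 times, which is in the same range as the "50% more performance per watt" estimate given earlier in the thread.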
Fox2232:

That's the thing. When you build a 5B-transistor chip on 28nm and it draws 300W, you can cool it easily. Then you shrink the same 5B chip to 14nm and clock it higher, to the point where it again draws 300W. You still manage to cool it somehow. But going to 7nm, you can no longer have the same 5B chip draw 300W, as the cost of cooling would be unacceptable. That's why you build four of those 5B 7nm chips and clock them so that together they draw 300W. Then you extract much more performance from them as a whole. It costs more to make, but that's how things are. At some point the way forward is power efficiency.
Well said. Cheers.
fantaskarsef:

Process nodes are getting smaller, with more transistors per mm² and stacked chips. Even with the reduction in TDP and wattage, at some point you can't just put a fan next to it to blow on it. Maybe not in our consumer PCs, but at some point they'll have to come up with something that works better than standard air cooling. I can't imagine a car's engine compartment has lower ambient temps than my living room, for instance, when you do automated driving in the summer months.
A car's engine compartment gets insanely hot. A car engine is designed to operate in the temperature range of 195-225 Fahrenheit (90-107 Celsius), though. In the consumer market, liquid cooling is the best you're going to get for the foreseeable future. There were phase-change kits available in the past, but they were cost-prohibitive for most people. There was also the risk of releasing toxic gas into the air if the phase-change unit developed a leak. The units I've seen were all based on R-12 refrigerant. Unfortunately, neither R-134 nor R-134A has the cooling capacity of R-12, and they don't work well under the same conditions. An old R-12 based phase-change cooling system would have been relatively easy to hide in Intel's recent demo compared to the R-134A based unit they did use.... The new R-1234yf refrigerant would be a very poor candidate for phase-change cooling in a PC due to being flammable... So, for now, we're left with just liquid cooling. However, you could always get a mini-fridge to house your radiator, pump and fans and run tubing over to your computer....lol