Rumors Suggest Nvidia Next-Gen Blackwell GPUs Will Adopt a Multi-Chiplet Design

The problem with chiplet designs is how to make the interconnects between the separate parts efficient. If you can't solve that adequately, the performance lost by splitting your GPU into pieces and lengthening the paths between functional units outweighs the advantages of easier-to-build chiplets. Look at the process AMD went through with Zen to finally arrive at a chiplet design that was efficient enough to compete with Intel's monolithic designs, yet gave them the flexibility to scale up by using more chiplets. That took them basically three generations of Zen to get right. I guess that NVIDIA has been working on MCM designs behind the scenes for quite some time, and maybe with Blackwell they've finally got a design that is efficient enough to compete with the monolithic alternative. That, or they've truly run out of options for what they want to do with Blackwell in a monolithic design and have to switch to chiplets. In a way, the Grace/Hopper and Grace/Grace modules can be seen as a first attempt at MCM design, as they combine a Grace CPU and a Hopper GPU, or two Grace CPUs, inside a single module by directly connecting the separate dies.
Looking forward to more info 🙂
I feel like Nvidia would have an easier time doing chiplets than AMD considering Nvidia's got a lot more die space dedicated to specific features.
I believe chiplets are the future, as Moore's law is approaching its limit and soon they will run out of node shrinks.
It is the only future, but, good luck to the NVidia game driver team and game devs trying to fix problems with their existing games running on these! Going to be a nightmare I'd say.
geogan:

It is the only future, but, good luck to the NVidia game driver team and game devs trying to fix problems with their existing games running on these! Going to be a nightmare I'd say.
Exactly. I'm guessing that for NV, MCM will probably be more feasible for HPC, datacenter and AI-related SKUs than for gaming GPUs.
barbacot:

So he changed his mind... https://i.imgur.com/Dgxswad.png
I'm not saying he's right or wrong but Blackwell is a replacement for Hopper. ADA-Next is a replacement for ADA. They aren't (as far as we know) the same chip.
Denial:

I'm not saying he's right or wrong but Blackwell is a replacement for Hopper. ADA-Next is a replacement for ADA. They aren't (as far as we know) the same chip.
They're not. Even the A100 is technically an MCM design using the next-gen NVLink. The difference is that the A100 uses an approach more like Apple's, "stitching" together monolithic chips, while the Blackwell approach is like AMD's, which even Intel is moving towards due to the massive savings of a far higher-yield process. That approach also allows for die shrinks where applicable and larger (cheaper) nodes in the areas where a node shrink has less benefit.
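To put rough numbers on the yield argument above, here is a minimal sketch using the textbook Poisson yield model (yield = exp(-area × defect density)). The die area, defect density and chiplet count are made-up illustrative values, not figures for any real Nvidia or AMD part, and the helper names are hypothetical. It also assumes chiplets are tested before assembly, and it ignores packaging cost, interconnect area overhead and salvaging of partly defective dies.

```python
import math


def poisson_yield(die_area_mm2: float, defects_per_cm2: float) -> float:
    """Fraction of dies with zero defects under a simple Poisson model."""
    area_cm2 = die_area_mm2 / 100.0  # 100 mm^2 per cm^2
    return math.exp(-area_cm2 * defects_per_cm2)


def silicon_per_good_product(total_area_mm2: float, n_chiplets: int,
                             defects_per_cm2: float) -> float:
    """Average silicon (mm^2) fabricated per working product.

    Assumes the design is split into n equal chiplets that are tested
    individually, so only defective chiplets are thrown away.
    n_chiplets=1 is the monolithic case.
    """
    chiplet_area = total_area_mm2 / n_chiplets
    y = poisson_yield(chiplet_area, defects_per_cm2)
    # Each good chiplet costs chiplet_area / y on average; n are needed.
    return total_area_mm2 / y


if __name__ == "__main__":
    AREA = 600.0  # hypothetical total die area in mm^2
    D0 = 0.1      # hypothetical defect density in defects per cm^2
    for n in (1, 2, 4):
        cost = silicon_per_good_product(AREA, n, D0)
        print(f"{n} chiplet(s): {cost:.0f} mm^2 of silicon per good product")
```

Under these assumed numbers, the smaller dies waste noticeably less silicon per working product, which is the saving the post refers to. Whether that survives the added packaging and interconnect costs is exactly the open question in this thread.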
Interesting to see how each company goes about this type of solution. Intel has one in the works too, and this video shows AMD's recently released patent on the same thing. Navi4X was rumoured to have no high-end part, with only N43/4 being developed. But it plays right into AMD's previous strategy of using smaller dies and making them work together. [youtube=QWaUuMVpY6o] Extremely interesting time in tech.
Denial:

I'm not saying he's right or wrong but Blackwell is a replacement for Hopper. ADA-Next is a replacement for ADA. They aren't (as far as we know) the same chip.
https://wccftech.com/nvidia-geforce-rtx-50-blackwell-gpu-rumors-3nm-monolithic-faster-clocks-over-2x-faster-than-ada-rtx-40/ He was referring to Blackwell when he said "ADA next"... Anyway, this guy is throwing out rumor after rumor, and the laws of probability state that at some point he will be right :p
My biggest surprise with RDNA 3 at launch was that they didn't go to a larger socket (like TR or Epyc). If they had, they could've easily outdone the 4090, as their MCM design is truly scalable. Even the heatsink issue has long been resolved, and the market has little resistance to AIOs if they wanted to keep the cards to 2-3 slot widths.
tunejunky:

My biggest surprise with RDNA 3 at launch was that they didn't go to a larger socket (like TR or Epyc). If they had, they could've easily outdone the 4090, as their MCM design is truly scalable. Even the heatsink issue has long been resolved, and the market has little resistance to AIOs if they wanted to keep the cards to 2-3 slot widths.
Eh? Larger socket on a GPU? TR and Epyc are CPUs? What do you mean exactly? Do you mean a larger die?
CPC_RedDawn:

Eh? Larger socket on a GPU? TR and Epyc are CPUs? What do you mean exactly? Do you mean a larger die?
Yes, a larger die occupying a larger socket with a higher pin count, enabling a larger count of chiplets. And yes, TR and Epyc are CPUs. I bring them up because there is an existing infrastructure of suppliers (esp. wire bundling) that can be used immediately. Because AMD's designs are truly scalable, they can theoretically double the chiplet count of, say, a 7900 XTX while lowering the operating frequency (to manage power/heat) for vastly higher performance, using a socket size that has existing cooling solutions (to save end-user costs), esp. CLCs (to save mobo slots). Even air cooling has been shown to be effective, but I'm no fan of 4-slot solutions.