Nvidia might be moving to Multi-Chip-Module GPU design
With Moore's law becoming more difficult to sustain each year, technology is bound to change. At some point it will become impossible to shrink transistors any further, which is why companies like Nvidia are already thinking about new methodologies and technologies to adapt. Meet the Multi-Chip-Module GPU design.
Nvidia published a paper that shows how it can connect multiple parts (GPU modules) with an interconnect. According to the research, this will allow for bigger GPUs with more processing power. Not only would this help tackle the common scaling problems, it would also be cheaper to achieve, as fabbing four smaller dies and connecting them is cheaper than making one huge monolithic design.
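To illustrate why several smaller dies are cheaper per usable chip, here is a minimal back-of-the-envelope sketch using a simple Poisson yield model; the defect density and die areas below are assumed values for illustration only, not figures from Nvidia's paper.

```python
import math

defect_density = 0.1                 # defects per cm^2 (assumed)
monolithic_area = 8.0                # cm^2, one large die (assumed, ~800 mm^2)
module_area = monolithic_area / 4    # four smaller GPU modules of equal total area

# Simple Poisson yield model: yield = exp(-defect_density * die_area)
yield_monolithic = math.exp(-defect_density * monolithic_area)
yield_module = math.exp(-defect_density * module_area)

# Higher per-module yield means less wasted silicon per good chip.
print(f"Monolithic die yield: {yield_monolithic:.1%}")  # ~44.9%
print(f"Single module yield:  {yield_module:.1%}")      # ~81.9%
```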
Thinking about it, AMD is doing exactly this with its Threadripper and EPYC processors, where it basically connects two to four Summit Ridge (Zen) dies with a wide link, Infinity Fabric (using 64 PCIe lanes per link, with 128 available).
As an example, for a GPU built from four GPU modules, the researchers recommend three architecture optimizations that minimize the losses from data communication between the different modules. According to the paper, the performance loss compared to a monolithic single-die chip would be merely 10%.
Of course, when you think about it, SLI is in essence already a similar methodology (not technology); however, as you guys know, it can be rather inefficient and challenging in terms of scaling and compatibility. The paper states this MCM design would perform 26.8% better than a comparable multi-GPU solution. If and when Nvidia will actually fab MCM-based GPUs is not known; for now this is just a paper on the topic. The fact that they published it, however, indicates it is bound to happen at some point in time.
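As a quick sanity check on how those two figures relate, the arithmetic below normalises the hypothetical monolithic die to 1.0; it assumes both percentages refer to the same workloads, which the summary above does not spell out.

```python
# Relative performance implied by the two figures quoted above, with a
# hypothetical monolithic single-die GPU normalised to 1.0.
monolithic = 1.00
mcm = monolithic * (1 - 0.10)   # MCM loses roughly 10% vs. the monolithic die
multi_gpu = mcm / 1.268         # MCM is said to be 26.8% faster than multi-GPU

print(f"Monolithic: {monolithic:.2f}")   # 1.00
print(f"MCM:        {mcm:.2f}")          # 0.90
print(f"Multi-GPU:  {multi_gpu:.2f}")    # ~0.71
```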
Senior Member
Posts: 226
Joined: 2015-01-28
I think game devs are waiting until someone does their hardest jobs for them.
Like Microsoft doing DirectX for "them". Like Nvidia "helping" with GameWorks.
Like Unreal Engine providing the engines, etc. That's why I actually call game developers lazy.

Multithreading in GPU tasks makes sense. CPU architectures already work on multitasking.
Senior Member
Posts: 298
Joined: 2011-03-10
They were right!
Senior Member
Posts: 14092
Joined: 2004-05-16
I think game devs are waiting until someone does their hardest jobs for them.
Like Microsoft doing DirectX for "them". Like Nvidia "helping" with GameWorks.
Like Unreal Engine providing the engines, etc. That's why I actually call game developers lazy.

Multithreading in GPU tasks makes sense. CPU architectures already work on multitasking.
Game developers are not lazy. During crunch time on a game, they often work 6-7 days a week for 12+ hours a day for several months.
http://kotaku.com/crunch-time-why-game-developers-work-such-insane-hours-1704744577
That's not to mention that there are only a handful of developers with the knowledge to develop low-level APIs, low-level engine systems, network code that can scale across thousands of servers and clients, etc. They don't teach any of that stuff in game design school - they teach you things like Lua scripting and some C++/Java. The really good work comes from people specialized in certain fields, for example network engineering, who happen to take an interest in gaming.
So when Unreal developers, or Nvidia with GameWorks, or AMD with GPUOpen come in and build out a bunch of libraries for developers, it's extremely helpful. It shouldn't reflect poorly on the developers who utilize it.
Honestly, the series of videos that Star Citizen has been putting out lately provides excellent insight into what it takes to build and scale a game out over multiple studios. They show how they have to build a production pipeline with a few extremely talented people before they think about hiring a mass of artists and designers for content check-in. Just scheduling and bringing new hires up to speed on the engine, scripting, and the design of races/ships/etc. takes months.
I would argue that the level of production/talent/work in modern AAA games probably exceeds what most big budget movie studios are doing.
Senior Member
Posts: 390
Joined: 2017-06-09
Intel seems to be slowly heading in this direction too. AMD woke the sleeping giant. The current CPUs coming out are old tech.
I'm very interested to see what's coming in 2 years when we're on 10/7nm tech.
I'll still get a Volta Titan or Ti when it's out, but I'm guessing the first card to use this tech will be whatever comes after Volta. Exciting times coming in 2020...
x86 CPUs haven't changed much, but there are still extremely advanced SPARC, and now ARM, CPUs being made. The SPARC M7 and SPARC XIfx come to mind as two of the most advanced CPUs around, and we are seeing huge performance increases from generation to generation in architectures other than x86. NEC is even announcing a new vector CPU today.
Memory bandwidth and capacity, as well as data locality, are going to be the next big things to focus on, because shrinking transistors isn't what it used to be. Going smaller isn't all positives any more.
Senior Member
Posts: 2488
Joined: 2016-01-29
I'm pretty sure Infinity Fabric uses PCIe lanes for communication; maybe it can use other transports as well.
Between the CPUs on an EPYC chip there are 64 PCIe lanes going between each CPU, if I read the slides correctly.
They can cut the latencies thanks to the short hops between the on-chip CPUs, and the bandwidth should be plenty.
A GPU that has 2x16 PCIe lanes could use the second set for intra-GPU signalling. Ideally you'd want four sets, like the North/South/East/West links on those DEC Alpha chips. That way, each GPU die would be only one hop from any other, up to a certain number of dies.
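To make the one-hop argument concrete, here is a tiny sketch of the topology math; it is illustrative only and not based on any disclosed Nvidia or AMD design.

```python
# With k point-to-point links per die, at most k + 1 dies can be fully
# connected so that every die reaches every other die in a single hop.
# Illustrative topology math only, not any vendor's actual design.

def max_fully_connected_dies(links_per_die: int) -> int:
    """Largest package in which each die has a direct link to every other die."""
    return links_per_die + 1

for links in (1, 2, 4):
    print(f"{links} link(s) per die -> up to {max_fully_connected_dies(links)} dies at one hop")

# Four links per die (the North/South/East/West layout mentioned above) covers
# a five-die package at single-hop latency; larger packages need multi-hop routing.
```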
The I.F. protocol can be implemented over different kinds of links. The dies in the EPYC and Threadripper MCMs are connected to each other over GMI links in the package (~42 GB/s bidirectional per link, with each Zeppelin die having 4 GMI controllers), which are independent of the PCIe controllers. I.F. runs over PCIe lanes only in the dual-socket configuration of EPYC; in that configuration it is known as xGMI.
A GPU MCM by AMD would probably use the same or similar GMI controllers.
It's possible that Vega already has these, since not much is known about the die and AMD has stated they are using I.F. on Vega.
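For a rough sense of the link budget being discussed, the arithmetic below uses only the figures cited in this thread (~42 GB/s per GMI link, four GMI controllers per Zeppelin die); the comparison value assumes a standard PCIe 3.0 x16 link at roughly 16 GB/s per direction.

```python
# Aggregate die-to-die bandwidth based on the figures cited in this thread.
gmi_link_bw = 42        # GB/s bidirectional per GMI link (figure cited above)
links_per_die = 4       # GMI controllers per Zeppelin die (figure cited above)
pcie3_x16_bw = 16       # GB/s per direction for PCIe 3.0 x16 (approximate)

aggregate_per_die = gmi_link_bw * links_per_die
print(f"Aggregate GMI bandwidth per die: ~{aggregate_per_die} GB/s")   # ~168 GB/s
print(f"PCIe 3.0 x16 for comparison:     ~{pcie3_x16_bw} GB/s per direction")
```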