When it comes to the graphics processing unit, things start to heat up. Even though the 58 billion transistors in AMD's new GPUs are not all on one die, the company claims they are the fastest ever built. AMD is quick to point out that these are the first gaming GPUs to use a chiplet architecture. Building on its experience developing chiplet-based Ryzen central processing units, AMD splits the RDNA 3 GPU into two key components: a graphics compute die (GCD) and memory cache dies (MCD).
Six MCDs surround the GCD on both cards. These are substantially smaller chips than the GCD, measuring 37 mm² each, and are built on TSMC's 6nm manufacturing node. Each of them has a 64-bit memory controller as well as second-generation Infinity Cache. All six MCDs are active on the RX 7900 XTX. On the RX 7900 XT, only five of them function. The sixth chip remains on the package, but AMD says it is there for manufacturing reasons (structural stability), which I assume means it helps the packaging process and possibly cooling. That means the 7900 XTX has 96MB of Infinity Cache, whereas the 7900 XT has 80MB.
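The cache figures above follow directly from the number of active MCDs. Here is a minimal sketch of that arithmetic, assuming 16MB of Infinity Cache per MCD (96MB divided over six chips) and the 64-bit controller per MCD mentioned above. The function and constant names are made up for illustration, and the bus width it prints for the XT is simply what the arithmetic implies, not a figure quoted here.

```python
# Per-MCD figures: 64-bit controller per MCD; 16 MB assumes 96 MB / 6 MCDs.
MCD_CACHE_MB = 16
MCD_BUS_BITS = 64


def memory_subsystem(active_mcds):
    """Return total Infinity Cache and bus width for a number of active MCDs."""
    return {
        "infinity_cache_mb": active_mcds * MCD_CACHE_MB,
        "bus_width_bits": active_mcds * MCD_BUS_BITS,
    }


xtx = memory_subsystem(6)  # RX 7900 XTX: all six MCDs active
xt = memory_subsystem(5)   # RX 7900 XT: five active, one structural dummy

print("RX 7900 XTX:", xtx)
print("RX 7900 XT: ", xt)
```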
Navi 31 XTX with its new chiplet design
Radeon DNA 3 (RDNA 3)
The Navi 31 GPU has made numerous appearances in leaks and rumours over the past year. This GPU is made on a 5nm process and has up to 12,288 shader processors and 96MB of Infinity Cache. On the Radeon RX 7900 XTX, it drives a 384-bit memory bus with up to 24GB of GDDR6 running at 20 Gbps. AMD now touts multiple clock domains: RDNA 3 decouples the shader clock from the front-end clock. According to the company's developers, the front-end was limiting gaming performance more than the shaders themselves. Splitting the two lets the front-end run faster, at up to 2.5 GHz on the XTX, while the shaders run at a slower 2.3 GHz, which helps save energy. According to AMD, this amounts to a 15% frequency increase while preserving a 25% power savings.
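As a rough back-of-the-envelope check, the bus width and memory speed quoted above give the theoretical peak memory bandwidth, and the two clock figures give the gap between the front-end and the shaders. The helper name below is hypothetical and the result is a theoretical peak, not a measured number.

```python
def peak_bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Theoretical peak bandwidth in GB/s: pins * Gbit/s per pin / 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8


# RX 7900 XTX figures from the text: 384-bit bus, 20 Gbps GDDR6.
xtx_bandwidth = peak_bandwidth_gb_s(384, 20)

# Clock domains from the text: 2.5 GHz front-end, 2.3 GHz shaders.
front_end_ghz = 2.5
shader_ghz = 2.3
front_end_lead = front_end_ghz / shader_ghz - 1  # fraction by which the front-end runs faster

print(f"Peak memory bandwidth: {xtx_bandwidth:.0f} GB/s")
print(f"Front-end clock lead over shaders: {front_end_lead:.1%}")
```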
The company will launch its high-end series based on the Navi 31 GPU first, with Navi 32 and 33 following later. AMD's RDNA 3 GPUs are organised in WGPs (Work Group Processors) rather than individual CUs (Compute Units). Each WGP has two CUs, but each CU has four SIMD32 clusters instead of the two per CU in RDNA 2. There are also AI units present; though we lack full information, these seem to be Tensor core equivalents.
The RDNA 3 architecture of the AMD Navi 31 GPU has one GCD (graphics compute die) built on dual-issue SIMD units, effectively doubling the shader count. The RX 7900 XTX features 96 CUs, each with 64 dual-issue processors. These processors can issue two instructions down each data path, resulting in twice the instruction issue rate of RDNA 2. That does not make the GPU twice as fast; it only means instructions can be issued at twice the rate. Each CU also includes a pair of AI accelerators for operations like matrix multiplication, as well as a second-generation ray tracing accelerator. This RT accelerator, according to AMD, supports additional instructions and ray box sorting, resulting in up to 50% improved ray tracing performance per CU. Finally, each CU includes a Vector General Purpose Register (VGPR) file, which is essentially where the data that the CU's instructions operate on is stored.
This gives the fully enabled GPU 12,288 shader processors, also called stream processors. Compared to the 5,120 SPs on the Navi 21 GPU, that is 2.4 times as many cores. The Navi 31 GPU also has six MCDs (memory cache dies, aka Infinity Cache chips), each of which has 16MB of Infinity Cache and a 64-bit (2 x 32-bit) memory controller, which gives the chip a 384-bit bus interface. Those are the chiplets: one graphics die, surrounded by Infinity Cache memory dies with their controllers.
The next-generation ray tracing cores should make up for the ray tracing performance the 6000 series lacked. Per CU, AMD claims 50% more performance.
Also updated is the video or media engine, which will support simultaneous AVC/HEVC encode and decode, AV1 8K60 encode/decode and AI-enhanced video encode. On the output front, DisplayPort 2.1 is supported with a display link bandwidth of up to 54 Gbps. But let's talk about the cards for a second. Next page, please.
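The shader count above is just the product of the CU count, the lanes per CU and the dual-issue factor. A small sketch of that calculation, using only the figures quoted in the text (the function name and parameter names are illustrative):

```python
def shader_count(compute_units, lanes_per_cu=64, issue_width=2):
    """Shader (stream) processors as counted here: CUs * lanes * dual-issue factor."""
    return compute_units * lanes_per_cu * issue_width


navi31_shaders = shader_count(96)  # fully enabled Navi 31 (RX 7900 XTX)
navi21_shaders = 5120              # Navi 21 figure quoted in the text

ratio = navi31_shaders / navi21_shaders

print(f"Navi 31 shader processors: {navi31_shaders}")
print(f"Ratio versus Navi 21: {ratio:.1f}x")
```

Keep in mind that half of that count comes from dual-issue, so the 2.4x figure describes peak issue capacity rather than expected performance scaling.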