DirectX 12's Latest Update Offers Shared Pool of VRAM to CPU & GPU for Improved Performance in Games.

schmidtbag

2023-04-03 15:42

I'm not understanding how DirectX benefits from this. For one thing, isn't rBAR done at the BIOS and driver level? I get the idea of being able to more transparently read from VRAM, but isn't that done at the OS level? I just don't get what exactly DX needed to achieve this. But also, to my understanding, the only reason the CPU would want to collect large amounts of GPU data is for compute purposes, which DX isn't really used for. To clarify, I definitely see the benefit in the CPU having a shared pool with VRAM, I just don't get what DX has to do with it.

#6118362

demented brave

2023-04-03 15:45

Would this have any benefit for existing games or just ones developed around it?

#6118369

asturur

2023-04-03 15:57

schmidtbag:

I'm not understanding how DirectX benefits from this. For one thing, isn't rBAR done at the BIOS and driver level? I get the idea of being able to more transparently read from VRAM, but isn't that done at the OS level? I just don't get what exactly DX needed to achieve this. But also, to my understanding, the only reason the CPU would want to collect large amounts of GPU data is for compute purposes, which DX isn't really used for. To clarify, I definitely see the benefit in the CPU having a shared pool with VRAM, I just don't get what DX has to do with it.

I think what rBAR does is let the CPU copy memory from RAM to GPU RAM in chunks larger than 256MB at a time. From the description, though, this seems like there is a portion of GDDR6 that the CPU can directly read from and write to. I see how this would completely skip the step of first writing into system memory and then into GPU memory, but I also don't think this will be any faster for CPU calculation. This is entirely to feed the GPU faster and faster, imho.

#6118371

schmidtbag

2023-04-03 16:02

asturur:

I think what rBAR does is let the CPU copy memory from RAM to GPU RAM in chunks larger than 256MB at a time. From the description, though, this seems like there is a portion of GDDR6 that the CPU can directly read from and write to. I see how this would completely skip the step of first writing into system memory and then into GPU memory, but I also don't think this will be any faster for CPU calculation. This is entirely to feed the GPU faster and faster, imho.

Read/write directly to, as opposed to what? Like bypass the GPU altogether? Because I guess I could see that being beneficial; I never really thought of the GPU taking in data and filling up its own VRAM as being a bottleneck [with rBAR] but perhaps it is.

#6118390

ivanosky

2023-04-03 17:05

schmidtbag:

Read/write directly to, as opposed to what? Like bypass the GPU altogether? Because I guess I could see that being beneficial; I never really thought of the GPU taking in data and filling up its own VRAM as being a bottleneck [with rBAR] but perhaps it is.

The GPU doesn't have access to System RAM; in order for the GPU to access any data, that data needs to be loaded into VRAM by the CPU. Previously, any data that needed to go to the GPU was first loaded into System RAM and then copied to VRAM in 256 MB chunks. ReBAR allowed larger amounts of data to be sent at a time, but the data still needed to be loaded into System RAM and then copied to VRAM. With this new change, the step of loading data to System RAM is no longer required: the CPU can load the data directly to VRAM, which reduces latency and RAM requirements for games. It would be similar to how consoles work, where CPU and GPU have access to the same shared memory.
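To put rough numbers on the 256 MB figure, here is a minimal Python sketch. It takes the description above at face value (transfers split into fixed 256 MB windows when ReBAR is off); the helper name and the 2 GiB asset size are illustrative assumptions, not anything taken from DirectX or a driver.

```python
import math

MIB = 1024 * 1024
WINDOW_BYTES = 256 * MIB  # window size quoted in the post


def windows_needed(asset_bytes, window_bytes=WINDOW_BYTES):
    """Hypothetical helper: how many fixed-size windows a transfer needs."""
    return math.ceil(asset_bytes / window_bytes)


# A 2 GiB asset (assumed size) split into 256 MiB windows.
print(windows_needed(2048 * MIB))  # 8
```

With a larger window the same asset would go across in fewer pieces, which is the improvement the post attributes to ReBAR.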

#6118392

illrigger

2023-04-03 17:08

schmidtbag:

I just don't get what DX has to do with it.

You're thinking too hard about it. DirectX is an API; it allows programmers to access certain functions of hardware. DirectX and Vulkan are APIs for the hardware layer related to graphics, so any function that has to do with the GPU is bundled in them.

#6118393

TLD LARS

2023-04-03 17:10

Nice new feature, but the transfer itself still takes up CPU, memory and GPU time, so this needs software to monitor and make decisions depending on how busy the components in the chain are. A fully loaded 4-6 core CPU should not try to offload a 4080 GPU, and a low-memory 8GB 3070 Ti should not offload a 13900K that is mostly sleeping in games anyway. This feels a bit like the Intel E-cores: sometimes they kick ass and sometimes it is better to let them sleep. Maybe a revival of my Vega 64 to be CPU support only, if PCIe 3.0 is even fast enough for it to make sense.

#6118396

schmidtbag

2023-04-03 17:18

ivanosky:

With this new change, the step of loading it to System RAM is no longer required: the CPU can load the data directly to VRAM, which reduces latency and RAM requirements for games. It would be similar to how consoles work, where CPU and GPU have access to the same shared memory.

If I understand you correctly, what you're saying is the CPU acquires data from a drive and sends it directly to VRAM, bypassing RAM (and the GPU) altogether. I don't understand how that's physically possible. For the CPU to do the work itself, it must read data from RAM. DirectStorage is more similar to what consoles do, where the GPU feeds data from storage directly, no CPU or DRAM required. This is not DS. What I understand this feature to do is to bypass the GPU, where VRAM is directly shared by both the CPU and GPU. If this is true, I'm just surprised the GPU would be that much of a bottleneck.

illrigger:

DirectX and Vulkan are APIs for the hardware layer related to graphics, so any function that has to do with the GPU is bundled in them.

That's precisely what confuses me though, because it seems to me this feature would be done at a lower level than the API.

#6118400

mbk1969

2023-04-03 17:29

schmidtbag:

I'm not understanding how DirectX benefits from this. For one thing, isn't rBAR done at the BIOS and driver level? I get the idea of being able to more transparently read from VRAM, but isn't that done at the OS level? I just don't get what exactly DX needed to achieve this. But also, to my understanding, the only reason the CPU would want to collect large amounts of GPU data is for compute purposes, which DX isn't really used for. To clarify, I definitely see the benefit in the CPU having a shared pool with VRAM, I just don't get what DX has to do with it.

Where previously the game engine (DX) had to allocate a chunk of memory on the heap (RAM) and then copy it to the GPU, now it can allocate the chunk right in GPU memory and does not need to copy it later. Maybe it can speed up operations like simple texture loading. Maybe it can't speed up complex multistep operations on a frame, because it is not a given that such GPU memory access is faster than usual RAM access. The benefits are not in DX itself; the benefits could be in game engines.
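The difference between the two paths can be sketched as a toy Python model. This is not DirectX code: the `Buffer` class and both upload functions are hypothetical stand-ins, and plain bytearrays play the role of system RAM and VRAM. It only counts how many bytes get written on each path.

```python
class Buffer:
    """A named block of memory that counts the bytes written into it."""

    def __init__(self, name, size):
        self.name = name
        self.data = bytearray(size)
        self.bytes_written = 0

    def write(self, payload, offset=0):
        self.data[offset:offset + len(payload)] = payload
        self.bytes_written += len(payload)


def staged_upload(payload):
    """Old path: fill a staging buffer in system RAM, then copy it to VRAM."""
    staging = Buffer("system RAM staging", len(payload))
    vram = Buffer("VRAM", len(payload))
    staging.write(payload)           # first write: into system RAM
    vram.write(bytes(staging.data))  # second write: RAM -> VRAM copy
    return vram, staging.bytes_written + vram.bytes_written


def direct_upload(payload):
    """New path: the CPU writes straight into CPU-visible VRAM."""
    vram = Buffer("VRAM", len(payload))
    vram.write(payload)
    return vram, vram.bytes_written


texture = bytes(range(256)) * 4  # 1024 bytes of stand-in texture data
vram_a, moved_a = staged_upload(texture)
vram_b, moved_b = direct_upload(texture)
print(moved_a, moved_b)  # 2048 1024
```

Both paths end with identical contents in the VRAM buffer; the direct path just writes half as many bytes and needs no staging allocation. The model says nothing about how fast each individual write is, which is exactly the open question raised above.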

#6118411

icedman

2023-04-03 17:58

I wonder if it would be more efficient to add extra GDDR6 sticks of RAM to a mobo to assist with both GPU and CPU tasks as a shared pool. Obviously there would be a latency penalty, but picture having 8-32GB of GDDR6 sitting between the CPU and GPU as a reserve, especially with Nvidia's limited amounts on the 3k and now 4k series.

#6118415

ivanosky

2023-04-03 18:01

schmidtbag:

If I understand you correctly, what you're saying is the CPU acquires data from a drive and sends it directly to VRAM, bypassing RAM (and the GPU) altogether. I don't understand how that's physically possible. For the CPU to do the work itself, it must read data from RAM. DirectStorage is more similar to what consoles do, where the GPU feeds data from storage directly, no CPU or DRAM required. This is not DS. What I understand this feature to do is to bypass the GPU, where VRAM is directly shared by both the CPU and GPU. If this is true, I'm just surprised the GPU would be that much of a bottleneck. That's precisely what confuses me though, because it seems to me this feature would be done at a lower level than the API.

It would be bypassing the System RAM, not the GPU. The GPU doesn't have access to either System RAM or Storage. Physically the GPU is connected to the CPU through the PCI-E bus, the CPU is the one that is connected to the Storage, through the PCI-E bus on another lane (if NVMe), and to the System RAM through the System BUS. The GPU and Storage are not physically connected, so any data that goes from one to the other needs to pass through the CPU. The bottleneck was System RAM.

#6118433

TimmyP

2023-04-03 18:44

The problems with most new games might be because they are ported from the PS5, which uses shared memory, and this addition would hopefully alleviate that.

#6118533

ThermaL1102

2023-04-03 22:55

Anybody have any idea when we are getting this new version? There never is a date in these posts whenever something new comes out. It's nice to know we're even getting a new version, but if there's no date, there's no real point to a post like this, is there?

#6118670

SniperX

2023-04-04 10:37

Unless.....you're running a GTX 970 😀 🙄

#6118786

mentor07825

2023-04-04 17:01

ThermaL1102:

Anybody have any idea when we are getting this new version? There never is a date in these posts whenever something new comes out. It's nice to know we're even getting a new version, but if there's no date, there's no real point to a post like this, is there?

The point is that we can see hardware and API developments in the field, and that we are knowledgeable enough to recognise that developments are taking place and which way the wind is blowing. That way, in the future, we can make informed decisions on what to buy based on this. Presume even that some of the site's visitors are developers themselves. Knowing what has been released in their branch is a boon.