NVIDIA Working on Tile-based Multi-GPU Rendering Technique Called CFR - Checkered Frame Rendering
Spets
Sounds like it can help run SLI on engines that use information from previous frames, like temporal effects.
Dragam1337
With GPU improvements becoming smaller and smaller, we need SLI more than ever... so this makes me very happy!
cryohellinc
I think this is all heading towards chiplet-design GPUs. In several years, most likely all of us will be using SLI, or a similar technology, in one way or another.
asturur
I read that picture as rendering half the pixels per frame... unsure if that's an improvement.
Otherwise, why label the frames N and N+1?
I disagree that we need SLI more than ever. We need a way to go back to playing games for a decent amount of money... and if SLI is only for top cards, that's not going to happen.
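If the diagram shows a checkerboard split, then both GPUs work on every frame, and "N / N+1" just labels successive frames; each GPU renders alternating tiles rather than half the pixels being dropped. Here is a minimal sketch of such a tile assignment; the layout and tile ownership rule are assumptions, since NVIDIA hasn't documented CFR's actual scheme:

```python
# Hypothetical checkerboard tile assignment between two GPUs.
# The (x + y) % n rule is an assumption for illustration only.

def tile_owner(tile_x, tile_y, num_gpus=2):
    """Checkerboard assignment: adjacent tiles go to different GPUs."""
    return (tile_x + tile_y) % num_gpus

# Every frame, each GPU renders its own half of the tiles, so both
# GPUs contribute to frame N (unlike AFR, where GPU0 renders frame N
# and GPU1 renders frame N+1).
row0 = [tile_owner(x, 0) for x in range(8)]
row1 = [tile_owner(x, 1) for x in range(8)]
print(row0)  # [0, 1, 0, 1, 0, 1, 0, 1]
print(row1)  # [1, 0, 1, 0, 1, 0, 1, 0]
```

Under this reading, every pixel of every frame still gets rendered; the checkerboard only decides which GPU renders it.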
geogan
I wonder how this gets around the problem of most modern game engines not being compatible with multi-GPU rendering (because of the way the engines work). That was the reason SLI support died off in the last few years: it was a nightmare for developers and NVIDIA to try to shoehorn in support in a hacky way, which ended up being no better and causing more trouble than it was worth.
As far as I can tell, the only way this will work in the future with real multi-GPU "chiplet" designs is a game engine built from the beginning to work with multiple GPUs, each with its own RAM, or more likely some form of multi-GPU with *shared* RAM (which would make the engine-side problems easier; I think the main problem is that generating the current frame needs access to previous frames, but that data sits in a different GPU's VRAM, so it requires totally inefficient continuous copying across).
So yes, I think a multi-GPU chiplet design would have to share VRAM and cache amongst all the GPUs, which is exactly what SLI does not have now with separate GPUs.
Astyanax
Interesting that people are only noticing this now... since it's been in the NVAPI reference since August.
The reports are wrong as usual, because it has a full OpenGL implementation, including NVAPI reference values, and the setting is exposed to Inspector (it's not grouped with the rest of the SLI settings):
enum EValues_OGL_SLI_CFR_MODE {
    OGL_SLI_CFR_MODE_DISABLE,
    OGL_SLI_CFR_MODE_ENABLE,
    OGL_SLI_CFR_MODE_CLASSIC_SFR,
    OGL_SLI_CFR_MODE_NUM_VALUES,
    OGL_SLI_CFR_MODE_DEFAULT
};
https://cdn.discordapp.com/attachments/395665775077359626/647026061053263882/unknown.png
H83
Maybe I'm writing something really stupid because I have no expertise in this area, but wouldn't it be better to divide the workload differently between GPUs? In Crysis, for example, the GPU had to render everything, including large portions of the island in the distance. Would it be possible to have one GPU render just the background and the physics, and the other one render the rest of the scene? That way the workload would be divided and performance would increase. Of course this is a very simplistic approach, and there are problems to solve, like keeping everything rendering in sync, but I still wonder if it would work in real life.
If my suggestion is really stupid, don't be afraid to say so, guys!
DeskStar
Netherwind
I welcome this with open arms. Back in the day, you bought one card and then another when the next generation hit the shelves. The two cards combined were more powerful than the new generation's flagship, and cheaper too.
Denial
This is basically what the Hydra Engine was by Lucid (https://en.wikipedia.org/wiki/Hydra_Engine).
There are a bunch of issues with it. For starters, a ton of modern shaders in games use interframe data to improve performance; if that data is sitting on another graphics card, then either the optimization can't be used, or there is a massive performance penalty in getting it off that GPU and onto the other one. Similarly, managing all these different elements as the scene shifts, and recombining them in a single framebuffer for output, takes time and thus affects performance. Managing the CPU threads that drive both GPUs is a nightmare too, because you're essentially spending time before the scene even starts rendering figuring out how to divide it to avoid stalls across both GPUs. Then, if the GPUs have different feature sets, it becomes even more complicated... and it's all for what? So that the 25 people with SLI/Crossfire can benefit slightly, at the expense of everyone else, because all the interframe optimization is now gone?
I think most devs, the ones who are even capable of doing this kind of low-level hardware development, look at it and go "it's not even close to being worth it", and that's it.
schmidtbag
As far as I'm aware (and maybe I'm wrong), one of the main problems is that GPUs can only render 1 frame at a time. Each "stage" throughout the rendering process doesn't take up an equal amount of resources. So, although I think Nvidia's CFR idea is a good one, what if things were taken a step further, where any idle cores were used to calculate the next frame in parallel?
Although I don't know how GPUs work at the driver level, here's my very crude estimate of how each frame is rendered. Each "stage" may take up dozens of clock cycles:
1. The GPU receives, compiles, and parses new frame data to calculate
2. Calculate physics (if necessary)
3. Set up the mesh geometry to fit the viewport
4. Apply textures
5. Apply lighting effects
6. Perform ray tracing or calculate reflections
7. Run post-processing effects
8. Return any data back to the program that it may be expecting
Obviously, not all of these stages need the same amount of compute power. Some need fewer GPU cores than others. Some could get by with half-precision floats.
So, what if idle cores were always used to render the next frame? It's a similar idea to AFR, except instead of splitting up the entire frame rendering process per die, you split up the individual stages of frame rendering between individual cores. This should help reduce latency and maximize untapped resources.
EDIT: And this is different from H83's idea, which to my understanding, designates certain regions of the frame to be rendered on separate cores.
Of course, there must be some fundamental flaw in this idea (or, my assumption that only 1 frame is rendered at a time is wrong), or else it'd have been done already.
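The stage-pipelining idea above can be sketched as a toy throughput model; the stage names and cycle costs below are illustrative assumptions, not real driver stages:

```python
# Toy model of software-pipelining frame stages: while one frame is in
# a later stage, idle units start the next frame's earlier stages.
# Stage costs in "cycles" are made up for illustration.

STAGES = [
    ("parse_frame_data", 2),
    ("physics",          3),
    ("geometry",         4),
    ("texturing",        3),
    ("lighting",         4),
    ("ray_tracing",      6),
    ("post_processing",  2),
    ("readback",         1),
]

def serial_time(num_frames):
    """One frame in flight: all stages run back to back per frame."""
    return num_frames * sum(cost for _, cost in STAGES)

def pipelined_time(num_frames):
    """Ideal pipelining: a new frame enters as soon as the first stage
    frees up, so steady-state throughput is set by the slowest stage."""
    total = sum(cost for _, cost in STAGES)
    bottleneck = max(cost for _, cost in STAGES)
    return total + (num_frames - 1) * bottleneck

print(serial_time(10))     # 250 cycles
print(pipelined_time(10))  # 25 + 9*6 = 79 cycles
```

In this idealized model, single-frame latency is unchanged (the sum of all stage costs) while throughput improves to the cost of the slowest stage. As a caveat: drivers and GPUs already keep multiple frames in flight and overlap work across a frame to some degree, which may be part of why a separate scheme like this hasn't appeared.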
H83
fry178
@Netherwind
I don't remember which market you were in or which cards you're referring to, but during the time I worked in shops in Germany and the US, I never saw two cards (e.g. two of the second-biggest chip) that were both faster than the top one and cheaper. Most games still wouldn't run as smoothly as on a single card (micro-stutter), and you needed a bigger CPU and, most of the time, a bigger PSU (versus the biggest single chip).
And when a new generation dropped, virtually all chips of the previous gen dropped in price, not just the smaller ones, so that's not an argument...
wavetrex
I wonder which software does tile-based rendering the best?...
Something like Cinema 4D, or Blender...
Apply the same concept to multi-GPU and RTX will get a LOT faster!
One GPU is pretty damn fast these days for classic raster, but it's down on its knees with RTX ON...
Answer: RTX ON x2 (and of course $$$ x2, because why not).
Netherwind
Aekold
Dragam1337
A M D BugBear
Correct me if I'm wrong here, but didn't ATI use a similar method in Crossfire mode in the past?
I thought they used this once before.
Dribble