Rumor: Nvidia Mobile Pascal GPUs during Computex 2016 - not Desktop

https://forums.guru3d.com/data/avatars/m/260/260317.jpg
When I look at Windows 10 with the Store and the Xbox app, it makes me wonder if Microsoft will come out with some kind of Windows 10 Xbox-PC hybrid gaming device, something portable with Nvidia Pascal GPUs in it.
data/avatar/default/avatar11.webp
Hmm, interesting. Well, considering that this is a rumor, we can wait and hope for more information and clarification from Nvidia. I hope we get more information about the desktop cards and whether or not the 1070 and 1080 are going to have more than 4 GB of VRAM.
https://forums.guru3d.com/data/avatars/m/198/198862.jpg
Not long ago Nvidia said that new mobile Maxwell-based GPUs are coming, the 970MX and 980MX. This seems to be just what the title says: a rumor, and probably fake.
https://forums.guru3d.com/data/avatars/m/80/80129.jpg
Not long ago Nvidia said that new mobile Maxwell-based GPUs are coming, the 970MX and 980MX. This seems to be just what the title says: a rumor, and probably fake.
Yeah, I don't know, seems weird; I think mobile will come later. At CES Nvidia said that the PX2 would start shipping early samples to select partners in Q2, with full availability in the second half of the year. The PX2 has a total TDP of 300 W with two Pascal GPUs on it, so those two Pascal units are probably 120 W each or something. I would imagine those exact units would show up in something, either desktop and/or mobile, before the PX2 ships. I personally think desktop. I think Hilbert's right about the timeframe: they'll probably paper launch in April and the cards won't show up till June or July. I imagine Polaris will already be shipping by then.
https://forums.guru3d.com/data/avatars/m/54/54823.jpg
Not long ago Nvidia said that new mobile Maxwell-based GPUs are coming, the 970MX and 980MX. This seems to be just what the title says: a rumor, and probably fake.
That rumor was debunked the next day. This rumor is likely true.
https://forums.guru3d.com/data/avatars/m/175/175739.jpg
So is it true what some guys have been saying on the forum of late, that these and the big-die Pascal are not going to be great at DirectX 12? That they are essentially Maxwell++? If true I will wait till Volta and more DirectX 12 games.
https://forums.guru3d.com/data/avatars/m/80/80129.jpg
So is it true what some guys have been saying on the forum of late, that these and the big-die Pascal are not going to be great at DirectX 12? That they are essentially Maxwell++? If true I will wait till Volta and more DirectX 12 games.
No one knows because no one knows anything about Pascal yet.
https://forums.guru3d.com/data/avatars/m/175/175739.jpg
Well that's good to hear. Damn people and their opinions veiled as truth...
https://forums.guru3d.com/data/avatars/m/80/80129.jpg
Well that's good to hear. Damn people and their opinions veiled as truth...
Well, people are assuming that Pascal was already in its design phase when the whole async problem hit, so based on that, chances are it might not be fixed. There is nothing concrete yet as to whether it's fixed or not, whether Nvidia even needs to make a change (Nvidia is claiming they can fix it with drivers), or whether it even matters in terms of overall performance. There are two DX12 benchmarks out, both featuring async; the Fury X wins one, the 980 Ti wins the other. People are trying to glean information from both of those benchmarks, but it's essentially pointless until there are more of them, including actual games.
https://forums.guru3d.com/data/avatars/m/175/175739.jpg
Yeah, I've been keeping up to date. One thing I don't understand: if DX12 is meant to be less reliant on drivers, how come we need a driver fix for async?
https://forums.guru3d.com/data/avatars/m/169/169957.jpg
So is it true what some guys have been saying on the forum of late, that these and the big-die Pascal are not going to be great at DirectX 12? That they are essentially Maxwell++? If true I will wait till Volta and more DirectX 12 games.
On top of nobody knowing much about Pascal yet, except that it's much faster for neural-network applications, we don't know much about DX12 performance yet either. There have been two benchmarks. Two.

This is one: http://www.anandtech.com/show/9659/fable-legends-directx-12-benchmark-analysis/2
This is the other: http://anandtech.com/show/10067/ashes-of-the-singularity-revisited-beta/6 (the recent beta, not the old one)

Asynchronous compute exists to take advantage of idle hardware; in simple terms, if you turn it off and the code performs better, you should probably keep it off. Contrary to popular belief, 'async compute effects' don't exist - you're still running those operations on the same hardware. As somebody said, "GPUs are already embarrassingly parallel." The main advantage of 'async', from what I've been able to glean, is concurrency, i.e. executing an operation (e.g. timewarp) without it having to wait in the queue behind whatever shader was running previously. This is useful when you're optimizing for latency, such as in VR - a benchmark for which Nvidia was lambasted as being hugely inferior to AMD, and the issue AFAIK was then fixed in a driver update - cue people claiming these things can't be fixed in a driver because it's a hardware problem.

I mean, take a look at these Fable Legends results:
http://images.anandtech.com/graphs/graph9659/1080pi7.png
http://www.extremetech.com/wp-content/uploads/2015/09/FableLegends.png
Noticing any strange inconsistencies? It's odd, to say the least.

Also, from the same ExtremeTech feature on Fable, they have this graph: http://www.extremetech.com/wp-content/uploads/2015/09/AMD-Perf1.png
Leaving aside the fact that I don't know what these numbers mean, let's take this to be a perfect representation of performance in this game: Fury X outperforms a Ti, and more interestingly a 390X is almost on par with an R9 Fury.

(I could have sworn there was an article about the fix for VR latency, but I can't find it. I found this: https://www.reddit.com/r/oculus/comments/3rsezo/new_nvidia_driver_finally_allowed_me_to_try_vr/)

If you're really interested, read through this thread; somebody wrote a simple program to test pure compute and mixed loads on both GCN and Maxwell:
https://forum.beyond3d.com/threads/dx12-performance-discussion-and-analysis-thread.57188/page-9#post-1869030
https://www.reddit.com/r/oculus/comments/3gwnsm/nvidia_gameworks_vr_sli_test/
In computer science, "asynchronous" means decoupled in time. Asynchronous APIs (like OpenCL) return from function calls before the work has been completed. This is opposed to synchronous APIs, where the results of function calls are complete before they return. In computer science, "concurrent" means executing at the same time.

Asynchronous and concurrent are completely different concepts. In fact, in some ways they are opposite. Asynchronous explicitly means that you don't know when something executes - you can't say whether two asynchronous processes are concurrent; all you know is that they are decoupled. In contrast, two concurrent processes must happen simultaneously, or else they are not concurrent. These are fundamentally different concepts, and in a forum like Beyond3D, where many people actually know things, we should use the established terminology.

All GPU compute APIs are asynchronous and have been at least since the invention of CUDA. Most asynchronous interfaces are sequential. For example, if you've ever coded in JavaScript or Python, there are many asynchronous APIs that are implemented sequentially (like callbacks in event-driven systems, or green threads). You write code asynchronously, but the system decides how to execute it, and mostly does so sequentially. In just the same way, DX12 asynchronous compute is an asynchronous interface. This says nothing about execution; it is perfectly legal for the system to execute asynchronous compute shaders sequentially.

The thing that everyone is excited about is *CONCURRENT* graphics and compute shaders. With DX12 and at least AMD hardware, it is possible to execute graphics calls and compute shaders *CONCURRENTLY*. This CONCURRENCY is what can potentially provide a performance benefit. The asynchrony has always been there - DX12 does not introduce asynchronous compute shaders. It instead opens the possibility for concurrent graphics and compute. The words asynchronous and concurrent have meaning for those of us in computer science.
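To make that distinction concrete, here is a minimal C++ sketch of my own (an illustration, not anything from a GPU API): both `std::async` calls below are asynchronous interfaces, but only one launch policy can actually produce concurrency.

```cpp
#include <future>
#include <iostream>

int expensive_work(int x) {
    return x * x;  // stand-in for a compute job
}

int main() {
    // Asynchronous: both calls return a future immediately, before
    // the work has completed. That is all "asynchronous" promises.
    auto a = std::async(std::launch::async,    expensive_work, 6);
    auto b = std::async(std::launch::deferred, expensive_work, 7);

    // The first task may run CONCURRENTLY on another thread.
    // The second is still asynchronous (decoupled in time) but runs
    // sequentially, on this thread, only when get() is called.
    std::cout << a.get() << " " << b.get() << "\n";  // prints "36 49"
    return 0;
}
```

Same interface in both cases; whether anything runs concurrently is the scheduler's decision, which is exactly the situation with DX12 compute queues.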
TLDR: We need more information about everything
And how do we know what hardware can benefit? There's not a whole lot of actual information on the topic available. There is, however, an insane amount of wild speculation and quasi-technical explanation, with people basically buying into whatever fits their worldview. It's not the point of these new APIs to eliminate the influence of the driver; it's about eliminating guesswork from the driver and making the whole thing more predictable. As Andrew said above: "it's complicated". And I know for a fact that current NV drivers behave differently with regard to mixing Draw/Dispatch than older ones did.
https://forums.guru3d.com/data/avatars/m/80/80129.jpg
Yeah, I've been keeping up to date. One thing I don't understand: if DX12 is meant to be less reliant on drivers, how come we need a driver fix for async?
Nvidia currently does not have the hardware to do simultaneous compute/graphics queuing, at least as far as we know -- they essentially process everything serially at the moment. On Nvidia GPUs the majority of the scheduling of those tasks takes place in software, on the CPU. Some people are claiming Nvidia can work around the lack of hardware by using some of the parallelism they built into the architecture for CUDA purposes, but those changes would take place at the driver level, because the scheduling is at the driver level. Whether or not Nvidia can fix the problem with a driver remains to be seen. The head of their software development team recently made a tweet that may have hinted at this, and they have said in the past that they can do it with drivers. But there hasn't yet been a driver that officially declares this solved or anything.
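For context, here is roughly what that dual-queue setup looks like from the application side in D3D12 -- a minimal sketch, with `device` assumed to be an already created ID3D12Device and all error handling omitted; it isn't taken from any particular engine:

```cpp
// Assumes an existing ID3D12Device* device (error handling omitted).
D3D12_COMMAND_QUEUE_DESC graphicsDesc = {};
graphicsDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;   // graphics + compute + copy

D3D12_COMMAND_QUEUE_DESC computeDesc = {};
computeDesc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;   // compute + copy only

ID3D12CommandQueue* graphicsQueue = nullptr;
ID3D12CommandQueue* computeQueue  = nullptr;
device->CreateCommandQueue(&graphicsDesc, IID_PPV_ARGS(&graphicsQueue));
device->CreateCommandQueue(&computeDesc,  IID_PPV_ARGS(&computeQueue));

// Submitting command lists to both queues is legal on any DX12 GPU.
// Whether the hardware executes them concurrently, or the driver
// and hardware scheduler serialize them internally, is invisible
// from this level -- which is exactly the point of contention.
```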
https://forums.guru3d.com/data/avatars/m/175/175739.jpg
That's interesting and answers a few questions. Thanks guys. Will be watching these cards closely.
https://forums.guru3d.com/data/avatars/m/54/54823.jpg
Not only does Fable not emphasize async compute, but Nvidia has its own path and AMD theirs, AFAIK. In the context of DX12 and what can and should be done with async compute, AMD has it right. Arguing about the semantics is useless; one is on one's own to determine what should or shouldn't be explained.
https://forums.guru3d.com/data/avatars/m/169/169957.jpg
Asynchrony, in computer programming, refers to the occurrence of events independently of the main program flow and ways to deal with such events.
I'm going to argue asynchronous compute is a misnomer and just obfuscates the issue
Nvidia currently does not have the hardware to do simultaneous compute/graphics queuing, at least as far as we know -- they essentially process everything serially at the moment. On Nvidia GPUs the majority of the scheduling of those tasks takes place in software, on the CPU. Some people are claiming Nvidia can work around the lack of hardware by using some of the parallelism they built into the architecture for CUDA purposes, but those changes would take place at the driver level, because the scheduling is at the driver level. Whether or not Nvidia can fix the problem with a driver remains to be seen. The head of their software development team recently made a tweet that may have hinted at this, and they have said in the past that they can do it with drivers. But there hasn't yet been a driver that officially declares this solved or anything.
They did solve the VR latency issue in drivers amidst a frenzy of criticism and accusations of the problem being in hardware
https://forums.guru3d.com/data/avatars/m/54/54823.jpg
It is a misnomer.
https://forums.guru3d.com/data/avatars/m/80/80129.jpg
Not only does Fable not emphasize async compute, but Nvidia has its own path and AMD theirs, AFAIK. In the context of DX12 and what can and should be done with async compute, AMD has it right. Arguing about the semantics is useless; one is on one's own to determine what should or shouldn't be explained.
Of course AMD has it right; they wrote the specification for it. Fable uses async compute fine, it's just that it's not computing hundreds of different light sources like AoS is. The type of game definitely matters in how much it gets used, and obviously an RTS with massive battles and hundreds of light sources is going to emphasize the performance more than a third-person MOBA. Which was my point in the other thread: AoS is literally the worst case for Nvidia's performance in regards to async. Games should have two different paths; eventually every architecture is going to have its own path. That's part of DX12's strength: the developer is capable of optimizing specifically for the architecture and vendor.
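As a rough illustration of what a per-vendor path can look like in practice, here is a hypothetical C++ sketch: the PCI vendor IDs are the standard ones, but the path selection itself and the assumed existing `IDXGIAdapter* adapter` are made up for illustration.

```cpp
#include <dxgi.h>

// Hypothetical: pick a render path at startup based on the GPU vendor.
// Assumes an existing IDXGIAdapter* adapter (error handling omitted).
void SelectRenderPath(IDXGIAdapter* adapter) {
    DXGI_ADAPTER_DESC desc = {};
    adapter->GetDesc(&desc);
    switch (desc.VendorId) {
        case 0x10DE: /* Nvidia-tuned path */   break;
        case 0x1002: /* GCN-tuned path */      break;
        default:     /* generic fallback */    break;
    }
}
```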
They did solve the VR latency issue in drivers amidst a frenzy of criticism and accusations of the problem being in hardware
I don't know enough about the VR latency stuff to really comment on it. I know that I have an Oculus DK2 and I used it on my 980, and it seemed to work fine in every game I played, including the Crytek dino demo thing. But aside from that, no idea.
https://forums.guru3d.com/data/avatars/m/169/169957.jpg
I don't know enough about the VR latency stuff to really comment on it. I know that I have an Oculus DK2 and I used it on my 980, and it seemed to work fine in every game I played, including the Crytek dino demo thing. But aside from that, no idea.
Check out that Beyond3D thread; you'll find it interesting.
https://forums.guru3d.com/data/avatars/m/243/243702.jpg
Fury X outperforms a Ti, and more interestingly a 390X is almost on par with an R9 Fury.
The explanation is simple. Both Hawaii and Fiji have the same number of ROPs (64) and ACEs (8), and probably many other things remained at the same count or unimproved at the uArch level. There can be a scenario where the R9 290X/390X performs better than Fury X, and that is when the application relies only on pixel fillrate, since the core of Hawaii can be clocked higher and can therefore achieve a higher fillrate. But in that case the 980 Ti would beat both chips, as it has 96 ROPs and something like 50% higher fillrate given the boost it has out of the box. If the game were all about texture fillrate, then 980 Ti = 290X/390X and Fury X would take the crown by far. And those are just two aspects out of dozens that matter, and the weakest point for each requirement is what counts. That's how a GTX 960 can outperform a GTX 780 Ti.
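To put rough numbers on that, here is a back-of-the-envelope C++ sketch; pixel fillrate is approximately ROPs times core clock, and the clock figures below are approximate reference/boost values I'm assuming for illustration, not measured numbers.

```cpp
#include <cstdio>

// Approximate pixel fillrate = ROPs * core clock (Gpixel/s).
// Clocks are rough reference/boost figures, assumed for illustration.
struct Gpu { const char* name; int rops; double clockGHz; };

int main() {
    const Gpu gpus[] = {
        {"R9 390X (Hawaii)", 64, 1.05},
        {"Fury X (Fiji)",    64, 1.05},
        {"GTX 980 Ti",       96, 1.075},  // reference boost; real cards go higher
    };
    for (const Gpu& g : gpus)
        std::printf("%-17s ~%5.1f Gpixel/s\n", g.name, g.rops * g.clockGHz);
    return 0;
}
```

With these assumed clocks the 980 Ti lands around 103 Gpixel/s against roughly 67 for both Hawaii and Fiji, which is where the "like 50% higher fillrate" figure comes from.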
https://forums.guru3d.com/data/avatars/m/169/169957.jpg
The explanation is simple. Both Hawaii and Fiji have the same number of ROPs (64) and ACEs (8), and probably many other things remained at the same count or unimproved at the uArch level. There can be a scenario where the R9 290X/390X performs better than Fury X, and that is when the application relies only on pixel fillrate, since the core of Hawaii can be clocked higher and can therefore achieve a higher fillrate. But in that case the 980 Ti would beat both chips, as it has 96 ROPs and something like 50% higher fillrate given the boost it has out of the box. If the game were all about texture fillrate, then 980 Ti = 290X/390X and Fury X would take the crown by far. And those are just two aspects out of dozens that matter, and the weakest point for each requirement is what counts. That's how a GTX 960 can outperform a GTX 780 Ti.
I just read through my post again; I understand what you're saying, just not where you're going with it. My point was that the scaling is behaving erratically and doesn't seem to make much sense. How is the 390X scaling so well compared to Fiji? Pixel fillrate, you say? Then the Ti should be scaling well too. Texture fillrate, then? FX takes the crown with Hawaii/Maxwell trailing? There's not enough of a delta between FX and Ti/Hawaii. I'm just saying it's complicated, and these benchmarks are being taken to mean far more than they really do.