Nvidia drops Samsung and uses TSMC for Pascal

https://forums.guru3d.com/data/avatars/m/124/124168.jpg
I'm glad Nvidia did not use HBM1 with Maxwell. I'd rather have 6 GB of fast GDDR5 (around 360 GB/s at 7516 MHz effective, I believe) than 4 GB of HBM1. Unless GDDR5 becomes a bottleneck, the only benefit of HBM is power savings, imo.
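For what it's worth, that 360 GB/s figure checks out if we assume a 384-bit bus like the 980 Ti's (an assumption on my part, since the card isn't named):

384 bits / 8 = 48 bytes per transfer
48 bytes x 7516 MT/s ≈ 360.8 GB/s

versus 4 GB of first-generation HBM at 512 GB/s on the Fury X.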
https://forums.guru3d.com/data/avatars/m/248/248994.jpg
HBM2 is a JEDEC standard now, so any memory company could develop HBM2 modules. AMD is paired with SK Hynix so they have an advantage there. I don't know who Nvidia is pairing with for HBM. There is no exclusivity deal despite what some people say. The advantage is in the fact that AMD/SK essentially wrote the rulebook on HBM -- every other company including Nvidia still needs to figure it out and build it.
I suppose I should spend more time googling it, but who owns the critical patents regarding HBM? People say AMD has been researching it for years (considering they released a product with it first, that seems true), so it would be surprising if they hadn't patented a lot along the way.
https://forums.guru3d.com/data/avatars/m/259/259240.jpg
For quite a while now, I've been biting my lip and not purchasing anything from the 'Kepler' or 'Maxwell' card series, but I'm sick and tired of everything I drink or eat having a bloody taste to it. nVidia's Pascal will be the ticket that finally lets my lip heal.
https://forums.guru3d.com/data/avatars/m/237/237771.jpg
It would be interesting to see whether they've fixed the async compute / ACE issue. I wouldn't think it's something they could incorporate into the design this late, seeing as it has only recently been highlighted as an issue. That said, maybe they'll get incredibly lucky!
The thing is, async compute isn't part of the DX12 standard. It's something that AMD and its partners are pushing.
https://forums.guru3d.com/data/avatars/m/124/124168.jpg
Maybe they wanted to deflect criticism of the Nano launch; it worked for a while.
data/avatar/default/avatar04.webp
Good news! Finally I'm looking at something to replace my entire system with. Hope we get PCIe 4.0 to cope with that bandwidth.
https://forums.guru3d.com/data/avatars/m/80/80129.jpg
The thing is, async compute isn't part of the DX12 standard. It's something that AMD and its partners are pushing.
It's not part of the DX12 standard, but it's definitely a good feature. Nvidia did once support this kind of reordering in hardware but dropped it for power reasons. I personally think we need to see more data before we know whether Nvidia actually needs it for performance. Maxwell's pipeline is much shorter than GCN's and it's also much faster: commands spend less time in the pipeline, so serializing compute and graphics tasks isn't as costly as it would be on GCN without async. That's not to say GCN's architecture is bad -- it's just different, and async commands will help it more than they will help Nvidia.

The only current benchmark we have is AoS, where the Fury X and the 980 Ti tie in performance. That's great news for AMD owners, since it's pretty clear they get a speed-up. But it's not bad news for Nvidia owners -- their cards still perform on par with the competition; the gap just closed.

That being said, async is apparently extremely useful in VR for reducing latency. Nvidia kind of already does this with its own implementation of asynchronous warp, but it's a software-based warp and it requires game developers to implement Nvidia's code. AMD can do it at the hardware level, which means it's probably faster, and any developer can use DX12 to essentially take advantage of the hardware scheduler to perform the action.
data/avatar/default/avatar24.webp
In addition, it's worth understanding that async compute is already used in some console games and is being widely adopted by console developers right now... nearly every console game will use it. You can't imagine the number of presentations at SIGGRAPH and other developer conferences from console developers about async compute. Even UE4 has already implemented it, and Frostbite, Crytek if I'm not wrong, and even Ubisoft are already experimenting with it. Compute is used far more in current games than you might think, so you can imagine how much of a benefit async compute can be. The list of compute workloads in current games is really long, and all of them are excellent cases for async compute. I'll quote Sebbi from Beyond3D:
Media Molecule's new game "Dreams" has a GPU pipeline that is fully compute shader based (no rasterization at all). Q-Games (Tomorrow's Children) had quite nice performance improvements from asynchronous compute in their global illumination implementation. DICE (Frostbite) is using asynchronous compute for character skinning. Their presentation described skinning as being almost free this way, as the async skinning fills holes in GPU execution. Recent presentations from game/technology developers have shown lots of use cases for asynchronous compute (on consoles). As DX12 supports multiple compute queues, we will certainly see similar optimizations on PC.
Async compute is not new to developers; it has just only recently started being shown to the public. Most developers are already working on implementing it.
https://forums.guru3d.com/data/avatars/m/206/206288.jpg
People keep saying that nearly every console game will use it, but I can't find any developers saying that every console game will use it. With the low CPU speeds of both consoles it looks like a very important feature, yet uptake has not been great, and with PC games and the far better hardware I'm not convinced it's necessary -- and I won't be until I see a benchmark that shows the benefits.
data/avatar/default/avatar26.webp
Originally only the PS4 (and Mantle) supported it; DirectX 12 will allow it to be used on more platforms. Async compute isn't so much about the CPU as the GPU: it lets graphics and compute queues execute concurrently, so the GPU doesn't have to wait for a graphics task to finish before it can submit compute work. Without async, the tasks are serialized into one pipeline.
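To make that concrete, here's a minimal sketch of what "multiple queues" means in D3D12 terms: a direct (graphics) queue plus a separate compute queue on the same device, with a fence so the graphics side only waits when it actually needs the compute result. This is an illustrative skeleton, not anyone's actual engine code; whether the two queues really overlap on the GPU depends on the hardware scheduler underneath.

```cpp
// Minimal D3D12 multi-queue sketch: one graphics queue, one compute queue.
// Illustrative only -- device/command-list creation and error handling omitted.
#include <windows.h>
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

void SubmitAsync(ID3D12Device* device,
                 ID3D12GraphicsCommandList* gfxList,      // pre-recorded graphics work
                 ID3D12GraphicsCommandList* computeList)  // pre-recorded compute work
{
    // Two queues on the same device (normally created once at startup).
    D3D12_COMMAND_QUEUE_DESC gfxDesc = {};
    gfxDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;    // graphics + compute + copy
    D3D12_COMMAND_QUEUE_DESC compDesc = {};
    compDesc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;  // compute + copy only

    ComPtr<ID3D12CommandQueue> gfxQueue, computeQueue;
    device->CreateCommandQueue(&gfxDesc, IID_PPV_ARGS(&gfxQueue));
    device->CreateCommandQueue(&compDesc, IID_PPV_ARGS(&computeQueue));

    ComPtr<ID3D12Fence> fence;
    device->CreateFence(0, D3D12_FENCE_FLAG_NONE, IID_PPV_ARGS(&fence));

    // Kick off the compute work; on hardware that schedules both queues
    // concurrently it can overlap with the graphics work below.
    ID3D12CommandList* c[] = { computeList };
    computeQueue->ExecuteCommandLists(1, c);
    computeQueue->Signal(fence.Get(), 1);

    // Graphics work that does not depend on the compute result runs in parallel.
    ID3D12CommandList* g[] = { gfxList };
    gfxQueue->ExecuteCommandLists(1, g);

    // Only when the graphics queue needs the compute output does it wait.
    gfxQueue->Wait(fence.Get(), 1);
}
```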
https://forums.guru3d.com/data/avatars/m/56/56686.jpg
1 TB/s bandwidth... can't wait to see that
data/avatar/default/avatar35.webp
HBM1 is clocked at 500 MHz, but it overclocks to 1000 MHz on the Fury, so that's already 1 TB/s... HBM2 allows more capacity, which will rival current GDDR5 capacities (32 GB on the AMD FirePro).
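For reference, the arithmetic behind those numbers (assuming Fiji's four 1024-bit stacks; the 1000 MHz overclock figure is the poster's claim, not a spec):

4 stacks x 1024 bits = 4096 bits = 512 bytes per transfer
512 bytes x 1 GT/s (500 MHz DDR, stock) = 512 GB/s
512 bytes x 2 GT/s (1000 MHz DDR) = 1 TB/s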
https://forums.guru3d.com/data/avatars/m/248/248994.jpg
People keep saying that nearly every console game will use it, but I can't find any developers saying that every console game will use it. With the low CPU speeds of both consoles it looks like a very important feature, yet uptake has not been great, and with PC games and the far better hardware I'm not convinced it's necessary -- and I won't be until I see a benchmark that shows the benefits.
Pardon my suspicion, but how do you know that? In the past, during my game modding days, I conversed a bunch of times with game developers from a moderately well-known game studio, but they certainly weren't overly forthcoming about any unrelated "company secrets" even though they did their best to help me and lots of other modders trying to push the mod friendly engine to its limits. There are hundreds of games being developed all the time all over the world, and I'm quite sure the myriad studios or indie developers won't advertise all of their performance tricks.
https://forums.guru3d.com/data/avatars/m/232/232504.jpg
I don't know why, but for some time now Samsung has been just about my most trusted company. Even the extremely reliable high-frequency memory chips Maxwell cards use are Samsung's. Let alone that my only option for an SSD is the 850. I don't know how I feel about this -- I'm certainly out of my depth here to have any objective opinion. What about a little more patience, so that instead of 16nm, Samsung offers 14nm.... http://i2.minus.com/iVRre4lxs1kLn.PNG
https://forums.guru3d.com/data/avatars/m/232/232504.jpg
Well, if you always wait for the next GPU you will wait forever. But I am planning on getting either AMD or Nvidia on a smaller node next year, and then going for AMD Zen or Kaby Lake for my next build. It will be interesting to see whether AMD goes with GloFo/Samsung or TSMC.
The wait-forever game is a myth for logical-thinking people. Right now no one should be buying new GPUs, but given the rumors that Pascal will come much later, I don't know. An Nvidia GPU with HBM2, Samsung memory, a 14nm process and full DX12 support -- I'd give my kidney for an equivalent of the 970. Unless you're one of those boys with unlimited resources, for whom throwing a couple of dollars around every now and then is nothing. I surely plan to get one, or even better, and if I could, to hold out for at least the next generation (like the 700 series was for Kepler). On the other hand you need a lot of patience, but I'm not at all talking about the wait-forever game. When I get my Pascal it's going to stay with me at least 4 years or more.
https://forums.guru3d.com/data/avatars/m/243/243702.jpg
I wonder if you intend to get the top-end Pascal GPU, which will cost something like $1500, because it will double the transistor count of the 980 Ti while the price per transistor on 16nm will not be considerably lower. It will likely stay in the proximity of the price per transistor we are used to on 28nm, +-20%. The same goes for AMD. Do not expect twice the performance from new/updated 16/14nm architectures for the same price as you paid for your 28nm card. If the architecture brings 20% higher efficiency, then that's what one should expect to get from waiting for 16/14nm in a given spending range. Anyone who needs a new GPU and is willing to spend $200 now will, at best, get 20% higher performance for $200 in a year. In other words, no different from getting a 1st-generation Kepler and then getting a 20% stronger Kepler a year later for the same price.

And waiting for full DX12 tier 3 GPUs... the features which nV/AMD are missing are not ones I really need to use. Something that affects 1% of the pixels on screen and costs 20% of the performance? I'd rather burn that on downsampling or enjoy higher fps.
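Rough numbers to illustrate that argument (my own back-of-the-envelope, assuming the 980 Ti's ~8 billion transistors and $649 launch price, and roughly flat cost per transistor from 28nm to 16nm):

2 x 8B ≈ 16B transistors for a doubled Pascal die
16B transistors at ~28nm cost per transistor ≈ 2 x $649 ≈ $1300

which is the right ballpark for the $1500 figure above.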
data/avatar/default/avatar31.webp
If the 20 percent performance boost is strictly a result of the smaller process, you also have to consider that they are rumoured to contain many more transistors. Additionally, there are completely new architectures involved from both Nvidia and AMD.
TSMC's 16FF+ (FinFET Plus) technology can provide above 65 percent higher speed, around 2 times the density, or 70 percent less power than its 28HPM technology. http://www.tsmc.com/english/dedicatedFoundry/technology/16nm.htm

Nvidia will add mixed precision, NVLink and HBM2, but no radically new architecture, that's for sure. Pascal is probably just a Maxwell v2 shrink with added tidbits. Why? Because Nvidia has done their homework with Maxwell too well and has too much performance on hand from the shrink alone. How much of that performance gets passed to us depends on GTX 900 sales in the upcoming months. I'd like to say it depends on AMD too, but I no longer have illusions that they'll magically catch up in a meaningful way. The good news is that in order to prosper, Nvidia has to keep pushing a huge number of cards, and that includes self-price-checking and attractive pricing. I can see a killer card like the GTX 970 somewhere in Q2/Q3.
https://forums.guru3d.com/data/avatars/m/243/243702.jpg
Those 65%/70% figures apparently sit at opposite ends of the frequency <-> power-consumption curve: it's one or the other. And those values are likely for small SoCs, not monsters like GPUs. But even if we could get a 50% higher clock for, let's say, a shrunk GTX 980 Ti while keeping decent temperatures (despite the increased transistor and heat density), what would Nvidia/AMD ask for a new, expensive-to-make 16/14nm GPU that is 50% faster than our current top card?
data/avatar/default/avatar15.webp
I would expect a $1000 Titan at first. And we possibly won't get all they can do in the first release, because of yields and because they need to leave something for a refresh. The 2x density is a given. For a GM200 shrink it's either 165% of the 980 Ti's clock at the same 250 W, OR a TDP of 0.3 x 250 W at the same clock as the 980 Ti -- and that is now a mere ~300 mm2 chip :banana: OTOH they could slap two GM200s together at 150 W (0.3 x 2 x 250 W) for roughly twice the performance of a 980 Ti. And that is without the HBM2 benefits. Subtract some percentage for the real world. So you see why they have "too much" performance; the issue is not whether they can, but how much they are willing to pass on to us. The real issue will be the 16nm FF process itself: yields, supply, etc.
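Spelling out where those numbers come from (my own arithmetic from TSMC's 16FF+ claims and GM200's published specs; treat it as a napkin sketch, since the foundry figures were measured on small test logic, not 600 mm2 GPUs):

GM200 die: ~601 mm2 -> at ~2x density: 601 / 2 ≈ 300 mm2
Power option: 70% less power -> 0.30 x 250 W ≈ 75 W per chip, so two chips ≈ 150 W
Speed option: +65% clock -> ~1000 MHz base x 1.65 ≈ 1650 MHz at the same 250 W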
https://forums.guru3d.com/data/avatars/m/56/56686.jpg
HBM1 is clocked at 500 MHz, but it overclocks to 1000 MHz on the Fury, so that's already 1 TB/s... HBM2 allows more capacity, which will rival current GDDR5 capacities (32 GB on the AMD FirePro).
I couldn't care less about AMD products; I'm talking Nvidia here.