Micron Starts Sampling GDDR5X Memory to Customers

Can't the manufacturers just use it to pad their profits by further cutting the memory bus width, since a narrower bus can give the same result if the memory is faster? They seem to like doing that of late.
Can't the manufacturers just use it to pad their profits by further cutting the memory bus width, since a narrower bus can give the same result if the memory is faster? They seem to like doing that of late.
It doesn't matter though, does it? They just need to make sure there's enough bandwidth available for the core - does it really matter whether they get there through faster VRAM or through a wider memory bus?
GDDR5 has 170 pins per package; GDDR5X has 190. Their prognosis for early chips is 25% higher throughput per pin, and the signaling pin count only goes up from 67 to 68, which does not increase PCB complexity. That means one chip should deliver 1.25 * 190/170 = 1.397, i.e. ~40% higher transfer rate than a standard GDDR5 package. Or does it mean 1.25 * 68/67 = 1.269, i.e. ~27% higher? The total pin count apparently goes up by ~12%.

The chip is accessible in x32 and x16 mode (faster and slower). Since the signaling pin count stays practically the same, PCB complexity per memory chip stays the same. x16 mode allows communication over fewer traces (at lower speed), so another set of traces can connect a second chip ("doubling" capacity per set of traces used). In practice one GDDR5X chip is connected the standard way and another is connected behind it (easily on the other side of the PCB). I can see potential for the next GTX 970 here: while the memory controller may be 384-bit, two chips can share traces for higher capacity. I'll keep an eye on the memory layout whenever I check GDDR5X graphics cards from now on.

This sharing is done in the following way:
- GDDR5X uses 4 data and 4 error-correction channels, plus a command and address bus.
- In x16 mode the first memory chip gets 2 of the 4 data channels and 2 of the 4 error-correction channels; the second chip gets the other 2 of each. The first chip gets the command and address bus and relays the required information to the second chip if appropriate (extra latency?).

Edit: Actually, a small correction to the relaying part above: it seems the command and address bus is shared directly, so there is extra logic that tells each chip whether a command is meant for it (maybe both read the requested address and check whether it is in their scope). (Still extra latency.)
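If anyone wants to play with those two interpretations, here is a minimal back-of-the-envelope sketch in Python. It uses only the pin counts and the 25% per-pin figure from the post above; which of the two readings Micron actually intends is still an open question.

```python
# Quick check of the two interpretations from the post above (illustrative only).
gddr5_package_pins  = 170
gddr5x_package_pins = 190
gddr5_signal_pins   = 67
gddr5x_signal_pins  = 68
per_pin_gain = 1.25   # "25% higher throughput per pin" for early chips

# Interpretation A: scale by the total package pin count
gain_a = per_pin_gain * gddr5x_package_pins / gddr5_package_pins
# Interpretation B: scale by the signaling pin count only
gain_b = per_pin_gain * gddr5x_signal_pins / gddr5_signal_pins

print(f"A: {gain_a:.3f}x per package (~{(gain_a - 1) * 100:.0f}% higher)")
print(f"B: {gain_b:.3f}x per package (~{(gain_b - 1) * 100:.0f}% higher)")
# -> A: 1.397x (~40%), B: 1.269x (~27%), i.e. the two figures discussed above.
```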
Well, this suddenly makes the recent GeForce X80 leaks very plausible. (Lurker for many years, first post ever.)
GDDR5 has 170 pins per package; GDDR5X has 190. Their prognosis for early chips is 25% higher throughput per pin, and the signaling pin count only goes up from 67 to 68, which does not increase PCB complexity. That means one chip should deliver 1.25 * 190/170 = 1.397, i.e. ~40% higher transfer rate than a standard GDDR5 package. Or does it mean 1.25 * 68/67 = 1.269, i.e. ~27% higher? The total pin count apparently goes up by ~12%.

The chip is accessible in x32 and x16 mode (faster and slower). Since the signaling pin count stays practically the same, PCB complexity per memory chip stays the same. x16 mode allows communication over fewer traces (at lower speed), so another set of traces can connect a second chip ("doubling" capacity per set of traces used). In practice one GDDR5X chip is connected the standard way and another is connected behind it (easily on the other side of the PCB). I can see potential for the next GTX 970 here: while the memory controller may be 384-bit, two chips can share traces for higher capacity. I'll keep an eye on the memory layout whenever I check GDDR5X graphics cards from now on.

This sharing is done in the following way:
- GDDR5X uses 4 data and 4 error-correction channels, plus a command and address bus.
- In x16 mode the first memory chip gets 2 of the 4 data channels and 2 of the 4 error-correction channels; the second chip gets the other 2 of each. The first chip gets the command and address bus and relays the required information to the second chip if appropriate (extra latency?).

Edit: Actually, a small correction to the relaying part above: it seems the command and address bus is shared directly, so there is extra logic that tells each chip whether a command is meant for it (maybe both read the requested address and check whether it is in their scope). (Still extra latency.)
Well, I don't understand all that, but I have heard that GDDR5X is supposed to offer twice the bandwidth of current GDDR5 (in addition to that same 'fact' being mentioned in this article) - mainly through greater efficiencies, but also through higher clock speeds.
Well, this suddenly makes the recent GeForce X80 leaks very plausible. (Lurker for many years, first post ever.)
And welcome! (not that I'm the best person to welcome you, I've not been posting here that long either!)
There are rumors that Nvidia Pascal is an improved Maxwell core with HPC capabilities, which still lacks the crucial DX12 async compute feature. "According to our sources, NVIDIA's next GPU microarchitecture, Pascal, will be in trouble if it has to make heavy use of asynchronous compute code in video games. Broadly speaking, Pascal will be an improved version of Maxwell, especially regarding FP64 performance, but not regarding asynchronous compute performance. NVIDIA will bet on raw power instead of asynchronous compute abilities. This means that Pascal cards will be highly dependent on driver optimizations and game developers' goodwill. So GameWorks optimizations will play a fundamental role in the company's strategy. Is this why NVIDIA has made some GameWorks code publicly available?" Source: http://www.bitsandchips.it/52-english-news/6785-rumor-pascal-in-trouble-with-asyncronous-compute-code
Well, I don't understand all that, but I have heard that GDDR5X is supposed to offer twice the bandwidth of current GDDR5 (in addition to that same 'fact' being mentioned in this article) - mainly through greater efficiencies, but also through higher clock speeds.
What you see in the image is GDDR5 at 8 Gbps per pin for quite some time, and GDDR5X arriving in mid-2016 at 10 Gbps, slowly climbing to 16 Gbps around 2020. For now it promises 25% more data per pin per second; I would not expect more from it. Yes, it delivers power efficiency, and the internal complexity looks a bit lower, so in the long run the price may come down faster than HBM1's.

GDDR5X is definitely easier to adopt than HBM, but unless we start to believe GPU manufacturers are going to bring out 512-bit GDDR5X cards, all we can expect in mid-2016 is a 384-bit card with GDDR5X matching the transfer speeds of 512-bit GDDR5 cards. Apparently that's not a problem for nVidia, as they did just fine with a 384-bit bus on the Titan X, and having roughly 30% more data available to the GPU (without beefing the memory controller up to 512-bit) may allow for something like 50% more TMUs/ROPs/shaders.

In the end there are operations which require a lot of memory bandwidth (high-res textures/AA at 4K resolution), and there are methods which require raw GPU power to calculate something (geometry, shadows). Today we are already close to the bandwidth required for 4K. Both GDDR5X and HBM can deliver sufficient bandwidth for 4K; the GPU just has to be capable of making good use of it.
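To put rough numbers on the "384-bit GDDR5X matching a 512-bit GDDR5 card" point, here is a minimal sketch. The per-pin rates follow the roadmap figures discussed above (7-8 Gbps GDDR5, 10 Gbps early GDDR5X); the card configurations are illustrative, not real products.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, per_pin_gbps: float) -> float:
    """Peak memory bandwidth: bus width in bits * per-pin data rate, divided by 8 bits/byte."""
    return bus_width_bits * per_pin_gbps / 8

configs = [
    ("384-bit GDDR5  @ 7 Gbps (Titan X-like)", 384, 7.0),
    ("512-bit GDDR5  @ 8 Gbps",                512, 8.0),
    ("384-bit GDDR5X @ 10 Gbps",               384, 10.0),
]
for name, width, rate in configs:
    print(f"{name:40s} -> {peak_bandwidth_gb_s(width, rate):5.0f} GB/s")
# ~336, 512 and 480 GB/s: the 384-bit GDDR5X card lands in the same ballpark
# as a 512-bit GDDR5 card, which is the point made above.
```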
There are rumors that Nvidia Pascal is an improved Maxwell core with HPC capabilities, which still lacks the crucial DX12 async compute feature. "According to our sources, NVIDIA's next GPU microarchitecture, Pascal, will be in trouble if it has to make heavy use of asynchronous compute code in video games. Broadly speaking, Pascal will be an improved version of Maxwell, especially regarding FP64 performance, but not regarding asynchronous compute performance. NVIDIA will bet on raw power instead of asynchronous compute abilities. This means that Pascal cards will be highly dependent on driver optimizations and game developers' goodwill. So GameWorks optimizations will play a fundamental role in the company's strategy. Is this why NVIDIA has made some GameWorks code publicly available?" Source: http://www.bitsandchips.it/52-english-news/6785-rumor-pascal-in-trouble-with-asyncronous-compute-code
Would you please stop filling the forums with that crap 🙂
Would you please stop filling the forums with that crap 🙂
poornaprakash... haha WTH
Brute force approach it is then. Logically, Nvidia only has to be 5-10% faster than AMD to negate the benefits of async compute and come out at a tie. We can also expect a nice boost in DX11.
Brute force approach it is then. Logically, Nvidia only has to be 5-10% faster than AMD to negate the benefits of async compute and come out at a tie. We can also expect a nice boost in DX11.
Probably the wrong thread, but Bulldozer had two 128-bit FPUs which could be combined into one 256-bit FPU when needed. (Or maybe it had just one 256-bit FPU and could run two 128-bit FPU operations on it at the same time?) A similar thing goes for graphics cards with FP32 vs FP64: FP32 is practically double the FP64 performance, or as much as 20 times faster, depending on the GPU configuration. IIRC the Titan could switch this mode on the fly, because the transistors are shared in a way.

With that in mind, I wonder how many more transistors per CU block AMD would need to add in order to double rendering/shader throughput for our standard 8-bit-per-color processing. They may already be at 80% of the required transistor structure, since those transistors are used for some other kind of operation. What I mean is that executing two different things at the same time in one hierarchical block should not be ignored. Imagine an ASIC built just for FP64 which has everything necessary for FP32 twice per FP64 block, but does not allow two simultaneous FP32 executions: it would get absolutely no speed benefit for FP32, and the only difference between FP32 and FP64 would be precision.

Look at Intel with HT; it is basically a technology made to push as much work into the CPU as possible. AMD with Bulldozer went with the opposite strategy: more workers, no heavy pushing, each doing the bare minimum work it was made for. It is like having a few super-fast trains in Japan where you push in as many people as you can, or having four times as many slow trains with organizational problems as a bonus to the slowness. Which is more efficient? Which is more costly? We want future silicon capable of processing more small things at the same time when a big thing is not using the shared architecture completely.
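To illustrate how much the FP64:FP32 ratio (rather than raw transistor count) dominates the double-precision figure, here is a tiny sketch with purely hypothetical shader counts and clocks; these are not the specs of any real card.

```python
def peak_fp32_gflops(shaders: int, clock_ghz: float) -> float:
    """Theoretical peak FP32: 2 FLOPs (one FMA) per shader per clock."""
    return shaders * 2 * clock_ghz

shaders, clock_ghz = 3072, 1.0          # hypothetical configuration
for label, fp64_ratio in [("1/2 ratio (compute-oriented)", 0.5),
                          ("1/32 ratio (gaming-oriented)", 1 / 32)]:
    fp32 = peak_fp32_gflops(shaders, clock_ghz)
    fp64 = fp32 * fp64_ratio
    print(f"{label}: FP32 {fp32:.0f} GFLOPS, FP64 {fp64:.0f} GFLOPS")
# The same shader array looks 2x or 32x slower at FP64 depending only on
# how the shared units are allowed to pair up, which is the point above.
```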
If sampling happens now, whatever comes out needs at least 4-5 months before it can go into actual mass production. There is also the question of how you design the memory controller with no memory samples in hand.
Not necessarily; you do not even need to know the final pin layout of the memory chip. All you need is the final specification for:
- how many signaling pins it uses,
- the way signals are processed,
- and the frequency it operates at (+ timings).
That's the same way AMD built Fiji: at a time when HBM only had early samples, AMD had early Fijis connected on an early interposer. The question of availability is valid, but GDDR5X is not a revolution; it is not a complex thing to make. Volume production for this is about as big a question as asking a factory in China to start producing a 130nm ASIC consisting of 2 million transistors. They'll ask you whether you want 10 million chips the first month, or 50 million.
You both have very valid points; we'll see around May. I believe that whoever launches with GDDR5X will do so with "soft" launches. We'll see.
There are rumors that Nvidia Pascal is an improved Maxwell core with HPC capabilities, which still lacks the crucial DX12 async compute feature. "According to our sources, NVIDIA's next GPU microarchitecture, Pascal, will be in trouble if it has to make heavy use of asynchronous compute code in video games. Broadly speaking, Pascal will be an improved version of Maxwell, especially regarding FP64 performance, but not regarding asynchronous compute performance. NVIDIA will bet on raw power instead of asynchronous compute abilities. This means that Pascal cards will be highly dependent on driver optimizations and game developers' goodwill. So GameWorks optimizations will play a fundamental role in the company's strategy. Is this why NVIDIA has made some GameWorks code publicly available?" Source: http://www.bitsandchips.it/52-english-news/6785-rumor-pascal-in-trouble-with-asyncronous-compute-code
NVIDIA are sneaky devils!! :banana:
Would you please stop filling the forums with that crap 🙂
Why is it crap? Stuff like that is very, very interesting to know and will help people decide on what cards to buy in the future.
NVIDIA are sneaky devils!! :banana: Why is it crap? Stuff like that is very, very interesting to know and will help people decide on what cards to buy in the future.
Nvidia GameWorks is there for effects that are optimized for Nvidia cards. AMD is welcome to do the same thing, and they have. So there is no fuss here, just more people spreading FUD and guessing. It's all rumor until the card launches, remember that. 🙂
GameWorks is there to create a separate software ecosystem by exploiting the dominance of NVIDIA hardware sales over the last year, increasing those sales in return. Don't read too much more into it, really. Whatever they open-sourced has already been optimized for, more or less. AMD is going for "open" because they really have no choice at this point, although they tend to be friendlier in the ecosystem than NVIDIA is (they are not known as the Graphics Mafia, after all 😀 )
NVIDIA are sneaky devils!! :banana: Why is it crap? Stuff like that is very, very interesting to know and will help people decide on what cards to buy in the future.
Ohhhh wow, you make your decisions based on a RUMOR from a non-reputable site with NO SOURCE. Nice, man, you really deserve those rumors so you can get the wrong hardware in your next build :stewpid: