Cray Shasta supercomputer holds almost 300,000 AMD processor cores

But can it run Crysis? Hah, sorry, I had to. 😀
and 112 NVIDIA Volta V100
Pretty sure it's going to have 0 V100s in it. It was going to use V100s, but oh, this is not the Indiana University Shasta supercomputer.
This is a small ripple in the ocean of supercomputers and data centers, but it will escalate and transform into a wave. Better competition in the data centers.
The entire basis of supercomputers is parallelism and thus being able to split your workload over as many available cores (CPU and GPU) as possible. It would be good to have some of these experienced folks share some of their highly tuned workload-splitting skills with the outside world... there are many areas that could benefit. [edit: talking primarily about algorithm/scaling skills]
CHESS POKER FIGHTER COMBAT GUERRILLA WARFARE DESERT WARFARE THEATERWIDE TACTICAL WARFARE CRYSIS GLOBAL THERMONUCLEAR WAR.
Undying:

But can it run Crysis? Hah, sorry, I had to. 😀
I do not see why not. Crysis can run on what today is the Microsoft Basic Render Driver (WARP) with a simple rename trick... https://docs.microsoft.com/en-us/windows/win32/direct3darticles/directx-warp#warp-architecture-and-performance It supports up to Direct3D 10, though I bet it doesn't take much advantage of modern CPUs; I suspect d3d10warp hasn't been updated much since Vista (according to Microsoft, WARP10 implements up to SSE 4.1). Note also that d3d10warp.dll is the only WARP implementation shipped with Windows outside the Windows SDK debug environment (there you can also find the updated WARP version for DirectX 12).
If they went with dual-socket 64-core systems, that would be 2268 servers. Considering it seems not every server has a GPU in it, they probably used 1U servers, which means they could fit about 8 in a single rack. That means they'd have about 284 racks. For a supercomputer, that's astonishingly small.
Undying:

But can it run Crysis? Hah, sorry, I had to. 😀
I don't suppose you saw the LTT video where they ran Crysis on software rendering using Epyc? Because that actually managed to breathe new life into an old meme.
Mesab67:

The entire basis of supercomputers is parallelism and thus being able to split your workload over as many available cores (CPU and GPU) as possible. It would be good to have some of these experienced folks share some of their highly tuned workload-splitting skills with the outside world... there are many areas that could benefit.
We're not all equal, and the US taxpayer bought this, therefore it will and should be used for US projects. But the beauty of this being in the hands of the Navy is that many DoD/Navy projects involve oceanography and mapping the tides and currents, which will happen to include a ton of civilian science projects. So there WILL be tons of residual data that universities will likely have access to.
schmidtbag:

If they went with dual-socket 64-core systems, that would be 2268 servers. Considering it seems not every server has a GPU in it, they probably used 1U servers, which means they could fit about 8 in a single rack. That means they'd have about 284 racks. For a supercomputer, that's astonishingly small.
That's at least 20 million bucks' worth of EPYC goodness chugging down about a megawatt of power at full chooch. And that's just with conservative Wikipedia-based estimates on the CPUs alone.
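For anyone curious how that ballpark shakes out, here's a rough sketch. It assumes the 64-core part is an EPYC 7742 at 225 W TDP and roughly $6,950 list price (both figures are my assumptions, not from the article), and it ignores GPUs, memory, storage and cooling entirely:
[CODE]
# Back-of-envelope check of the cost/power ballpark above.
# Assumes EPYC 7742: 225 W TDP, ~$6,950 list (assumptions, not from the article).
CHIPS = 2268 * 2        # dual-socket server count from the quoted post
TDP_WATTS = 225
LIST_PRICE_USD = 6_950

print(f"CPU power: {CHIPS * TDP_WATTS / 1e6:.2f} MW at full load")    # ~1.02 MW
print(f"CPU cost:  ${CHIPS * LIST_PRICE_USD / 1e6:.1f} million")      # ~$31.5 million
[/CODE]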
Department of Defense... nice to be in the business of killing people. I think that with this they can make lots of frags. :) I think the whole simulation would be about how many Blackwater mercenaries to hire. :( Do you remember Iraq in the 90s and the info about how many Scud missiles actually hit the target? Funny stats. :)
Just another example of "things you are not able to truly understand if you haven't used it at least once in your life". Hi Skynet! 😀
386SX:

Just another example of "things you are not able to truly understand if you haven't used it at least once in your life". Hi Skynet! 😀
I am sure a lot of people will laugh at the "Skynet" comments, but sooner or later it's going to happen. "Virus", that Donald Sutherland movie, is another scenario I see happening, where an alien entity (they exist and are already here, the public just doesn't know or have proof) sees humans as a virus to the planet and decides humans must all die, which again involved machines. [SPOILER="kind of turns into a rant about national debt"]With all the money the USA spends on this and other stuff, they could easily move expenses around and pay off the nation's debt before we are even more in debt to China, which, from what I read, is where we have been getting loans. Regardless of where it's coming from, we have 1 billion in debt last I looked, which came from the internet, which is full of truths?!! Unless what they mean by the nation's debt is the combined debt of all its citizens, in which case that will never be fixed; they want us in debt. Then we have people with 100k and more in debt that just keep spending outside their means, with the thought of "hey, I'm going to die soon, what are they going to get from me when I'm dead?" I know a few people like that.[/SPOILER]
schmidtbag:

If they went with dual-socket 64-core systems, that would be 2268 servers. Considering it seems not every server has a GPU in it, they probably used 1U servers, which means they could fit about 8 in a single rack. That means they'd have about 284 racks. For a supercomputer, that's astonishingly small.
They can have more than 8 in one rack. According to other releases, they use 1U boxes and put eight CPUs in each one. If they use the 64-core chip that means 4536 chips and 567 boxes. A standard rack is 42U high so they probably put 20 or so CPU boxes per rack and the rest of the rack is storage and networking. At twenty boxes per rack that means 28 racks are needed. Each V100 GPU will likely have two CPUs managing it, probably in a 2U box. Assuming 2U boxes, that will require three racks just for the GPUs if they are 1 GPU per box. This all totals less than three dozen racks. In relative terms, for a Top-25 supercomputer this is microscopic.
ruthan:

Department of Defense... nice to be in the business of killing people...
You're confusing the DoD with Planned Parenthood, bro.
Gomez Addams:

They can have more than 8 in one rack. According to other releases, they use 1U boxes and put eight CPUs in each one. If they use the 64-core chip that means 4536 chips and 567 boxes. A standard rack is 42U high so they probably put 20 or so CPU boxes per rack and the rest of the rack is storage and networking. At twenty boxes per rack that means 28 racks are needed. Each V100 GPU will likely have two CPUs managing it, probably in a 2U box. Assuming 2U boxes, that will require three racks just for the GPUs if they are 1 GPU per box. This all totals less than three dozen racks. In relative terms, for a Top-25 supercomputer this is microscopic.
Ah whoops I forgot about the taller 42Us (it's been a long time since I've been in a mainframe). All good points though. And yeah, it's almost funny how small this supercomputer is. It should be relatively easy to maintain.
schmidtbag:

Ah whoops I forgot about the taller 42Us (it's been a long time since I've been in a mainframe). All good points though. And yeah, it's almost funny how small this supercomputer is. It should be relatively easy to maintain.
I'm glad someone was able to do that math. Definitely sounds like a pretty small footprint compared to the ones that need an entire facility built to house the SC.
Mesab67:

The entire basis of supercomputers is parallelism and thus being able to split your workload over as many available cores (CPU and GPU) as possible. It would be good to have some of these experienced folks share some of their highly tuned workload-splitting skills with the outside world... there are many areas that could benefit. [edit: talking primarily about algorithm/scaling skills]
Parallelism isn't so much a people-skill issue, although I'm sure in some cases that's true. It's mostly down to the fact that some tasks are terrible at going parallel, and then there are the extra costs, because it costs more to implement parallelism. Each time you go parallel you have to have a master thread that sits above all the parallel threads, so there is some inherent overhead. Sometimes going parallel not only fails to improve performance, it can actually degrade it due to this overhead. These supercomputers, and even various GPU tasks like mining, already have what you could call given parallelism, which is very easy to code. What I think you want is found parallelism, where you have to put a lot of thought into whether you can do these things together or split them up into parallel subtasks. That really takes great programmers with deep knowledge of the software that is being tweaked to use parallelism.
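As a toy illustration of that overhead point (my own sketch in Python, nothing from the article): Amdahl's law with a per-worker coordination cost bolted on shows how adding workers stops helping and can eventually make things slower.
[CODE]
# Toy model: Amdahl's law plus a per-worker coordination cost.
def speedup(parallel_fraction, workers, overhead_per_worker=0.0):
    """Estimated speedup over a serial runtime normalized to 1.0.
    `parallel_fraction` of the work splits evenly across `workers`;
    each worker adds `overhead_per_worker` of coordination cost."""
    serial_part = 1.0 - parallel_fraction
    parallel_part = parallel_fraction / workers
    coordination = overhead_per_worker * workers
    return 1.0 / (serial_part + parallel_part + coordination)

if __name__ == "__main__":
    # A mostly-parallel task (95%) speeds up, peaks, then the coordination cost wins:
    for n in (2, 8, 32, 128):
        print(f"95% parallel, {n:3d} workers: {speedup(0.95, n, 0.01):.2f}x")
    # A task that is only 30% parallel drops below 1.0x (slower than serial) quickly:
    print(f"30% parallel,  64 workers: {speedup(0.30, 64, 0.01):.2f}x")
[/CODE]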
Undying:

But can it run Crysis? Hah, sorry, I had to. 😀
Well, seeing as how the 3990X can run Crysis without a GPU, the answer is an emphatic YES! 😀
Brasky:

I'm glad someone was able to do that math. Definitely sounds like a pretty small footprint compared to the ones that need an entire facility built to house the SC.
The calculations are pretty simple. Here are the numbers: 290,304 cores divided by 64 gives the number of chips = 4536. That number divided by 8 gives the number of boxes = 567. From there, it's some guessing about how the racks are organized. In the ones I have read about, they typically have one CPU box manage one storage box, which will contain hundreds of TBs of storage, if not PBs, and a very high-performance interconnect, at about PCIe v8 speeds, to extrapolate a bit. Here is the article about the Shasta architecture: https://www.anandtech.com/show/13616/managing-16-rome-cpus-in-1u-crays-shasta-direct-liquid-cooling. If they have gone away from 8 CPUs per 1U box then the number of boxes will change, obviously, but I don't think they have.
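The same arithmetic as a quick Python script, using the assumptions from the post above (64-core chips, 8 CPUs per 1U compute box, twenty or so compute boxes per 42U rack, with the rest of each rack left for storage and networking):
[CODE]
# Sanity check of the box/rack arithmetic, same assumptions as the post above.
import math

TOTAL_CORES = 290_304
CORES_PER_CHIP = 64
CHIPS_PER_BOX = 8
BOXES_PER_RACK = 20     # "twenty or so" compute boxes per 42U rack

chips = TOTAL_CORES // CORES_PER_CHIP       # 4536
boxes = math.ceil(chips / CHIPS_PER_BOX)    # 567
racks = math.ceil(boxes / BOXES_PER_RACK)   # 29 at a strict 20 per rack; "20 or so" gives the ~28 above

print(f"{chips} chips -> {boxes} 1U boxes -> ~{racks} racks for the CPUs")
[/CODE]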
I was thinking about this a bit more and it occurred to me they could take this route with the GPUs: https://i.servimg.com/u/f35/17/98/38/10/th/img_0810.jpg Given Cray/HPE's resources and the costs involved, this is a good possibility. This design would have much higher throughput than PCIe because it uses NVLink. That is 8 V100 GPUs in a 3U box. For 112 GPUs, 14 boxes would be required, in at least two racks. Those 14 boxes would occupy 42U, and CPUs and interconnections are also needed, so they might use three. On the other hand, they are liquid cooling the CPUs, so why not the GPUs too? That would change the packaging considerably, since they are at 1U on the CPUs.
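And the same sort of ballpark for the GPU side, under that 8-GPUs-per-3U-box assumption (the packaging is speculation, not something confirmed for this system):
[CODE]
# GPU-side ballpark under the speculative 8x V100 per 3U box packaging.
import math

GPUS = 112
GPUS_PER_BOX = 8
BOX_HEIGHT_U = 3
RACK_U = 42

boxes = math.ceil(GPUS / GPUS_PER_BOX)   # 14 boxes
rack_units = boxes * BOX_HEIGHT_U        # 42U of GPU boxes on their own
racks = math.ceil(rack_units / RACK_U)   # exactly one rack's worth; host CPUs and
                                         # switches are what push it to two or three

print(f"{boxes} GPU boxes, {rack_units}U total, >= {racks} rack(s) before host CPUs/switches")
[/CODE]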