

Samsung Develops Industry First High Bandwidth Memory with AI Processing Power

by Hilbert Hagedoorn on: 02/17/2021 09:31 AM | 4 comment(s)

Samsung Electronics, the world leader in advanced memory technology, today announced that it has developed the industry's first High Bandwidth Memory (HBM) integrated with artificial intelligence (AI) processing power—the HBM-PIM.

The new processing-in-memory (PIM) architecture brings powerful AI computing capabilities inside high-performance memory, to accelerate large-scale processing in data centers, high performance computing (HPC) systems and AI-enabled mobile applications.

Kwangil Park, senior vice president of Memory Product Planning at Samsung Electronics stated, "Our groundbreaking HBM-PIM is the industry's first programmable PIM solution tailored for diverse AI-driven workloads such as HPC, training and inference. We plan to build upon this breakthrough by further collaborating with AI solution providers for even more advanced PIM-powered applications."

Rick Stevens, Argonne's Associate Laboratory Director for Computing, Environment and Life Sciences commented, "I'm delighted to see that Samsung is addressing the memory bandwidth/power challenges for HPC and AI computing. HBM-PIM design has demonstrated impressive performance and power gains on important classes of AI applications, so we look forward to working together to evaluate its performance on additional problems of interest to Argonne National Laboratory."

Most of today's computing systems are based on the von Neumann architecture, which uses separate processor and memory units to carry out millions of intricate data processing tasks. This sequential approach requires data to constantly move back and forth, resulting in a system-slowing bottleneck, especially when handling ever-increasing volumes of data.

Instead, the HBM-PIM brings processing power directly to where the data is stored by placing a DRAM-optimized AI engine inside each memory bank (a storage sub-unit), enabling parallel processing and minimizing data movement. When applied to Samsung's existing HBM2 Aquabolt solution, the new architecture delivers more than twice the system performance while reducing energy consumption by more than 70%. The HBM-PIM also does not require any hardware or software changes, allowing faster integration into existing systems.
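As a rough illustration of the per-bank idea (a toy Python sketch, not Samsung's actual design — the `pim_style_fma` helper and its fused multiply-add operation are made up for illustration): each bank applies a simple AI-style operation to its own data in place, so no bank's data has to travel to a central processor first.

```python
def pim_style_fma(banks, weights, bias):
    """Apply y = w*x + b inside each 'bank' (hypothetical per-bank engine).

    In real PIM hardware each bank's engine would run concurrently;
    the Python loop merely stands in for that parallelism.
    """
    return [[w * x + bias for x in bank] for bank, w in zip(banks, weights)]

# 4 memory banks, each holding 4 values
banks = [list(range(4)) for _ in range(4)]
weights = [1.0, 2.0, 0.5, -1.0]   # one coefficient per bank's engine
results = pim_style_fma(banks, weights, bias=1.0)
```

The point of the sketch is the data-movement pattern: each bank's engine only ever touches the values stored in that bank, which is what lets the architecture cut the back-and-forth traffic described above.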

Samsung's paper on the HBM-PIM has been selected for presentation at the renowned International Solid-State Circuits Conference (ISSCC), held virtually through Feb. 22. Samsung's HBM-PIM is now being tested inside AI accelerators by leading AI solution partners, with all validations expected to be completed within the first half of this year.










Fox2232
Senior Member



Posts: 11809
Joined: 2012-07-20

#5888158 Posted on: 02/17/2021 01:28 PM
I wonder how that's secured, and how many more transistors the AI functions use compared to plain memory cells. (In other words: how much more expensive is this per GB, and what capacities are available?)

In applications that have a use for it, I'm sure it will pay for itself. But the question is what kind of AI operations it can do, and how those translate to the operations GPUs do. There I expect it will be largely irrelevant unless scale becomes more important than complexity and latency.

nosirrahx
Senior Member



Posts: 410
Joined: 2013-04-05

#5888195 Posted on: 02/17/2021 03:23 PM

Fox2232 wrote: "In applications where there is use for it, I am sure it will pay itself off. But question is what kind of AI operations it can do. And how it will translate to Operations done by GPUs. Where I expect it will be highly irrelevant unless scale becomes more important than complexity and latency."

At the very least, something like this could be used for AI upscaling, or maybe even for using AI to generate entire intermediate frames. Having a chunk of very fast memory designed specifically for AI integrated into GPUs, which are already starting to handle AI tasks, seems like a natural progression. I can also see AI tricks improving the look of older games. Imagine, for example, an older game that is locked to 60 FPS. Perhaps we will start seeing AI create intermediary frames, raising that to a fluid 120 FPS without a single modification to the actual game.

waltc3
Senior Member



Posts: 1401
Joined: 2014-07-22

#5888324 Posted on: 02/17/2021 11:34 PM
Wake me when it ships... ;)

Fox2232
Senior Member



Posts: 11809
Joined: 2012-07-20

#5888591 Posted on: 02/18/2021 07:30 PM
nosirrahx wrote: "At the very least something like this could be implemented in AI upscaling or maybe even for using AI to generate entire intermediate frames. Having a chunk of very fast memory designed specifically for AI integrated into GPUs that are starting to handle AI tasks already seems like a natural progression. I can also see AI tricks to improve the look of older games. Imagine for an example an older game that was locked to 60FPS. Perhaps we will start seeing AI creating intermediary frames and increasing that to a fluid 120FPS without a single modification to the actual game."

The problem is energy. In-memory computing is interesting because it saves a lot of energy: data does not have to be moved into the CPU/GPU for compatible tasks.

But when only part of the math can be done in memory, the data has to be moved into the GPU for the rest, which completely defeats the purpose of in-memory math. It's the same as doing the primary math in the GPU (meaning the data was already read into the GPU), then moving the results to memory for further processing, and then back to the GPU again.

In the case of frame interpolation for doubling the frame rate, what's needed is motion-vector detection: identifying data blocks to move and moving them appropriately. (Basically the work of a video encoder/decoder.)

If you could do that in memory, you'd have the perfect product for streaming services. But wouldn't that be the same as keeping the last few frames in the GPU's cache (like Infinity Cache) and letting specialized video encode/decode hardware do the job without ever moving the data to video memory?

I was really surprised, years ago when people started talking about frame-rate doubling in TVs, that we didn't have this as a poor man's feature in GPUs, because in contrast to a video stream, the GPU has each frame available in a lossless state.
There are media players that do it the easy way: they take the motion-vector data, halve it for a given frame, and place the resulting motion in between. Analyzing motion vectors from multiple upcoming and past frames could even be used to build motion splines instead of mere vectors, which would be more accurate and could deliver even more in-between frames accurately.
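The "easy way" described above can be sketched in a few lines (a hypothetical toy in Python — the block layout, the vector format, and the `midframe_from_vectors` helper are assumptions for illustration, not any real player's code): each block of the previous frame is shifted by half its motion vector to approximate the halfway-in-time frame.

```python
def midframe_from_vectors(frame, vectors, block=4):
    """Build an intermediate frame by moving each block half its motion vector.

    frame:   2D list of pixel values (H x W)
    vectors: dict mapping (block_row, block_col) -> (dy, dx) full-frame motion
    """
    h, w = len(frame), len(frame[0])
    mid = [[0] * w for _ in range(h)]
    for (br, bc), (dy, dx) in vectors.items():
        y, x = br * block, bc * block
        # halve the motion vector for the frame halfway in time
        hy, hx = y + dy // 2, x + dx // 2
        if 0 <= hy <= h - block and 0 <= hx <= w - block:
            for r in range(block):
                for c in range(block):
                    mid[hy + r][hx + c] = frame[y + r][x + c]
    return mid

# 8x8 toy frame; the top-left block moves 4 pixels down between two
# frames, so the interpolated frame places it 2 pixels down.
frame = [[r * 8 + c for c in range(8)] for r in range(8)]
mid = midframe_from_vectors(frame, {(0, 0): (4, 0)})
```

Real players of course add occlusion handling and blend contributions from both neighboring frames; this only shows the halved-vector placement itself.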

Now, while I don't like the idea of fake frames, I'm much less against doubling fps from 60 to 120 than I am against the various upscaling methods.
(Especially AMD's lazy approach: they bought HiAlgo and butchered its Boost method in the worst possible way, though at least they delivered the Chill method acceptably.)
And I hope the upcoming update AMD is making for the Boost feature really changes what is about the most ill-conceived approach you can think of.



Guru3D.com © 2022