Guru3D.com






FMA4 instruction set hidden, but is working on AMD Zen processors

by Hilbert Hagedoorn on: 10/15/2018 10:06 AM | source: Level1Techs | 10 comment(s)

In an interesting find, it has been discovered that AMD processors based on the Zen architecture actually support the FMA4 instruction set, a four-operand iteration of FMA. The theory was that FMA4 had been disabled for unknown reasons; however, as it seems, it is at the very least partially working and active.

FMA is short for fused multiply-add. It was added to the 2012 AMD FX series processors and has seen iterations leading up to FMA3 and FMA4. FMA is a floating-point multiply-add operation performed in a single step, with a single rounding. It is comparable to the Intel AVX instruction set, but more efficient, and FMA4 should be really fast. Reportedly FMA4 is 33% faster than FMA3; however, it is not reported as supported to the operating system. It was likely left disabled due to bugs or perhaps stability issues, as there must be a reason for it to remain disabled.
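To make the "single rounding" point concrete, here is a minimal Python sketch. It does not issue real FMA instructions; it models a fused multiply-add by computing the exact result with `fractions.Fraction` and rounding once at the end. The function names and the input values are chosen for illustration only.

```python
from fractions import Fraction


def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Model of FMA: compute a*b + c exactly, then round once to a double."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)  # the only rounding step


def separate_multiply_add(a: float, b: float, c: float) -> float:
    """Ordinary multiply then add: the product is rounded before the add."""
    return a * b + c


# Inputs chosen so that the exact product, 1 - 2**-60, does not fit in a double.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

fused = fused_multiply_add(a, b, c)
separate = separate_multiply_add(a, b, c)
print("fused:   ", fused)
print("separate:", separate)
```

With separate operations the product is rounded to exactly 1.0 before the addition, so the small term is lost; the fused model keeps it because rounding happens only once.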

  

  

Level1Techs tested this on Zen processors by running an adapted script that sends FMA4 instructions to the processor. Surprisingly, the FMA4 task fired off at the processor was not refused and executed successfully. It's an interesting find. Meanwhile, CPUID still reports FMA4 as not supported.
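For readers who want to see what the processor advertises, the following is a small sketch, assuming a Linux system where the kernel lists CPUID-derived feature flags in `/proc/cpuinfo` (`fma` for FMA3, `fma4` for FMA4). The helper name `advertised_flags` and the sample text are hypothetical and not taken from the Level1Techs test. The sketch only reads the advertised flags; it does not execute any FMA4 instruction.

```python
import os


def advertised_flags(cpuinfo_text: str) -> set:
    """Return the feature flags listed on the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


# Hypothetical excerpt, made up for illustration (not captured from real hardware).
SAMPLE = "processor\t: 0\nflags\t\t: fpu sse2 avx avx2 fma\n"

flags = advertised_flags(SAMPLE)
print("fma (FMA3) advertised:", "fma" in flags)
print("fma4 advertised:", "fma4" in flags)

# On a Linux machine, the real list can be read the same way.
CPUINFO_PATH = "/proc/cpuinfo"
if os.path.exists(CPUINFO_PATH):
    with open(CPUINFO_PATH) as handle:
        real_flags = advertised_flags(handle.read())
    print("this machine advertises fma4:", "fma4" in real_flags)
```

The flags are compared as whole words in a set, so `fma` does not accidentally match `fma4`.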

  








nevcairiel
Senior Member



Posts: 618
Joined: 2015-05-19

#5596332 Posted on: 10/15/2018 11:13 AM
In real-world code there really is no huge performance difference between FMA3 and FMA4, certainly not anything on the scale of 33%. Not sure where that number even comes from. 4 = 3 + 33%? :p

The comparisons in Wendell's video are of AVX vs FMA4, not accounting for FMA3.

RzrTrek
Senior Member



Posts: 2352
Joined: 2012-04-16

#5596333 Posted on: 10/15/2018 11:14 AM


There's the full in-depth video on the subject. I just love the guys over at Level1Techs.

They're very open and knowledgeable. Also, their weekly news episodes make me giggle like a girl.

Hilbert Hagedoorn
Don Vito Corleone



Posts: 36460
Joined: 2000-02-22

#5596335 Posted on: 10/15/2018 11:39 AM
Quote from nevcairiel: "In real-world code there really is no huge performance difference between FMA3 and FMA4, certainly not anything on the scale of 33%. Not sure where that number even comes from. 4 = 3 + 33%? :p The comparisons in Wendell's video are of AVX vs FMA4, not accounting for FMA3."

No, FMA4 has 33% higher throughput, because it processes four operands per instruction instead of three.

BLEH!
Senior Member



Posts: 5898
Joined: 2010-10-17

#5596336 Posted on: 10/15/2018 11:40 AM
So Zen could potentially be even faster than it already is???

asturur
Senior Member



Posts: 428
Joined: 2010-05-12

#5596337 Posted on: 10/15/2018 11:44 AM
In very narrow and particular scenarios, where you need tons of multiply-adds and whoever wrote the software enabled this feature, yes.

Fediuld
Senior Member



Posts: 108
Joined: 2016-10-04

#5596340 Posted on: 10/15/2018 11:48 AM
Quote from BLEH!: "So Zen could potentially be even faster than it already is???"

Only AMD knows why FMA4 isn't activated. Also, it is not an instruction set but a single instruction doing calculations.
Maybe because Intel doesn't support FMA3, AMD decided not to "officially" do anything about this either. Or there are issues with the results returned on the Ryzen 1xxx series.

BLEH!
Senior Member



Posts: 5898
Joined: 2010-10-17

#5596345 Posted on: 10/15/2018 12:20 PM
Quote from Fediuld: "Only AMD knows why FMA4 isn't activated. Also, it is not an instruction set but a single instruction doing calculations. Maybe because Intel doesn't support FMA3, AMD decided not to 'officially' do anything about this either. Or there are issues with the results returned on the Ryzen 1xxx series."

Could be any number of things.

nevcairiel
Senior Member



Posts: 618
Joined: 2015-05-19

#5596349 Posted on: 10/15/2018 12:24 PM
Quote from Hilbert Hagedoorn: "No, FMA4 has 33% higher throughput, because it processes four operands per instruction instead of three."

No, that's not how it works. Both FMA3 and FMA4 calculate the same thing; it's always "a * b + c", with 3 input operands. The only difference between FMA3 and FMA4 is where the result is stored.
In FMA3, you have to store the result in one of the input operands. It's called a "destructive" instruction because it overwrites one of the inputs.
In FMA4, you have a separate output register, a 4th operand, so it's "non-destructive".

If you wanted to simplify it, you could look at it like this:
FMA3: a = a * b + c (3 operands)
FMA4: d = a * b + c (4 operands)

The math is exactly the same and it saves you the same number of operations; only the location of the result differs.
This difference can have advantages in some algorithms, but in modern CPUs and the majority of algorithms, the impact from this difference is marginal at best, thanks to features like register renaming. An additional "move" instruction to copy the data into an additional register is often extremely cheap, if it's even needed in the algorithm in question.
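The destructive versus non-destructive distinction can be sketched with a toy register file in Python. This is a simplified model, not the real instruction encoding: the register names, the `fma3` and `fma4` helper functions, and the values are all made up for illustration, and plain Python arithmetic is used, so rounding behaviour is not modelled.

```python
def fma3(regs: dict, dst_src: str, b: str, c: str) -> None:
    """Destructive form: the first input register is overwritten with the result."""
    regs[dst_src] = regs[dst_src] * regs[b] + regs[c]


def fma4(regs: dict, dst: str, a: str, b: str, c: str) -> None:
    """Non-destructive form: the result goes to a separate fourth register."""
    regs[dst] = regs[a] * regs[b] + regs[c]


# FMA4 style: one instruction, and all three inputs survive.
regs4 = {"a": 2.0, "b": 3.0, "c": 4.0, "d": 0.0}
fma4(regs4, "d", "a", "b", "c")

# FMA3 style: copy "a" first if it is still needed, then overwrite the copy.
regs3 = {"a": 2.0, "b": 3.0, "c": 4.0, "d": 0.0}
regs3["d"] = regs3["a"]  # the extra "move"
fma3(regs3, "d", "b", "c")

print("FMA4 registers:", regs4)
print("FMA3 registers:", regs3)
```

Both sequences leave the same values in every register; the FMA3 version simply needs one extra copy when the overwritten input is still needed afterwards.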

Alessio1989
Senior Member



Posts: 1407
Joined: 2015-06-11

#5596366 Posted on: 10/15/2018 01:16 PM
This is old news. Moreover, it seems that using FMA4 on Zen gives wrong results: https://translate.google.it/translate?hl=it&sl=de&tl=en&u=http://www.planet3dnow.de/vbulletin/threads/421433-AMD-Zen-14nm-8-Kerne-95W-TDP-DDR4?p=5147746&viewfull=1

Quote from Fediuld: "Only AMD knows why FMA4 isn't activated. Also, it is not an instruction set but a single instruction doing calculations. Maybe because Intel doesn't support FMA3, AMD decided not to 'officially' do anything about this either. Or there are issues with the results returned on the Ryzen 1xxx series."

Intel has supported FMA3 since Haswell.

Astyanax
Senior Member



Posts: 3878
Joined: 2018-03-21

#5596479 Posted on: 10/15/2018 04:36 PM
Quote from Alessio1989: "This is old news. Moreover, it seems that using FMA4 on Zen gives wrong results: https://translate.google.it/translate?hl=it&sl=de&tl=en&u=http://www.planet3dnow.de/vbulletin/threads/421433-AMD-Zen-14nm-8-Kerne-95W-TDP-DDR4?p=5147746&viewfull=1 Intel has supported FMA3 since Haswell."

Very very old news.

https://www.agner.org/optimize/blog/read.php?i=838

news should be reworded to "in a video that could only be considered clickbaiting, level1techs has reported on something we already knew a year ago"



Guru3D.com © 2019