
ASUS ARES II review - The graphics engine architecture

by Hilbert Hagedoorn on: 01/28/2013 05:53 PM


The graphics engine architecture

AMD has moved away from the VLIW5 and VLIW4 architectures we saw in the last generation of products. If anything, VLIW4 showed certain inefficiencies in the Radeon HD 6900 series, and while VLIW designs are fine for graphics they are not so grand for computing.

The latest graphics core architecture is marketed as GCN, short for Graphics Core Next. Its basic building block has changed significantly to remove certain inefficiencies seen in the VLIW architecture.

GCN is in essence the basis of a GPU that performs well at both graphics and compute tasks. For the compute side of things a new GCN Compute Unit model has been introduced, designed for better utilization, high throughput and multitasking. In other words: performance, performance, performance.

So your basic new shader cluster is called a (GCN) Compute Unit:

  • Non-VLIW Design
  • 16 wide SIMD Units
  • 64 KB registers / SIMD Unit

Now if we take four of these SIMD units, they form the basis of one Compute Unit (CU). Each SIMD unit is 16 wide, and with four per compute unit that means each CU has 64 shader processors. The GPU has 32 Compute Units, meaning 64 shader processors x 32 CUs = 2048 shader processors (for the Radeon HD 7970).
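The arithmetic above can be written out as a tiny sketch. The counts come straight from the text; the constant and function names are made up for illustration and are not part of any AMD tool or API.

```python
# Counts quoted in the article; all names here are illustrative only.
LANES_PER_SIMD = 16      # each SIMD unit is 16 wide
SIMDS_PER_CU = 4         # four SIMD units form one Compute Unit
COMPUTE_UNITS = 32       # Radeon HD 7970 (Tahiti) configuration
REGISTER_KB_PER_SIMD = 64


def shader_processors(compute_units=COMPUTE_UNITS,
                      simds_per_cu=SIMDS_PER_CU,
                      lanes_per_simd=LANES_PER_SIMD):
    """Total shader processors for a given Compute Unit count."""
    return compute_units * simds_per_cu * lanes_per_simd


shaders_per_cu = SIMDS_PER_CU * LANES_PER_SIMD
vector_register_kb_per_cu = SIMDS_PER_CU * REGISTER_KB_PER_SIMD

print(shaders_per_cu)             # 64
print(shader_processors())        # 2048
print(vector_register_kb_per_cu)  # 256
```

Passing a different `compute_units` value gives the shader count for a cut-down part, assuming the per-CU layout stays the same.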

  • Engine has Dual Geometry engines / Asynchronous Compute engines
  • 8 render backends / 32 color ROPs per clock cycle / 128 Z/Stencil ROPs per clock
  • Engine ties to 768KB R/W L2 cache
  • Tahiti series GPUs have up to 32 Compute Units

The Graphics Core Next Compute Unit (CU) has about the same floating point power per clock as its predecessor (i.e. a Cayman SIMD). It also has the same amount of register space (for the vector units). Each CU also has its own registers and local data share.

Again: one compute unit, just like a Cayman SIMD, is a collection of shader processors; four SIMDs form one compute unit. Cayman's (6900) problem was that it was not so efficient at handling multiple tasks at once.

Cayman has 16 4-wide VLIW processing elements for a total of 16x4=64 operations in parallel, while the new architecture has 4 16-wide vector processors, again for a total of 4x16=64 operations per clock. GCN also has a scalar processor that Cayman does not.

In its bare essence the distinction is that GCN does not need instruction-level parallelism: each of the four 16-wide SIMD vector units executes a different wavefront, with a whole 64-wide wavefront taking four cycles to pass through its SIMD.
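To make the four-cycle figure concrete, here is a minimal sketch of how one 64-wide wavefront could be split across a 16-wide SIMD. The function name is hypothetical, and the order in which groups of work-items are processed is an assumption for illustration, not something the article states.

```python
WAVEFRONT_SIZE = 64  # work-items per wavefront, as quoted in the article
SIMD_WIDTH = 16      # lanes per SIMD vector unit


def issue_schedule(wavefront_size=WAVEFRONT_SIZE, simd_width=SIMD_WIDTH):
    """Split one wavefront into per-cycle groups of work-items.

    Assumes work-items are processed in index order, one SIMD-width
    group per cycle (an illustrative simplification).
    """
    return [range(start, min(start + simd_width, wavefront_size))
            for start in range(0, wavefront_size, simd_width)]


schedule = issue_schedule()
for cycle, items in enumerate(schedule):
    print(f"cycle {cycle}: work-items {items.start}..{items.stop - 1}")

print(len(schedule))  # 4 cycles for one instruction of one wavefront
```

With four SIMDs per Compute Unit each working through its own wavefront this way, the CU as a whole still processes 4 x 16 = 64 lanes every clock.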

Radeon HD 7970

So the theoretical floating point power stays more or less the same per CU, but GCN will be more efficient since it does not require instruction-level parallelism (we assume it costs some more area/transistors as well). As a result, compiling also becomes much less complicated, which means more efficiency and thus, there it is again, better performance.
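A toy model helps show why dropping the instruction-level parallelism requirement matters. It assumes a Cayman-style unit only fills as many of its four VLIW slots as the compiler finds independent operations, while a GCN Compute Unit keeps a SIMD fully busy whenever a wavefront is ready for it. It ignores memory latency, clocks and everything else, so treat it as a sketch and not as a performance prediction; all names are hypothetical.

```python
# Figures from the article: 16 x 4-wide VLIW (Cayman) vs 4 x 16-wide SIMD (GCN).
VLIW_PROCESSING_ELEMENTS = 16
VLIW_SLOTS = 4
GCN_SIMDS = 4
GCN_SIMD_WIDTH = 16


def vliw4_ops_per_clock(independent_ops):
    """Operations per clock when only `independent_ops` slots can be packed."""
    filled_slots = max(0, min(independent_ops, VLIW_SLOTS))
    return VLIW_PROCESSING_ELEMENTS * filled_slots


def gcn_ops_per_clock(ready_wavefronts):
    """Operations per clock when `ready_wavefronts` wavefronts are available."""
    busy_simds = max(0, min(ready_wavefronts, GCN_SIMDS))
    return busy_simds * GCN_SIMD_WIDTH


for ilp in range(1, VLIW_SLOTS + 1):
    print(ilp, vliw4_ops_per_clock(ilp), gcn_ops_per_clock(GCN_SIMDS))
# 1 16 64
# 2 32 64
# 3 48 64
# 4 64 64
```

In this simplified picture both designs peak at 64 operations per clock, but the VLIW4 unit only gets there when the compiler can pack all four slots, whereas the GCN unit gets there by having enough wavefronts in flight.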

GCN is all about creating a GPU that is good at both graphics and compute. Oh, and all the compute units combined with the other ASIC components form the GPU. See, easy peasy, right? :)






