ASUS GeForce GTX 780 DirectCU II OC review - Kepler GK110-400 graphics architecture

by Hilbert Hagedoorn on: 06/14/2013 11:45 AM


Kepler GK110-400 graphics architecture

As you can understand, the combination of large memory partitions, a wide bus and GDDR5 memory (quad data rate) allows the GPU to work with a very high effective framebuffer bandwidth. Let's again put most of the data in a chart to get a better overview of the changes:

| Graphics card | GeForce GTX 480 | GeForce GTX 580 | GeForce GTX 680 | GeForce GTX 780 | GeForce GTX Titan |
|---|---|---|---|---|---|
| Fabrication node | 40nm | 40nm | 28nm | 28nm | 28nm |
| Shader processors | 480 | 512 | 1536 | 2304 | 2688 |
| Streaming Multiprocessors (SMX) | 15 | 16 | 8 | 12 | 14 |
| Texture Units | 60 | 64 | 128 | 192 | 224 |
| ROP units | 48 | 48 | 32 | 48 | 48 |
| Graphics Clock (Core) | 700 MHz | 772 MHz | 1006/1058 MHz | 863/900 MHz | 836/876 MHz |
| Shader Processor Clock | 1401 MHz | 1544 MHz | 1006/1058 MHz | 863/900 MHz | 836/876 MHz |
| Memory Clock / Data rate | 924 MHz / 3696 MHz | 1000 MHz / 4000 MHz | 1502 MHz / 6008 MHz | 1502 MHz / 6008 MHz | 1502 MHz / 6008 MHz |
| Graphics memory | 1536 MB | 1536 MB | 2048 MB | 3072 MB | 6144 MB |
| Memory interface | 384-bit | 384-bit | 256-bit | 384-bit | 384-bit |
| Memory bandwidth | 177 GB/s | 192 GB/s | 192 GB/s | 288 GB/s | 288 GB/s |
| Power connectors | 1x6-pin PEG, 1x8-pin PEG | 1x6-pin PEG, 1x8-pin PEG | 2x6-pin PEG | 1x6-pin PEG, 1x8-pin PEG | 1x6-pin PEG, 1x8-pin PEG |
| Max board power (TDP) | 250 Watts | 244 Watts | 170 Watts | 250 Watts | 250 Watts |
| Recommended Power supply | 600 Watts | 600 Watts | 550 Watts | 600 Watts | 600 Watts |
| GPU Thermal Threshold | 105 degrees C | 97 degrees C | 98 degrees C | 95 degrees C | 95 degrees C |

So we talked about the core clocks, specifications and memory partitions. Obviously there's a lot more to talk through. We feel that to understand a graphics processor, you simply need to break it down into small pieces. Let's first look at the raw data that most of you can grasp. This bit is about the Kepler GK110 architecture; if you're not interested in geek talk, by all means please browse to the next page.

Right, so have a close look at the GK110 die as shown above. You'll notice the five green clusters. These are the GPCs (Graphics Processing Clusters), each containing 3 SMX clusters, so 5 x 3 = 15 SMX clusters in total. You'll spot six 64-bit memory interfaces, bringing in a 384-bit path towards the graphics memory. That's instant extra memory bandwidth by the way: combined with a 6 Gbps data rate, the cards can reach 288 GB/sec.
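
The 288 GB/sec figure follows from the bus width and the effective data rate. Here is a minimal Python sketch of that arithmetic, using the bus widths and data rates from the chart; the function name `memory_bandwidth_gbs` is just an illustrative choice, not anything from Nvidia's tooling.

```python
def memory_bandwidth_gbs(bus_width_bits: int, data_rate_mhz: float) -> float:
    """Peak memory bandwidth in GB/s (1 GB = 10^9 bytes).

    bus_width_bits: width of the memory interface, e.g. 384
    data_rate_mhz:  effective (quad data rate) transfer rate, e.g. 6008
    """
    bytes_per_transfer = bus_width_bits / 8          # bits -> bytes
    transfers_per_second = data_rate_mhz * 1e6       # MHz -> transfers/s
    return bytes_per_transfer * transfers_per_second / 1e9


# Bus width (bits) and effective data rate (MHz), taken from the chart.
cards = {
    "GeForce GTX 480": (384, 3696),
    "GeForce GTX 580": (384, 4000),
    "GeForce GTX 680": (256, 6008),
    "GeForce GTX 780": (384, 6008),
    "GeForce GTX Titan": (384, 6008),
}

for name, (bus_width, data_rate) in cards.items():
    print(f"{name}: {memory_bandwidth_gbs(bus_width, data_rate):.1f} GB/s")
```

For the GTX 780 this works out to 48 bytes per transfer times 6008 million transfers per second, or roughly 288 GB/sec, which matches the chart once rounded.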
 

 


So above we see the GK110 block diagram, which lays out the Kepler architecture. Let's break it down into bits and pieces. The GK110-400 will have:

  • 2688 (Titan) or 2304 (GTX 780) CUDA processors (Shader cores)
  • 192 CUDA cores per cluster (SMX).  
The more important thing to focus on is the SM (block of shader processors) cluster, or SMX as Nvidia likes to call it since the GTX 600 series, which has 192 shader processors.
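
Since every SMX carries 192 CUDA cores, the shader count of each card is simply the number of active SMX clusters times 192. A small Python sketch of that relation (the names `CORES_PER_SMX` and `cuda_cores` are made up for illustration):

```python
CORES_PER_SMX = 192  # single-precision CUDA cores in one Kepler SMX


def cuda_cores(active_smx: int) -> int:
    """Total CUDA cores for a Kepler GPU with the given number of active SMXes."""
    return active_smx * CORES_PER_SMX


# Active SMX counts as listed in the chart, plus the full 15-SMX GK110 die.
for name, smx in [("GTX 680", 8), ("GTX 780", 12), ("GTX Titan", 14), ("full GK110", 15)]:
    print(f"{name}: {smx} SMX x {CORES_PER_SMX} = {cuda_cores(smx)} CUDA cores")
```

With 12 SMXes that gives the 2304 cores of the GTX 780, and with 14 SMXes the 2688 cores of the Titan.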
 
 
SMX: 192 single‐precision CUDA cores, 64 double‐precision units, 32 special function units (SFU), and 32 load/store units.
 

When we zoom in even further at one SMX cluster (192 shader processors) we see a change from the GK104 (GTX 680), as there are 64 double-precision math units.
 
See, the GeForce GTX 680 SMX had 192 single-precision (SP) floating point CUDA cores and 8 double-precision (DP) CUDA cores. As a result, DP operations per clock ran at effectively 1/24 the SP rate. The GTX Titan's SMX includes a full 64 DP CUDA cores (compared to 192 SP CUDA cores), or 1/3rd the number of DP cores to SP, for substantially more double-precision horsepower. So a full 15-SMX GK110 chip has 960 DP units linked to its total of 2,880 CUDA cores; that works out to 896 DP units with the Titan's 14 SMXes, and 768 DP units on the tested GTX 780 with 12 activated SMXes. Double-precision wise, to unlock full performance you must open the Nvidia Control Panel and navigate to “Manage 3D Settings”. In the Global Settings box you will find an option titled “CUDA – Double Precision” which needs to be enabled, but... GeForce GTX Titan and GTX 780 will run at reduced clock speeds when full double-precision is enabled. Still a great option if you are working on CUDA applications.
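
The DP numbers can be checked with a few lines of Python. This is only a sketch of the arithmetic above: 64 DP units per GK110 SMX, 8 per GK104 SMX, and 192 SP cores per SMX in both cases. The helper names are illustrative.

```python
from fractions import Fraction

SP_PER_SMX = 192  # single-precision CUDA cores per SMX (GK104 and GK110)


def dp_to_sp_ratio(dp_per_smx: int) -> Fraction:
    """Ratio of double-precision units to single-precision cores in one SMX."""
    return Fraction(dp_per_smx, SP_PER_SMX)


def dp_units(active_smx: int, dp_per_smx: int) -> int:
    """Total double-precision units across all active SMXes."""
    return active_smx * dp_per_smx


print("GK104 DP:SP ratio:", dp_to_sp_ratio(8))    # 8 DP units per SMX
print("GK110 DP:SP ratio:", dp_to_sp_ratio(64))   # 64 DP units per SMX

for label, smx in [("12 SMX", 12), ("14 SMX", 14), ("15 SMX (full die)", 15)]:
    print(f"GK110 with {label}: {dp_units(smx, 64)} DP units")
```

Multiplying 64 DP units by the active SMX count gives 768 for 12 SMXes, 896 for 14 and 960 for the full 15-SMX die.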

The SMX has quite a bit more bite in terms of shader, texture and geometry processing. With 192 CUDA cores it has six times the number of cores per SM compared to Fermi. Further down the pipeline we run into the ROP (Raster Operation) engines; the GK110 has 48 of them for features like pixel blending and AA. The GK110 has 64KB of L1 cache for each SMX plus a special 48KB texture unit memory that can be utilized as a read-only cache. The L2 cache, shared across the SMX units, measures 1.5MB, up from 512KB on the GK104. The GPU's texture units are a valuable resource for compute programs that need to sample or filter image data. The texture throughput in Kepler is significantly increased compared to Fermi: each SMX unit contains 16 texture filtering units.

  • GeForce GTX 580 has 16 SM x 4 Texture units = 64
  • GeForce GTX 680 has 8 SMX x 16 Texture units = 128
  • GeForce GTX 780 has 12 SMX x 16 Texture units = 192
  • GeForce GTX Titan has 14 SMX x 16 Texture units = 224

So there's a total of 15 SMX x 16 TU = 240 texture filtering units available on the GK110 silicon itself (if all SMXes were enabled). Still with me?
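
For those who like to verify the math, the texture unit totals above are a simple product of cluster count and texture units per cluster. A short Python sketch, with the per-cluster figures taken from the list above (the `texture_units` helper is hypothetical):

```python
def texture_units(clusters: int, units_per_cluster: int) -> int:
    """Total texture filtering units: active SM/SMX count times units per cluster."""
    return clusters * units_per_cluster


# (card, active clusters, texture units per cluster) as listed in the article.
configs = [
    ("GeForce GTX 580", 16, 4),     # Fermi: 4 texture units per SM
    ("GeForce GTX 680", 8, 16),     # Kepler: 16 texture units per SMX
    ("GeForce GTX 780", 12, 16),
    ("GeForce GTX Titan", 14, 16),
    ("Full GK110 die", 15, 16),
]

for name, clusters, per_cluster in configs:
    total = texture_units(clusters, per_cluster)
    print(f"{name}: {clusters} x {per_cluster} = {total}")
```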



