Guru3D.com

EVGA GeForce GTX 670 SC review - The graphics architecture that is Kepler

by Hilbert Hagedoorn on: 05/11/2012 01:00 PM

 

The graphics architecture that is Kepler

As you can understand, the wide memory partitions and bus width, combined with GDDR5 memory (quad data rate), give the GPU a very high effective framebuffer bandwidth. Let's again put most of the data in a chart to get a better overview of the changes:

| Graphics card | GeForce GTX 580 | GeForce GTX 670 | EVGA GeForce GTX 670 SC | GeForce GTX 680 | GeForce GTX 690 |
|---|---|---|---|---|---|
| Fabrication node | 40nm | 28nm | 28nm | 28nm | 28nm |
| Shader processors | 512 | 1344 | 1344 | 1536 | 3072 |
| Streaming Multiprocessors (SM) | 16 | 7 | 7 | 8 | 16 |
| Texture Units | 64 | 112 | 112 | 128 | 128x2 |
| ROP units | 48 | 32 | 32 | 32 | 32x2 |
| Graphics Clock (Core) | 772 MHz | 915 / 980 MHz | 967 / 1046 MHz | 1006 / 1058 MHz | 915 / 1019 MHz |
| Shader Processor Clock | 1544 MHz | 915 / 980 MHz | 967 / 1046 MHz | 1006 / 1058 MHz | 915 / 1019 MHz |
| Memory Clock / Data rate (MHz) | 1000 / 4000 | 1502 / 6008 | 3105 / 6210 | 1502 / 6008 | 1502 / 6008 |
| Graphics memory | 1536 MB | 2048 MB | 2048 MB | 2048 MB | 4096 MB |
| Memory interface | 384-bit | 256-bit | 256-bit | 256-bit | 256-bit |
| Memory bandwidth | 192 GB/s | 192 GB/s | 199 GB/s | 192 GB/s | 192 GB/s |
| Power connectors | 1x6-pin PEG, 1x8-pin PEG | 2x6-pin PEG | 2x6-pin PEG | 2x6-pin PEG | 2x8-pin PEG |
| Max board power (TDP) | 244 Watts | 170 Watts | 170 Watts | 170 Watts | 300 Watts |
| Recommended Power supply | 600 Watts | 550 Watts | 550 Watts | 550 Watts | 750 Watts |
| GPU Thermal Threshold | 97 degrees C | 98 degrees C | 98 degrees C | 98 degrees C | 98 degrees C |
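
The memory bandwidth row follows from the data rate and the bus width. As a rough sketch in Python (the function name and the dictionary are ours, purely for illustration; the figures are the ones from the chart), peak bandwidth is the effective data rate times the number of bytes moved per transfer:

```python
def memory_bandwidth_gbs(data_rate_mhz: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s: transfers per second times bytes per transfer."""
    bytes_per_transfer = bus_width_bits / 8
    return data_rate_mhz * 1e6 * bytes_per_transfer / 1e9


# (effective data rate in MHz, bus width in bits), taken from the chart
cards = {
    "GeForce GTX 580": (4000, 384),
    "GeForce GTX 670": (6008, 256),
    "EVGA GeForce GTX 670 SC": (6210, 256),
    "GeForce GTX 680": (6008, 256),
}

for name, (rate, width) in cards.items():
    print(f"{name}: {memory_bandwidth_gbs(rate, width):.1f} GB/s")
```

This also shows why the 256-bit Kepler cards keep up with the 384-bit GTX 580: the narrower bus is offset by the much higher data rate.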

So we talked about the core clocks, specifications and memory partitions. Obviously there's a lot more to talk through, the GPU architecture for example. To understand a graphics processor you simply need to break it down into pieces.

Let's first look at the raw data that most of you can understand and grasp. This bit is about the Kepler architecture; if you're not interested in g33k talk, by all means browse to the next page.

[Image: GeForce GTX 680, GK104 block diagram]

Above we see the GK104 block diagram, which lays out the Kepler architecture. Let's break it down into bits and pieces. A fully enabled GK104 has:

  • 1536 CUDA processors (Shader cores)
  • 8 CUDA core clusters (SMX) of 192 cores each
  • 8 geometry units
  • 4 raster Units
  • 128 Texture Units
  • 32 ROP engines
  • 256-bit GDDR5 memory bus
  • DirectX 11.1

That is a fully enabled GK104 as used on the GTX 680. The GTX 670 uses the same chip, but has one SM (CUDA / shader core cluster) disabled. The more important thing to focus on is the SM (a block of shader processors), or SMX as NVIDIA likes to call it on Kepler, which has 192 shader processors. That's radically different from Fermi; the GeForce GTX 580, for example, had 32 shader processors per SM. 1536 / 192 = 8 shader clusters (SMs). Let's blow up one such cluster:

[Image: GeForce GTX 680, SMX block diagram]

Above is the block diagram for a single shader processor cluster, aka SM, or SMX as NVIDIA now calls it. The new SMX has quite a bit more bite in terms of shader, texture and geometry processing. It holds 192 CUDA cores, six times the number of cores per SM in Fermi. At the end of the pipeline we run into the ROP (Raster Operation) engines; the GTX 680 has 32 of them for features like pixel blending and AA.

There's a total of 128 texture filtering units available for the GeForce GTX 680. The math is simple here: each Kepler SM has 16 texture units tied to it, where a Fermi SM had 4.

  • GeForce GTX 580 has 16 SMs x 4 texture units = 64
  • GeForce GTX 670 has 7 SMs x 16 texture units = 112
  • GeForce GTX 680 has 8 SMs x 16 texture units = 128
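
The same per-SM arithmetic gives the shader processor counts as well. A minimal sketch, assuming only the per-SM figures quoted in this article (the table names and the helper function are made up for illustration):

```python
# Per-SM resources as quoted in this article (illustrative names)
CORES_PER_SM = {"Fermi": 32, "Kepler": 192}
TEXTURE_UNITS_PER_SM = {"Fermi": 4, "Kepler": 16}


def gpu_totals(architecture: str, sm_count: int) -> tuple[int, int]:
    """Return (shader processors, texture units) for a number of active SMs."""
    cores = sm_count * CORES_PER_SM[architecture]
    texture_units = sm_count * TEXTURE_UNITS_PER_SM[architecture]
    return cores, texture_units


for name, arch, sms in [
    ("GeForce GTX 580", "Fermi", 16),
    ("GeForce GTX 670", "Kepler", 7),
    ("GeForce GTX 680", "Kepler", 8),
]:
    cores, tmus = gpu_totals(arch, sms)
    print(f"{name}: {sms} SMs -> {cores} shader processors, {tmus} texture units")
```

Disabling a single SMX on the GTX 670 thus costs 192 shader processors and 16 texture units in one go.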

Above is the GK104 layout: the host interface, the GigaThread engine, four GPCs, four memory controllers, the ROP partitions and the L2 cache. Each GPC holds two SMX clusters, and each SMX has its own PolyMorph engine, eight in total. The ROP partitions sit next to the L2 cache, and each shader cluster is tied to its own L1 and the shared L2 cache. Shading performance is going to be increased quite a bit, and geometry performance will get a nice boost as well. NVIDIA is using 64 KB of shared memory/L1 per SMX; please note that they offer a 16/48 and 48/16 split here for graphics/compute, as before with Fermi. For L2 there is 128 KB per 64-bit memory controller, so four controllers add up to 512 KB of L2.
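
The cache math can be checked the same way. The sketch below only uses the numbers mentioned above (256-bit bus, 64-bit controllers, 128 KB of L2 per controller, 64 KB of shared memory/L1 per SMX); the constant and function names are assumptions for illustration:

```python
BUS_WIDTH_BITS = 256          # GK104 memory bus
CONTROLLER_WIDTH_BITS = 64    # width of one memory controller
L2_PER_CONTROLLER_KB = 128    # L2 slice tied to each controller
SMX_L1_SHARED_KB = 64         # configurable shared memory / L1 per SMX


def memory_controllers(bus_bits: int, controller_bits: int) -> int:
    """Number of memory controllers needed to fill the bus."""
    return bus_bits // controller_bits


def l2_total_kb(bus_bits: int, controller_bits: int, l2_per_controller_kb: int) -> int:
    """Total L2: one slice per memory controller."""
    return memory_controllers(bus_bits, controller_bits) * l2_per_controller_kb


print(memory_controllers(BUS_WIDTH_BITS, CONTROLLER_WIDTH_BITS), "memory controllers")
print(l2_total_kb(BUS_WIDTH_BITS, CONTROLLER_WIDTH_BITS, L2_PER_CONTROLLER_KB), "KB L2 in total")

# The two splits mentioned in the text, as (shared memory KB, L1 KB)
for shared_kb, l1_kb in [(16, 48), (48, 16)]:
    assert shared_kb + l1_kb == SMX_L1_SHARED_KB
    print(f"shared memory {shared_kb} KB / L1 {l1_kb} KB")
```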

In regards to architectural changes, at the top of the pipeline NVIDIA has now added new PolyMorph 2.0 (world space processing) engines and raster (screen space processing) engines; they really act like a mini CPU.



