Guru3D.com
Nvidia GeForce GTX Titan-Z review - Kepler GK110 Revision B Graphics Architecture

by Hilbert Hagedoorn on: 06/20/2014 08:37 AM

Kepler GK110 Revision B Graphics Architecture

As you can understand, the wide memory partitions, the bus width and the GDDR5 memory (quad data rate) combine to give the GPU a very high effective framebuffer bandwidth. Let's again put most of the data in a chart to get a better overview of the changes:

| Graphics card | GeForce GTX 680 | GeForce GTX 780 | GeForce GTX Titan | GeForce GTX 780 Ti | GeForce GTX Titan Black | GeForce GTX Titan Z |
|---|---|---|---|---|---|---|
| Fabrication node | 28nm | 28nm | 28nm | 28nm | 28nm | 28nm |
| Shader processors | 1536 | 2304 | 2688 | 2880 | 2880 | 5760 |
| Streaming Multiprocessors (SMX) | 8 | 12 | 14 | 15 | 15 | 15x2 |
| Texture Units | 128 | 192 | 224 | 240 | 240 | 240x2 |
| ROP units | 32 | 48 | 48 | 48 | 48 | 48x2 |
| GPU Clock, Core/Boost (MHz)* | 1006/1058 | 863/900 | 836/876 | 875/928 | 889/980 | 705/876 |
| Memory Clock (MHz) / Data rate (Mbps)* | 1502/6008 | 1502/6008 | 1502/6008 | 1750/7000 | 1750/7000 | 1750/7000 |
| Graphics memory | 2048 MB | 3072 MB | 6144 MB | 3072 MB | 6144 MB | |
| Memory interface | 256-bit | 384-bit | 384-bit | 384-bit | 384-bit | 384-bit |
| Memory bandwidth | 192 GB/s | 288 GB/s | 288 GB/s | 336 GB/s | 336 GB/s | 336 GB/s |
| Power connectors | 2x6-pin PEG | 1x6-pin PEG, 1x8-pin PEG | 1x6-pin PEG, 1x8-pin PEG | 1x6-pin PEG, 1x8-pin PEG | 1x6-pin PEG, 1x8-pin PEG | 2x8-pin PEG |
| Max board power (TDP) | 170 Watts | 250 Watts | 250 Watts | 250 Watts | 250 Watts | 450 Watts |
| Recommended Power supply | 550 Watts | 600 Watts | 600 Watts | 600 Watts | 600 Watts | 850 Watts |
| GPU Thermal Threshold | 98 degrees C | 95 degrees C | 95 degrees C | 95 degrees C | 95 degrees C | 95 degrees C |

So we talked about the core clocks, specifications and memory partitions. Obviously there's a lot more to talk through. We feel that to understand a graphics processor, you simply need to break it down into small pieces. Let's first look at the raw data that most of you can grasp. This bit will be about the Kepler GK110B architecture; if you're not interested in g33k talk, by all means please browse to the next page.

Right, so have a close look at the GK110 die as shown above. You'll notice the five green clusters. These are the GPCs (graphics processing clusters), each containing 3 SMX (streaming multiprocessor) clusters, so 5 x 3 = 15 SMX clusters in total. You'll spot six 64-bit memory interfaces, adding up to a 384-bit path towards the graphics memory. That's instant extra memory bandwidth by the way: combined with a 7 Gbps data rate, the cards can reach 336 GB/sec.
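For those who like to check the math, here is a minimal Python sketch of that bandwidth figure. The function name `memory_bandwidth_gbs` is made up for illustration; the inputs are the bus widths and per-pin data rates from the chart above, and the result is a theoretical peak, not a measured number.

```python
def memory_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Theoretical peak bandwidth in GB/s.

    Each pin of the bus moves data_rate_gbps gigabits per second;
    dividing by 8 converts bits to bytes.
    """
    return bus_width_bits * data_rate_gbps / 8


# Six 64-bit memory interfaces form the 384-bit path.
bus_width = 6 * 64

# Cards running their memory at a 7 Gbps data rate
print(memory_bandwidth_gbs(bus_width, 7.0))

# Cards running their memory at 6008 Mbps (6.008 Gbps)
print(memory_bandwidth_gbs(bus_width, 6.008))
```

The first call gives the 336 GB/sec quoted above; the second lands just over the 288 GB/s listed in the chart for the 6008 Mbps cards.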
 

 


So above, we see the GK110 block diagram that lays out the Kepler architecture. Let's break it down into bits and pieces. The GK110B has:

  • 5760 (GTX Titan Z), 2880 (GTX 780 Ti), 2688 (Titan) or 2304 (GTX 780) CUDA processors (shader cores)
  • 192 CUDA cores (shader processors) per cluster (SMX)
The more important thing to focus on is the SMX (a block of shader processors), which holds 192 shader processors.
 
 
SMX: 192 single‐precision CUDA cores, 64 double‐precision units, 32 special function units (SFU), and 32 load/store units.
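The shader counts listed above follow directly from the number of active SMX clusters. A small sketch, assuming the SMX counts from the chart earlier on this page (the dictionary and function names are just for illustration):

```python
CORES_PER_SMX = 192  # single-precision CUDA cores per SMX

# Active SMX clusters per card; the Titan Z carries two GPUs of 15 SMX each.
active_smx = {
    "GeForce GTX 780": 12,
    "GeForce GTX Titan": 14,
    "GeForce GTX 780 Ti": 15,
    "GeForce GTX Titan Z": 2 * 15,
}


def shader_cores(smx_count: int) -> int:
    """Total CUDA cores for a given number of active SMX clusters."""
    return smx_count * CORES_PER_SMX


for card, smx in active_smx.items():
    print(f"{card}: {smx} SMX x {CORES_PER_SMX} = {shader_cores(smx)} CUDA cores")
```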
 

When we zoom in even further on one SMX cluster (192 shader processors) we see a change from the GK104 (GTX 680), as there are 64 double-precision math units. The GeForce GTX 680 SMX had 192 single-precision (SP) floating point CUDA cores and 8 double-precision (DP) CUDA cores. As a result, DP operations per clock ran at effectively 1/24 the SP rate. This is the same for the GTX 780 Ti. The one exception remains the GTX Titan: it includes a full 64 DP CUDA cores per SMX (compared to 192 SP CUDA cores), or 1/3rd the number of DP cores to SP cores, for substantially more double-precision horsepower. So a full 15 SMX, 2880 shader core GK110 has 15 x 64 = 960 DP units linked to its total of 2880 CUDA cores; with 14 activated SMXes, as on the GTX Titan, that would be 14 x 64 = 896 DP units.

Double precision wise, to unlock full performance you must open the Nvidia Control Panel and navigate to “Manage 3D Settings”. In the Global Settings box you will find an option titled “CUDA – Double Precision” which needs to be enabled, but... the GeForce GTX Titan will run at reduced clock speeds when full double-precision is enabled. Still a great option if you are working on CUDA applications.
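The DP-to-SP ratios and unit counts can be worked out in a few lines. This is only an arithmetic sketch based on the per-SMX numbers quoted in the text (8 or 64 DP units next to 192 SP cores); the helper names are hypothetical and it says nothing about real-world throughput, which also depends on clocks.

```python
from fractions import Fraction

SP_PER_SMX = 192  # single-precision CUDA cores per SMX


def dp_ratio(dp_per_smx: int) -> Fraction:
    """DP units relative to SP cores within one SMX."""
    return Fraction(dp_per_smx, SP_PER_SMX)


def dp_units(smx_count: int, dp_per_smx: int) -> int:
    """Total DP units for a given number of active SMX clusters."""
    return smx_count * dp_per_smx


print(dp_ratio(8))       # SMX with 8 DP units
print(dp_ratio(64))      # SMX with the full 64 DP units
print(dp_units(15, 64))  # full 15 SMX chip
print(dp_units(14, 64))  # 14 active SMX clusters
```

Note that the 896 figure corresponds to 14 active SMX clusters (14 x 64); 12 clusters would give 768.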
 
The SMX has quite a bit more bite in terms of shader, texture and geometry processing. With 192 CUDA cores per SMX, the GeForce GTX 780 Ti has six times the number of cores per SM compared to Fermi. Further down the pipeline we run into the ROP (Raster Operation) engines; the GK110 has 48 of them for features like pixel blending and AA. The GK110 has 64KB of L1 cache for each SMX plus a special 48KB texture unit memory that can be utilized as a read-only cache. The L2 cache, shared across the SMX units, is 1.5 MB. The GPU’s texture units are a valuable resource for compute programs that need to sample or filter image data. Texture throughput in Kepler is significantly increased compared to Fermi: each SMX unit contains 16 texture filtering units.
  • GeForce GTX 580 has 16 SM x 4 Texture units = 64
  • GeForce GTX 680 has 8 SMX x 16 Texture units = 128
  • GeForce GTX 780 has 12 SMX x 16 Texture units = 192
  • GeForce GTX Titan has 14 SMX x 16 Texture units = 224
  • GeForce GTX 780 Ti has 15 SMX x 16 Texture units = 240
  • GeForce GTX Titan Black has 15 SMX x 16 Texture units = 240
  • GeForce GTX Titan Z has 2x (15 SMX x 16 Texture units) = 2x 240

So there's a total of 15 SMX x 16 TU = 240 texture filtering units available on the GK110 silicon itself (once all SMXes are enabled). Still with me?
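The texture unit list above is the same multiplication repeated per card. As a rough sketch (the tuple layout and names are only for illustration, with the counts taken from that list):

```python
# (card, number of GPUs, SM/SMX clusters per GPU, texture units per cluster)
cards = [
    ("GeForce GTX 580", 1, 16, 4),
    ("GeForce GTX 680", 1, 8, 16),
    ("GeForce GTX 780", 1, 12, 16),
    ("GeForce GTX Titan", 1, 14, 16),
    ("GeForce GTX 780 Ti", 1, 15, 16),
    ("GeForce GTX Titan Black", 1, 15, 16),
    ("GeForce GTX Titan Z", 2, 15, 16),
]


def texture_units(gpus: int, clusters: int, tu_per_cluster: int) -> int:
    """Total texture filtering units across all GPUs on the card."""
    return gpus * clusters * tu_per_cluster


totals = {name: texture_units(g, c, t) for name, g, c, t in cards}

for name, total in totals.items():
    print(f"{name}: {total} texture units")
```

For the Titan Z this prints the combined total for both GPUs, which is the 2x 240 from the list added together.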



