ASUS GeForce GTX 670 DirectCU II TOP review
Posted by Hilbert Hagedoorn on: 05/09/2012 01:00 PM
The graphics architecture that is Kepler
As you can understand, the wide memory partitions, the bus width and the GDDR5 memory (quad data rate) together allow the GPU to work with a very high effective framebuffer bandwidth. Let's again put most of the data in a chart to get a better overview of the changes:
|Graphics card||GeForce GTX 580||GeForce GTX 670||GeForce GTX 670 DCUII TOP||GeForce GTX 680||GeForce GTX 690|
|Streaming Multiprocessors (SM)||16||7||7||8||16|
|Graphics Clock (Core)||772 MHz||915 / 980 MHz||1058 / 1137 MHz||1006 / 1058 MHz||915 / 1019 MHz|
|Shader Processor Clock||1544 MHz||915 / 980 MHz||1058 / 1137 MHz||1006 / 1058 MHz||915 / 1019 MHz|
|Memory Clock / Data rate||1000 / 4000 MHz||1502 / 6008 MHz||1502 / 6008 MHz||1502 / 6008 MHz||1502 / 6008 MHz|
|Graphics memory||1536 MB||2048 MB||2048 MB||2048 MB||4096 MB|
|Memory bandwidth||192 GB/s||192 GB/s||192 GB/s||192 GB/s||192 GB/s|
|Power connectors||1x6-pin PEG, 1x8-pin PEG||2x6-pin PEG||2x6-pin PEG||2x6-pin PEG||2x8-pin PEG|
|Max board power (TDP)||244 Watts||170 Watts||180 Watts||195 Watts||300 Watts|
|Recommended Power supply||600 Watts||550 Watts||550 Watts||550 Watts||750 Watts|
|GPU Thermal Threshold||97 degrees C||98 degrees C||98 degrees C||98 degrees C||98 degrees C|
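As a quick sanity check on the memory bandwidth row, here is a minimal Python sketch of the usual peak-bandwidth arithmetic: bus width in bits, times the effective data rate, divided by eight bits per byte. The 256-bit bus for the Kepler cards comes from the GK104 specification list further down this page; the 384-bit bus for the GTX 580 is not in the chart and is filled in here as an outside assumption. The function name is made up for illustration.

```python
def peak_bandwidth_gbs(bus_width_bits, data_rate_mhz):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1000 MB)."""
    bytes_per_transfer = bus_width_bits / 8       # bus width expressed in bytes
    return bytes_per_transfer * data_rate_mhz / 1000.0

# (bus width in bits, effective data rate in MHz)
cards = {
    "GeForce GTX 580": (384, 4000),   # 384-bit bus is an assumption, not from the chart
    "GeForce GTX 670": (256, 6008),
    "GeForce GTX 680": (256, 6008),
}

for name, (bus, rate) in cards.items():
    print(f"{name}: {peak_bandwidth_gbs(bus, rate):.1f} GB/s")
```

Both generations land at roughly 192 GB/s, which is why the bandwidth row reads the same across the chart: Kepler trades a narrower bus for much faster memory.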
So we talked about the core clocks, specifications and memory partitions. Obviously there's a lot more to talk through, the GPU architecture for example. To understand a graphics processor you simply need to break it down into pieces.
Let's first look at the raw data that most of you can understand and grasp. This bit will be about the Kepler architecture; if you're not interested in g33k talk, by all means please browse to the next page.
So above we see the GK104 block diagram, which lays out the Kepler architecture. Let's break it down into bits and pieces. A fully operating GK104 will have:
- 1536 CUDA processors (Shader cores)
- 8 CUDA core clusters (SMX) with 192 CUDA cores each
- 8 geometry units
- 4 raster Units
- 128 Texture Units
- 32 ROP engines
- 256-bit GDDR5 memory bus
- DirectX 11.1
That is a fully operating GK104 as used on the GTX 680. The GTX 670 uses the same chip, but has one SM (CUDA / shader core cluster) disabled. So the more important thing to focus on is the SM (block of shader processors) cluster, or SMX as NVIDIA likes to call it for the GTX 680, which has 192 shader processors. That's radically different from Fermi; the GeForce GTX 580, for example, had 32 shader processors per SM cluster. 1536 / 192 = 8 shader clusters (SMs) on the GTX 680, and 7 on the GTX 670. Let's blow up one such cluster:
Above is the block diagram for a single shader processor cluster, aka SM, or SMX as NVIDIA now calls it. The new SMX has quite a bit more bite in terms of shader, texture and geometry processing: 192 CUDA cores, which is six times the number of cores per SM compared to Fermi. Now, at the end of the pipeline we run into the ROP (Raster Operation) engines, and the GTX 680 again has 32 of them for features like pixel blending and AA.
There's a total of 128 texture filtering units available for the GeForce GTX 680. The math is simple here: each Kepler SM has 16 texture units tied to it, where a Fermi SM had 4.
- GeForce GTX 580 has 16 SMs X 4 Texture units = 64
- GeForce GTX 670 has 7 SMs X 16 Texture units = 112
- GeForce GTX 680 has 8 SMs X 16 Texture units = 128
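The same per-cluster math works for shader cores as well as texture units. Below is a small Python sketch that derives both from the SM counts and per-SM figures quoted on this page; the dictionary and function names are made up for illustration.

```python
# Per-SM resources as quoted in the text
CORES_PER_SM = {"Fermi": 32, "Kepler": 192}
TEXTURE_UNITS_PER_SM = {"Fermi": 4, "Kepler": 16}

# (architecture, number of active SMs)
CARDS = {
    "GeForce GTX 580": ("Fermi", 16),
    "GeForce GTX 670": ("Kepler", 7),
    "GeForce GTX 680": ("Kepler", 8),
}

def unit_counts(card):
    """Return (shader cores, texture units) for a card."""
    arch, sms = CARDS[card]
    return sms * CORES_PER_SM[arch], sms * TEXTURE_UNITS_PER_SM[arch]

for card in CARDS:
    cores, tex = unit_counts(card)
    print(f"{card}: {cores} shader cores, {tex} texture units")
```

With one SMX disabled, the GTX 670 ends up at 7 x 192 = 1344 shader cores against 1536 for the GTX 680.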
Above the GK104 host interface - the GigaThread engine, four GPCs, four memory controllers, the ROP partitions and a 512 KB L2 cache. Each GPC holds two SMX clusters, each with its own PolyMorph engine, for eight engines in total. The ROP partitions sit next to the L2 cache, and each shader cluster is tied to its own L1 and the shared L2 cache. Shading performance is going to be increased quite a bit, and geometry performance will get a nice boost as well. NVIDIA is using 64 KB Shared Memory/L1 per SMX; please note that they have a 16/48 or 48/16 split here for graphics/compute, as before with Fermi. For L2, it's 128 KB per 64-bit memory controller, so that adds up to 512 KB L2.
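The L2 and bus-width totals follow directly from the per-controller figures. A minimal Python sketch of that sum, using only the numbers quoted above (the variable names are made up for illustration):

```python
MEMORY_CONTROLLERS = 4         # GK104 has four memory controllers
CONTROLLER_WIDTH_BITS = 64     # each controller is 64 bits wide
L2_PER_CONTROLLER_KB = 128     # L2 slice tied to each controller

bus_width_bits = MEMORY_CONTROLLERS * CONTROLLER_WIDTH_BITS
l2_total_kb = MEMORY_CONTROLLERS * L2_PER_CONTROLLER_KB

print(f"Memory bus: {bus_width_bits}-bit, L2 cache: {l2_total_kb} KB")

# Both Shared Memory/L1 splits must add up to the 64 KB available per SMX
SHARED_L1_KB = 64
for part_a, part_b in [(16, 48), (48, 16)]:
    assert part_a + part_b == SHARED_L1_KB
```

That gives the 256-bit memory bus from the specification list and 512 KB of L2 in total.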
In regards to architectural changes, at the top of the pipeline NVIDIA has now added new PolyMorph 2.0 (world space processing) engines and raster (screen space processing) engines; they really act like a mini CPU.