The product we test today, the Radeon HD 4870, is powered by a graphics chip (GPU) codenamed RV770. AMD put nearly a billion transistors into this GPU, which is built on a 55 nm process with a die size of roughly 260 mm²: the chip literally measures about 16 mm on each side, which for AMD is still quite large for a 55 nm product. The transistor count for a midrange product like this is extreme, and it's typically best to relate it directly to the number of shader processors to get a better understanding. But first, let's look at some nice examples of die sizes of current architectures.
The Radeon HD 4850/4870 series graphics processors have 800 scalar processors (up from 320 on the HD 3800 series) and now pack a significant forty texture units (up from 16 in the last-gen architecture). The stream/compute/shader processors (can we please just call them all shader processors?) definitely received a good number of changes; if you are into this geek talk, you'll spot 10 SIMD clusters, each carrying 80 32-bit shader processors (which adds up to 800). If I remember correctly, one SIMD unit can handle double precision.
Much like we recently noticed in the NVIDIA GTX 200 architecture, the 80 scalar stream processors in each SIMD unit share 16 KB of local data cache/buffer among themselves. Next to the hefty shader processor increase, you have probably already noticed the massive number of texture units: the last-generation product had 16 units, the 4800 series has 40.
When you do some quick math, that's 2.5x the number of shader processors of the last-gen product, and 2.5x the number of texture units. That's a pretty grand change, folks. With its 800 shader processors, the GPU can produce 1000 to 1200 GFLOPs of raw single-precision power. It's a bit crude and inaccurate, but divide the number of ATI's scalar shader processors by five and you'll roughly match the throughput of one of NVIDIA's stream processors. You could (in an abstract way) say that the 4800 series has 160 shader units, if that helps you compare it with NVIDIA's scaling. Again, there's nothing scientific or objective about that explanation.
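The back-of-the-envelope math above can be sketched in a few lines of Python. Note one assumption not stated in this article: the 625 MHz and 750 MHz figures are the published reference core clocks of the HD 4850 and HD 4870.

```python
def mad_gflops(shader_count, core_mhz, flops_per_mad=2):
    # Each shader processor can issue one multiply-add per clock,
    # which counts as two floating-point operations.
    return shader_count * core_mhz * flops_per_mad / 1000  # MFLOPs -> GFLOPs

# RV770: 10 SIMD clusters x 80 shader processors = 800 total
shaders = 10 * 80

hd4850 = mad_gflops(shaders, 625)  # 625 MHz core -> 1000 GFLOPs
hd4870 = mad_gflops(shaders, 750)  # 750 MHz core -> 1200 GFLOPs
```

This is the same simple-precision 1000 to 1200 GFLOPs range quoted in the text; real-world throughput depends on how well the compiler keeps all five ALUs of each VLIW unit busy.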
Effectively, combined with the clock speed and memory, this product can pump out 1000 to 1200 GigaFLOPs of performance, depending on how that is measured of course. Let's compile a chart:
                             ATI Radeon HD 4850   ATI Radeon HD 4870   ATI Radeon HD 3850   ATI Radeon HD 3870
# of transistors             956 million          956 million          666 million          666 million
Stream processing units      800                  800                  320                  320
Memory clock (effective)     2000 MHz GDDR3       3600 MHz GDDR5       1.66 GHz GDDR3       2.25 GHz GDDR3
Math rate (Multiply Add)     1000 GFLOPs          1200 GFLOPs          ~430 GFLOPs          ~500 GFLOPs
Power consumption (peak)     110W                 160W                 75W                  105W
When you look at the effective memory bandwidth of the 4870, you at the very least must go "wow". That bandwidth is due to the Radeon HD 4870 using GDDR5 memory instead of GDDR3. Though the real clock frequency is 900 MHz, GDDR5 transfers data at four times the memory clock, twice the rate of double data rate memory, for an effective 3600 MHz. This gives the 4870 an astounding 115.2 GB/s while still being on a 256-bit memory bus. That is in fact more memory bandwidth than the GeForce GTX 260 (111.9 GB/s), which has a much wider 448-bit memory bus but uses GDDR3.
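The bandwidth arithmetic is simple enough to verify yourself; here is a minimal sketch. One assumption not in this article: the ~999 MHz GDDR3 clock used for the GTX 260 is its published reference memory clock.

```python
def effective_bandwidth_gbs(bus_width_bits, data_rate_mtps):
    # Bytes moved per transfer = bus width / 8;
    # bandwidth = bytes/transfer * transfers per second (MT/s -> GB/s).
    return (bus_width_bits / 8) * data_rate_mtps / 1000

# Radeon HD 4870: 900 MHz GDDR5, four transfers per clock, 256-bit bus
hd4870 = effective_bandwidth_gbs(256, 900 * 4)   # -> 115.2 GB/s

# GeForce GTX 260: ~999 MHz GDDR3, double data rate, 448-bit bus
gtx260 = effective_bandwidth_gbs(448, 999 * 2)   # -> ~111.9 GB/s
```

So despite the half-width bus, the 4870's quad-pumped GDDR5 edges out the GTX 260's wide GDDR3 configuration.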
Next to internal efficiency improvements, we also stumble upon an updated UVD engine (the HD video decoder/accelerator/enhancer).
Palit Radeon HD 4870 Sonic Dual Edition review

Every now and then there is a manufacturer out there that tries to do something special with a product. What's different, you likely ask, about this Palit graphics card? Well, the first thing you'll notice is a turbo switch, which allows you to select a second BIOS with an overclocked product state. We also spot a DisplayPort connector, and next to that... the USS Enterprise appears to have been mounted on top of the GPU, as this product, my friends, has one big cooler sitting on it.