Creative Labs 3D Blaster 5 FX5900 Review


Page 15 - ShaderMark 2.0

ShaderMark 2.0 - DirectX 9 Pixel and Vertex Shader 2.0 performance

To measure pure DirectX 9 Shader 2.0 performance, we make use of ShaderMark 2.0, supplied to us by Thomas Bruckschlegel, the programmer of this software. ShaderMark 2.0 is a DirectX 9.0 pixel shader benchmark in which all pixel and vertex shader code is written in Microsoft's High Level Shading Language (HLSL). ShaderMark lets you select different compiler targets and advanced options. Currently there is no other DirectX 9.0 HLSL pixel shader benchmark on the market: Futuremark's 3DMark03 (www.futuremark.com) and Massive's AquaMark 3.0 (www.aquamark.com) are based on hand-written assembler shaders or only partly on HLSL shaders. HLSL is the future of shader development, so the HLSL shader compiler and its different profiles have to be tested, and that is the gap ShaderMark v2.0 fills. Driver cheating is also an issue. With ShaderMark it is easily possible to change the underlying HLSL shader code (registered version only), which makes it impossible to optimize a driver for one specific shader rather than for the whole shader pipeline. The ANTI-DETECT-MODE provides an easy way for non-HLSL programmers to test whether special optimizations are present in the drivers.
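
To show what HLSL source looks like compared with hand-written shader assembly, here is a minimal, hypothetical pixel shader; it is not ShaderMark's code, and the names baseMap and tint are mine. The point of the compiler targets mentioned above is that this same source can be compiled against ps_1_1, ps_1_4 or ps_2_0 simply by switching the profile.

    // Hypothetical HLSL pixel shader: one texture fetch, one multiply.
    // The identical source can be compiled to different profiles
    // (ps_1_1, ps_1_4, ps_2_0); the compiler maps it to each target.
    sampler2D baseMap : register(s0);  // texture bound to sampler 0
    float4 tint;                       // constant set by the application

    float4 main(float2 uv : TEXCOORD0) : COLOR
    {
        return tex2D(baseMap, uv) * tint;
    }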

You can download ShaderMark here.

The software tests the following Shader techniques:

Shaders

  • (shader 1) ps1.1, ps1.4 and ps2.0 precision test (exponent + mantissa bits)
  • (shader 2) directional diffuse lighting (sketched in HLSL after this list)
  • (shader 3) directional phong lighting
  • (shader 4) point phong lighting
  • (shader 5) spot phong lighting
  • (shader 6) directional anisotropic lighting
  • (shader 7) fresnel reflections
  • (shader 8) BRDF-phong/anisotropic lighting
  • (shader 9) car paint shader (multiple layers)
  • (shader 10) environment mapping
  • (shader 11) bump environment mapping
  • (shader 12) bump mapping with phong lighting
  • (shader 13) self shadowing bump mapping with phong lighting
  • (shader 14) procedural stone shader
  • (shader 15) procedural wood shader
  • (shader 16) procedural tile shader
  • (shader 17) fur shader (shells+fins)
  • (shader 18) refraction and reflection shader with phong lighting
  • (shader 19) dual layer shadow map with 3x3 bilinear percentage closer filter
  • glare effect shader with ghosting and blue shift (HDR)
    • glare types: (shader 20) cross and (shader 21) gaussian
  • non photorealistic rendering (NPR) 2 different shaders
    • (shader 22) outline rendering + hatching
      • two simultaneous render targets (edge detection through normals + tex ID and regular image) or two pass version
      • per pixel hatching with 6 hatching textures
    • (shader 23) water colour like rendering
      • summed area tables (SAT)
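
To give a rough idea of what one of these tests looks like in code, here is a minimal HLSL sketch in the spirit of shader 2 (directional diffuse lighting). It is not ShaderMark's actual shader; the constant names lightDir, lightColor and matColor are mine, and a real test would layer texturing on top.

    // Hedged sketch of a directional diffuse (N dot L) pixel shader,
    // similar in spirit to ShaderMark's shader 2 -- not its real code.
    float3 lightDir;    // normalized direction toward the light (set by app)
    float4 lightColor;  // light color and intensity
    float4 matColor;    // material diffuse color

    float4 main(float3 normal : TEXCOORD0) : COLOR
    {
        // renormalize the interpolated normal, then apply Lambert's law
        float3 n = normalize(normal);
        float nDotL = saturate(dot(n, lightDir));
        return matColor * lightColor * nDotL;
    }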

To understand this test, let me first explain vertex and pixel lighting. Vertex and pixel shader programming allows graphics and game developers to create photorealistic graphics on the PC. With DirectX, programmers have access to an assembly language interface to the transformation and lighting hardware (vertex shaders) and the pixel pipeline (pixel shaders).
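
For illustration, here is a hedged sketch of the vertex side: a minimal HLSL vertex shader performing the transformation step described above. The constant name worldViewProj is my own.

    // Hypothetical minimal vertex shader: transform each vertex to
    // clip space and pass its texture coordinates through untouched.
    float4x4 worldViewProj;  // combined world * view * projection matrix

    struct VS_OUT
    {
        float4 pos : POSITION;
        float2 uv  : TEXCOORD0;
    };

    VS_OUT main(float4 pos : POSITION, float2 uv : TEXCOORD0)
    {
        VS_OUT o;
        o.pos = mul(pos, worldViewProj);  // transformation stage
        o.uv  = uv;
        return o;
    }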

All real-time 3D graphics are built from component triangles. Each of the three vertices of every triangle contains information about its position, color, lighting, texture, and other parameters. This information is used to construct the scene. The lighting effects used in 3D graphics have a large effect on the quality, realism, and complexity of the graphics, and the amount of computing power used to produce them. It is possible to generate lighting effects in a dynamic, as-you-watch manner.
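
In HLSL terms, that per-vertex information is just a structure the application fills in; the layout below is a generic illustration, not taken from ShaderMark.

    // Typical per-vertex data, matching the description above.
    struct Vertex
    {
        float3 position : POSITION;   // where the vertex sits in space
        float3 normal   : NORMAL;     // drives the lighting calculation
        float4 color    : COLOR0;     // per-vertex color
        float2 texCoord : TEXCOORD0;  // texture mapping coordinates
    };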

The Achilles' heel of the entire GeForce FX (NV3x) series remains pure DirectX 9 shader performance. Basically, Pixel and Vertex Shader 2.0 turn out to be a bottleneck in games that make heavy use of them. It comes down to precision: DirectX 9 specifies 24-bit floating point as the minimum for full precision, and that is what ATI uses (a 16-bit mantissa). NVIDIA instead went for 16-bit (10-bit mantissa) and 32-bit (23-bit mantissa) precision, and that choice has a huge impact on performance.

Let's look at pure precision performance of DX9 shaders. The table below shows the GeForce FX 5700 Ultra, 5900 and 5950 Ultra with Detonator 52.16 drivers compared to the Radeon 9800 Pro with Catalyst 3.8 drivers; for the 5900 we list full precision as well as partial precision (p.p.) results. Scores are in frames per second, so higher is better; a zero means the card could not render that shader at all.

ShaderMark 2.0   5700U   5900   5900 p.p.   5950U   9800 Pro
Shader 1             0      0           0       0          0
Shader 2            67    117         137     140        232
Shader 3            44     75          98      90        160
Shader 4             0      0           0       0        163
Shader 5            38     65          79      77        130
Shader 6            42     72         109      87        170
Shader 7            39     68          93      81        147
Shader 8             0      0           0       0        127
Shader 9            19     34          54      40        116
Shader 10           83    145         146     173        238
Shader 11           68    116         125     139        206
Shader 12           35     60          78      72        139
Shader 13           21     35          56      42         87
Shader 14           27     46          70      56         86
Shader 15           26     45          75      54        125
Shader 16           21     36          53      42         76
Shader 17            3      5           9       6         13
Shader 18           13     26          48      32        105
Shader 19            0      0           0       0         27
Shader 20            0      0           0       0         50
Shader 21            0      0           0       0         54
Shader 22            0      0           0       0         36
Shader 23            0      0           0       0         52

The table above displays full precision results and, next to them, partial precision (p.p.) results for the FX 5900. With recent builds of the Detonator drivers NVIDIA definitely found something to boost Shader 2.0 performance, as they enhanced and optimized vertex and pixel shader handling with the help of a real-time compiler (from what we heard). Fact is, performance is still nowhere near the competition's. I'm hoping that in the upcoming months we'll see even better shader performance.

This is basically what NVIDIA is thinking about shaders:

NVIDIA's current line of cards based on the GeForce FX architecture attempts to maximize high-end graphics performance by supporting both 16-bit and 32-bit per-color-channel shaders -- most DirectX 9-based games use a combination of 16-bit and 32-bit calculations, since the former provides speed at the cost of inflexibility, while the latter provides a greater level of programming control at the cost of processing cycles. The panel went on to explain that 24-bit calculations, such as those used by the Radeon 9800's pixel shaders, often aren't enough for more complex calculations, which can require 32-bit math.

The GeForce FX architecture favors long shaders and textures interleaved in pairs, while the Radeon 9800 architecture favors short shaders and textures in blocks.

Another path NVIDIA or developers can take is partial precision, as mentioned above. Basically the API will use both 32-bit and 16-bit color precision (partial precision), which is a compromise, yet performance-wise it might be the best option. In all honesty, I doubt NVIDIA can get Shader 2.0 performance up to a level comparable with the competition within its current generation. Luckily for them there are not too many pure DX9 titles out yet, as the NV3x range will definitely suffer from this issue. I also fail to see why several DX9 shader operations are still not functioning while the competition clearly can run them.
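
To make that concrete, here's a small hypothetical HLSL fragment (not taken from ShaderMark or any game) showing how a developer asks for partial precision. In HLSL the trade-off surfaces through data types: float requests full 32-bit precision, while half requests 16-bit precision, which the compiler turns into the _pp instruction modifier that NV3x hardware can execute faster.

    // Hypothetical example: the same texture modulate written with
    // partial precision types. 'half' variables compile to ps_2_0
    // instructions flagged _pp, letting NV3x run them at 16 bit.
    sampler2D baseMap : register(s0);

    half4 main(float2 uv : TEXCOORD0) : COLOR
    {
        half4 texel = tex2D(baseMap, uv);       // kept at reduced precision
        half4 tint  = half4(0.9, 0.9, 1.0, 1.0);
        return texel * tint;                    // arithmetic also at 16 bit
    }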

One thing is a fact though: NVIDIA really did improve overall shader performance with the new Detonator drivers.

Enough Shader talk for one page, let's do the conclusion.
