Microsoft DirectX 8.0 was the first DirectX version to enable programmable vertex shading hardware while adding significant flexibility to advanced pixel shading hardware. It was new and exciting, and something we slowly started to see in games. However, DX8 and DX8-class hardware imposed many restrictions on developers. Microsoft DX9 frees developers from those restrictions. With the most complete DX9 implementation in hardware, the GeForce FX GPUs deliver the performance and power required to overcome three major DX8 limitations:
Pixel shader programmability: The DX8 pixel shader specifications could appropriately be termed configurable rather than programmable, because the DX8 spec lacked the general programmability associated with a full instruction set and flexible programming structure. The GeForce FX GPUs and DX9 allow much longer shader programs, and give developers a greatly expanded pixel shading instruction set. DX9 exposes true programmability of the pixel shading engine. This makes procedural shading on a GPU possible for the first time.
Vertex shader programmability: The DX8 vertex shader specification gave developers very little control over program flow. DX8 vertex shader programs are executed linearly, with no early termination for performance optimization. DX9 and the GeForce FX GPUs support conditional branching for greatly improved program flow control. DX9 dramatically enhances the power of the previous DirectX vertex shader by increasing the length and flexibility of vertex programs.
Unified pixel shading specifications: The addition of Pixel Shader 1.4 to the DX8 specification created an alternate pixel shader structure and programming structure that followed a completely different implementation philosophy. Many developers were forced to re-write shader programs for different hardware, rather than writing strictly to the API. The DX9 specification eliminates this problem with the new unified Pixel Shader 2.0 definition.
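The vertex shader flow-control point in the list above can be illustrated with a toy model. This is only a sketch under stated assumptions: the function names, the light tuples, and the "work" counter are hypothetical and are not taken from DX8, DX9, or any NVIDIA material. The first function models linear execution, where a disabled light still costs the full lighting math and is merely masked out arithmetically. The second models conditional branching, where disabled lights are skipped.

```python
def light_vertex_linear(normal, lights):
    """Linear execution: the lighting math runs for every light.

    A disabled light is masked with a multiply by 0.0 instead of being skipped.
    """
    total, work = 0.0, 0
    for direction, intensity, enabled in lights:
        n_dot_l = sum(n * d for n, d in zip(normal, direction))
        mask = 1.0 if enabled else 0.0       # arithmetic select, no branch
        total += max(n_dot_l, 0.0) * intensity * mask
        work += 1                            # the math was executed regardless
    return total, work


def light_vertex_branching(normal, lights):
    """Conditional branching: disabled lights are skipped entirely."""
    total, work = 0.0, 0
    for direction, intensity, enabled in lights:
        if not enabled:
            continue                         # early out for this light
        n_dot_l = sum(n * d for n, d in zip(normal, direction))
        total += max(n_dot_l, 0.0) * intensity
        work += 1
    return total, work


# Hypothetical scene: four lights, two of them switched off.
NORMAL = (0.0, 0.0, 1.0)
LIGHTS = [
    ((0.0, 0.0, 1.0), 0.5, True),
    ((0.0, 1.0, 0.0), 0.8, False),
    ((0.0, 0.0, 1.0), 0.25, True),
    ((1.0, 0.0, 0.0), 1.0, False),
]
```

Both functions return the same lit value for this scene, but the branching version performs the lighting math only for the two enabled lights.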
The combination of DX9 and GeForce FX GPUs narrows the gap between state-of-the-art PC graphics and state-of-the-art rendered movies such as the Disney/Pixar production Monsters, Inc. and the Columbia Pictures production Final Fantasy: The Spirits Within. The DX9 API is a critical enabling technology for this powerful new combination of hardware and software.
High-precision, floating-point color: DX9 breaks the mathematical precision barrier that has limited PC graphics in the past. Precision, and therefore visual quality, is increased with 128-bit floating-point color per pixel.
The versatility and power of the GeForce FX architecture combined with DX9 allow new digital worlds with breathtaking visual effects that were simply not achievable with DX8. Through its programmability and higher precision, DX9 brings sophisticated effects into the world of real-time graphics for the first time ever. Plants, metallic paint, skin, and eyes are great examples of objects that can be visually stunning with this new platform.
As an example of the performance-enhancing capabilities of DX9, recall the Wolfman character from one of the NVIDIA GeForce4 Ti family demos (see image below).
The Wolfman character
Using DX8-style rendering techniques, the fur for Wolfman required eight passes for every pixel. The technique involved modeling the fur as several layers of geometry, and stepping through each layer to calculate its contribution to the current pixel. Using DX9, a GeForce FX GPU is capable of rendering the Wolfman's fur in a single pass. Collapsing the rendering from eight passes down to one brings huge performance gains without compromising image quality.
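The difference between the two approaches can be sketched with a toy model of one pixel. The article does not describe the demo's actual shaders, so everything here is an assumption made for illustration: the layer colors, the simple alpha blend, and the function names are hypothetical. The multi-pass version rewrites the framebuffer once per fur layer. The single-pass version accumulates all layers in a temporary value and writes the pixel once.

```python
def blend(layer_color, layer_alpha, below):
    """Composite one fur layer over whatever is already underneath it."""
    return layer_alpha * layer_color + (1.0 - layer_alpha) * below


def shade_multipass(background, layers):
    """One rendering pass per layer: each pass reads and rewrites the pixel."""
    framebuffer = background
    framebuffer_writes = 0
    for color, alpha in layers:
        framebuffer = blend(color, alpha, framebuffer)
        framebuffer_writes += 1
    return framebuffer, framebuffer_writes


def shade_singlepass(background, layers):
    """All layers handled inside one longer shader: a single pixel write."""
    result = background                  # kept in a temporary, not in memory
    for color, alpha in layers:
        result = blend(color, alpha, result)
    return result, 1


# Hypothetical fur: eight layers of increasing brightness, 25% opaque each.
FUR_LAYERS = [(0.1 * i, 0.25) for i in range(1, 9)]
```

Both functions produce the same final color for the same layers; only the number of framebuffer writes differs (eight versus one in this example).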
DirectX 9 programmability
DirectX 9 offers a richer programming language with more commands, as well as a longer and more flexible program structure. The image below summarizes the fundamental limits for DX8, DX9, and the special capability bits (also known as cap bits) that extend the DX9 shader specifications to take better advantage of the capabilities inherent to the GeForce FX architecture. Note that the move from DX8 to DX9 results in an increase of various resources such as constants and programming registers to store temporary data. Longer programs will lead to more sophisticated programming, which in turn requires more of these resources.
Studio Quality precision
The new GeForce FX meets the same high standards of precision used by the film industry today. The native 16- and 32-bit floating-point formats (FP16 and FP32) of the GeForce FX GPUs give game developers the ability to create high-quality graphics. FP32 offers the ultimate image quality, delivering true 128-bit color, and that, my friends, is the same level of precision the film industry uses today to achieve awesome color effects. Studio-quality color requires both processing and storing pixels in this 128-bit mode.
Solutions that do not fully support a 128-bit pixel processing pipeline, such as platforms that only support the FP24 format, will not be able to match the studio-quality color of the GeForce FX GPUs. NVIDIA GeForce FX platforms also support FP16 for optimizing performance when full 128-bit color is not required. Developers are free to move back and forth between these formats in their application, using the format that is best suited to a particular computation. For instance, some operations, such as indexing into a high-resolution texture, can only be optimally accomplished using a 32-bit floating-point format. If the texture is larger than 1024 x 1024 (2^10 x 2^10, requiring at least 10 mantissa bits per texture coordinate), the developer needs FP32 to access all of the data. Other computations can be accurately accomplished using FP16, and can benefit from the maximized execution speed afforded by this level of precision. The accuracy levels and choice afforded by these higher-precision formats make it possible for developers to produce cinematic graphics in real time and optimize the performance in every situation.
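The texture-size limit mentioned above can be checked with a small experiment. This is a sketch under stated assumptions: it uses the IEEE 754 half-precision format (10 mantissa bits) as a stand-in for FP16, normalized coordinates in the range 0 to 1, and nearest-texel lookup along one axis. The function names are hypothetical and nothing here models the actual GeForce FX texture unit.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def addressable_texels(size: int, quantize) -> int:
    """Count the distinct texels along one axis that can still be reached
    when each texel-center coordinate is stored in the given format."""
    reached = set()
    for i in range(size):
        u = quantize((i + 0.5) / size)            # coordinate of texel center i
        reached.add(min(int(u * size), size - 1))  # nearest-texel lookup
    return len(reached)
```

With these assumptions, `addressable_texels(1024, to_fp16)` reaches every texel, while `addressable_texels(2048, to_fp16)` cannot reach all 2048 because neighboring coordinates in the upper half of the range round to the same half-precision value. `addressable_texels(2048, to_fp32)` reaches all of them.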
The advent of 32-bit floating-point pixel precision makes it possible to create high-quality images. Efficient volumetric effects (ground fog, spherical fog, sprites that fade out smoothly rather than getting clipped by world geometry) can be based on buffering and reusing per-pixel 32-bit floating-point z. Precise per-pixel lighting attenuation formulas can be based on passing in light-source positions in 32-bit floating-point vector registers. And, much higher-quality bump-mapping is possible with 16- and 32-bit floating point. With just 8 bits per component, there were noticeable artifacts and un-normalized bump maps.
Tim Sweeney, Epic Games, Inc
Intellisample Technology
The NVIDIA Intellisample technology and other features of the new NVIDIA GeForce FX GPUs improve performance and achieve new levels of realism. Individual GPUs include some or all of the described features, as appropriate for their target markets. Achieving both richness of functionality and optimal performance creates many challenges at the architectural level. To meet these challenges, NVIDIA GeForce FX solutions support high frame rates and the highest levels of image quality with sophisticated compression, anisotropic filtering, and trilinear filtering capabilities. The latest NVIDIA advances include the following technical breakthroughs, taking image quality and performance to the next level.
The new NVIDIA architecture includes a new hardware-implemented color compression technology. NVIDIA GeForce FX processors employ an advanced proprietary form of lossless data compression with a 4:1 compression ratio for color information. This industry-unique color compression solution is implemented in hardware and is completely transparent to applications, with both compression and decompression taking place in real time. Because this compression is completely lossless, there is no reduction in image quality or loss of precision. The result of this NVIDIA technology is a substantial increase in memory efficiency and improved overall system performance, with no loss of image quality.
Translated for you, the consumer, this means that antialiasing speed is improved to the point where essentially all modes of antialiasing are free, with no associated loss of performance.
Fast Clear of the Color Buffer
The NVIDIA GeForce FX GPUs include fast color clears that are executed in hardware. By accelerating this common operation, overall graphics performance is improved.
Dynamic Gamma Correction
Contrary to what a computer monitor shows or how the human eye perceives it, computer graphics programs generally assume a linear color space. With most artwork created in some form of gamma space, operations done in the shaders would have to take place in gamma space as well, but this is not really convenient. Instead, the correct approach is to bring color values back into linear space, perform the math, and then correct the result back into gamma space. Many previous shaders did not take gamma into consideration, resulting in color inaccuracies. However, with built-in gamma correction capabilities, NVIDIA GPUs free developers from the burden of changing gamma spaces. Users see a truer representation of each rendered image's luminance (or brightness).
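The sequence of bringing colors into linear space, doing the math, and correcting back can be sketched as follows. This is a minimal illustration that assumes a simple power-law curve with an exponent of 2.2; the article does not name a specific curve, the function names are hypothetical, and the code does not show how the GeForce FX hardware implements gamma correction.

```python
GAMMA = 2.2  # assumed display gamma; not specified in the article


def to_linear(encoded: float) -> float:
    """Decode a gamma-space value (0..1) into linear light."""
    return encoded ** GAMMA


def to_gamma(linear: float) -> float:
    """Encode a linear-light value (0..1) back into gamma space."""
    return linear ** (1.0 / GAMMA)


def blend_naive(a: float, b: float, t: float = 0.5) -> float:
    """Blend two gamma-space values directly, ignoring gamma."""
    return (1.0 - t) * a + t * b


def blend_linear(a: float, b: float, t: float = 0.5) -> float:
    """Decode to linear space, blend there, then re-encode."""
    mixed = (1.0 - t) * to_linear(a) + t * to_linear(b)
    return to_gamma(mixed)
```

Under this assumed curve, an even blend of black (0.0) and white (1.0) gives 0.5 with `blend_naive` but roughly 0.73 with `blend_linear`, which is why shaders that ignore gamma tend to produce results that look too dark.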
The picture on the right uses gamma correction
Adaptive Texture Filtering
The NVIDIA GeForce FX GPU offers a variety of adaptive texture filtering techniques that give PC users more options to improve their image quality without forcing them to compromise on frame rate to get it. These adaptive techniques require the hardware to monitor both the geometry and the texture content continuously to make intelligent trade-offs that enhance performance but won't produce visual artifacts. If the user chooses to enable these techniques, the hardware will automatically adjust the number and type of samples it takes for texturing operations on a pixel-by-pixel basis. These algorithms are capable of intelligently selecting texel and filter levels for trilinear as well as anisotropic filtering modes. Alternatively, the user can choose to use the most conservative settings and know that all of the texture filtering is done using traditional algorithms. As part of NVIDIA's Intellisample technology, adaptive texture filtering raises the bar by giving users more choices to get higher image quality without compromising the fluid frame rates that gamers crave.
New Antialiasing Modes
The NVIDIA GeForce FX GPUs support a new 6XS mode under Direct3D and new 8X modes for both OpenGL and Direct3D. These modes, either enabled in the latest Microsoft DirectX titles or available through the control panel settings, provide a higher level of quality than 4X or 4XS antialiasing. By calculating 1.5X as many samples as 4X AA, 6XS takes image quality higher than any 4-sample solution can. Additionally, the new 8X modes provide the highest image quality by calculating twice as many samples as the 4X modes. These new 8X modes are clearly the choice for top antialiasing quality. All of these choices empower PC users to fine-tune their display settings to fit their applications and style of computing. As a result, they get fluid frame rates for intense gaming action and great image quality too.