NVIDIA GF100 (Fermi) Technology preview
Posted by Hilbert Hagedoorn on: 01/17/2010 02:00 PM
The PolyMorph engine - This unit was introduced to cope with the tremendous workload that the all-new function, tessellation, brings into play. With tessellation the number of triangles in a scene can increase by multiple orders of magnitude, so NVIDIA had to make a choice: either build one massive tessellation unit, or break it down into several smaller ones, one per shader cluster, and go for efficiency.
If you look at the G80, at the top of the pipeline you had all these little units separated, right? Well, the PolyMorph engine is a cluster that combines the vertex fetcher, the viewport transform, attribute setup and stream output (massive numbers of the same object repeated in a scene), and now the tessellation unit can be found in it as well, all merged together in this one little mini processor. And that is significant: each cluster of shader processors therefore has its own tessellation unit, hence the (what we can only assume) massive tessellation performance. So each of the sixteen PolyMorph engines contains a vertex fetcher and a tessellator, greatly expanding tessellation and (when sent out to the raster engine) rasterization performance. This was an expensive unit to insert in terms of transistors, we heard something like 10% of the design, but it should reap the fruit of some hard labor.
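To get a feel for why one big tessellation unit would choke, here is a small Python sketch of how fast the triangle count grows. It is a simplified model, not NVIDIA's hardware behaviour: it assumes plain uniform subdivision of a triangle patch (each edge split into `factor` segments gives `factor` squared small triangles), which is not the exact pattern a Direct3D 11 tessellator produces. The scene size of 10,000 patches and the even split of patches across engines are hypothetical; only the count of sixteen engines comes from the text above.

```python
# Simplified model: uniform subdivision of a triangle patch.
# Splitting every edge into `factor` segments yields factor**2 small triangles.
# This only illustrates growth; it is not the exact Direct3D 11 pattern.

POLYMORPH_ENGINES = 16  # from the article: one engine per shader cluster


def triangles_per_patch(factor: int) -> int:
    """Triangles produced by uniformly subdividing one triangle patch."""
    if factor < 1:
        raise ValueError("tessellation factor must be >= 1")
    return factor * factor


def split_patches(num_patches: int, engines: int = POLYMORPH_ENGINES) -> list:
    """Spread patches as evenly as possible over the engines.

    Hypothetical scheduling; the real work distribution is not
    described in the article.
    """
    base, extra = divmod(num_patches, engines)
    return [base + (1 if i < extra else 0) for i in range(engines)]


if __name__ == "__main__":
    patches = 10_000  # hypothetical scene size
    busiest = max(split_patches(patches))
    for factor in (1, 4, 16, 64):
        per_patch = triangles_per_patch(factor)
        print(f"factor {factor:>2}: {patches * per_patch:>12,} triangles total, "
              f"{busiest * per_patch:>10,} on the busiest engine")
```

Going from a factor of 1 to a factor of 64 multiplies the triangle count by 4096 in this model, which is the "multiple orders of magnitude" mentioned earlier, while each of the sixteen engines only has to deal with its own share.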
One thing I'd like to add is that a lot of improvement has been made on the ROP side of things, and AA performance will go up significantly. In fact, if I can sidetrack and relate directly to a game: take HAWX for example, where at 8xAA it will perform roughly 2.33 times faster than a GeForce GTX 285.
So the engine makes it possible to parallelize the workload and gives a nicely scalable design, which in the end also ensures better usage of the caches. More on that later though. And we'll explain in one of the next chapters what tessellation exactly is, okay?
Let's talk about data caches for a minute. You guys might remember that GT200 all of a sudden had a shared Level 2 cache. It's the same for GF100, but now we also spot an L1 cache, and that is going to help out massively on the compute side of the shader processors.
GF100 Cache setup
| | GT200 | GF100 | Result |
| L1 Texture Cache (per quad) | 12KB | 12KB | Faster texture filtering |
| L1 LD/ST Cache (dedicated) | - | 16 or 48KB | Efficient physics & raytracing |
| Total Shared memory | 16KB | 16 or 48KB | More data reuse among threads |
| L2 Cache (shared) | 256KB | 768KB | Greater texture coverage, compute perf |
So the GF100 has a dedicated L1 cache per shader cluster. Each SM has 64KB of on-chip memory, which can be configured either as 48KB of shared memory with 16KB of L1 cache, or as 16KB of shared memory with 48KB of L1 cache.
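The two allowed splits can be written down as a tiny Python model. This is just a sketch to make the trade-off concrete: the `SmMemoryConfig` class and the `pick_config` rule (take the biggest L1 that still leaves enough shared memory) are our own illustration, not something NVIDIA described. The count of sixteen SMs is an assumption based on one SM per PolyMorph engine. In real CUDA code the preference is requested per kernel through the runtime call `cudaFuncSetCacheConfig`.

```python
from dataclasses import dataclass

ON_CHIP_KB_PER_SM = 64  # from the article
NUM_SMS = 16            # assumption: one SM per PolyMorph engine


@dataclass(frozen=True)
class SmMemoryConfig:
    """One way to split an SM's 64KB between shared memory and L1 cache."""
    shared_kb: int
    l1_kb: int

    def __post_init__(self):
        # Only the two splits named in the article are accepted.
        if (self.shared_kb, self.l1_kb) not in {(48, 16), (16, 48)}:
            raise ValueError("only 48/16 or 16/48 splits are modelled")


PREFER_SHARED = SmMemoryConfig(shared_kb=48, l1_kb=16)
PREFER_L1 = SmMemoryConfig(shared_kb=16, l1_kb=48)


def pick_config(shared_kb_needed: int) -> SmMemoryConfig:
    """Hypothetical rule: largest L1 that still fits the shared-memory need."""
    if shared_kb_needed <= PREFER_L1.shared_kb:
        return PREFER_L1
    if shared_kb_needed <= PREFER_SHARED.shared_kb:
        return PREFER_SHARED
    raise ValueError("kernel needs more shared memory than one SM offers")


def total_on_chip_kb(num_sms: int = NUM_SMS) -> int:
    """Combined shared memory + L1 across all SMs, in KB."""
    return num_sms * ON_CHIP_KB_PER_SM


if __name__ == "__main__":
    for need in (8, 16, 32, 48):
        cfg = pick_config(need)
        print(f"{need:>2}KB shared needed: "
              f"{cfg.shared_kb}KB shared + {cfg.l1_kb}KB L1")
    print(f"shared + L1 across {NUM_SMS} SMs: {total_on_chip_kb()}KB")
```

A compute kernel that stages a lot of data by hand would want the 48KB shared setup, while code with irregular memory access that cannot manage its own staging is better off with the larger L1.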
Next to that, the GF100 has a 768KB shared L2 cache that services load, store and texture requests. This cache sits in between all shader clusters and can be accessed by all of them. Being a unified read/write cache, it helps ensure program correctness and is a key feature to support generic C/C++ programs.
So yes, the caches certainly look a whole lot better, and that is going to pay off in many areas.
Last week we arrived at Sin City not only to cover CES, but because there was something else going on as well. In Las Vegas, NVIDIA had organized a briefing for a select group of the press. From Europe perhaps ten to fifteen people were invited for this somewhat privileged preview. The topic: a technical overview of project Fermi. Fermi is of course the family name of the latest generation of GPUs from NVIDIA. The first chip derived from Fermi will be called the GF100 GPU, which will likely be used in products that we think will get names like GeForce 360 and GeForce 380. Join us in a nice technology preview.
