Guru3D.com
  • HOME
  • NEWS
    • Channels
    • Archive
  • DOWNLOADS
    • New Downloads
    • Categories
    • Archive
  • GAME REVIEWS
  • ARTICLES
    • Rig of the Month
    • Join ROTM
    • PC Buyers Guide
    • Guru3D VGA Charts
    • Editorials
    • Dated content
  • HARDWARE REVIEWS
    • Videocards
    • Processors
    • Audio
    • Motherboards
    • Memory and Flash
    • SSD Storage
    • Chassis
    • Media Players
    • Power Supply
    • Laptop and Mobile
    • Smartphone
    • Networking
    • Keyboard Mouse
    • Cooling
    • Search articles
    • Knowledgebase
    • More Categories
  • FORUMS
  • NEWSLETTER
  • CONTACT

New Reviews
Netac NV7000 2 TB NVMe SSD Review
ASUS GeForce RTX 4080 Noctua OC Edition review
MSI Clutch GM51 Wireless mouse review
ASUS ROG STRIX B760-F Gaming WIFI review
Asus ROG Harpe Ace Aim Lab Edition mouse review
SteelSeries Arctis Nova Pro Headset review
Ryzen 7800X3D preview - 7950X3D One CCD Disabled
MSI VIGOR GK71 SONIC Blue keyboard review
AMD Ryzen 9 7950X3D processor review
FSP Hydro G Pro 1000W (ATX 3.0, 1000W PSU) review

New Downloads
AMD Radeon Software Adrenalin 23.3.2 WHQL download
Intel ARC graphics Driver Download Version: 31.0.101.4148
GeForce 531.29 WHQL driver download
CrystalDiskInfo 9.0.0 Beta3 Download
AMD Ryzen Master Utility Download 2.10.2.2367
AMD Radeon Software Adrenalin 23.3.1 WHQL download
Display Driver Uninstaller Download version 18.0.6.1
CPU-Z download v2.05
AMD Chipset Drivers Download 5.02.19.2221
GeForce 531.18 WHQL driver download


New Forum Topics
MSI MPG X570s Carbon EK X- Support new AM4 Ryzen 7000X3D series CPU? Raja Koduri, Chief Architect of Intel's GPU Division, Leaves Intel Valve Launches Counter-Strike 2 Limited Test for Selected Players: How to Participate MSI Afterburner and Unwinder Phison CEO Predicts Slow Adoption of PCIe 5 SSDs Until Second Half of 2024 AMD Software: Adrenalin Edition 23.3.2 WHQL - Driver Download and Discussion MSI Unleashes SPATIUM M570 PCIe 5.0 NVMe M.2 HS SSD at 10,000 MB/s sequential read & write speeds New DLSS DLL 2.3.9 shows little to no ghosting?! On April 11th, CD Projekt RED to Launch RTX Path Tracing Overdrive Mode for Cyberpunk 2077 Gainward Announces New RTX 40 Panther Series Graphics Cards




Guru3D.com » Review » Palit GeForce RTX 4080 GamingPRO OC review » Page 5

Palit GeForce RTX 4080 GamingPRO OC review - GPU Architecture

by Hilbert Hagedoorn on: 01/20/2023 03:39 PM [ 4] 7 comment(s)

Tweet

Digging deeper

The maximum per GPU shader cluster (Nvidia SM) for the Geforce RTX 4000 is now 144, which is the theoretical maximum. . As with Ampere, a cluster has 64 FP32 units and 64 FP32/INT32 units, four texture units, four tensor cores (Gen 4), a ray tracing core (Gen 3), and 128 KiB of L1 cache. This is how 18,432 FP32 shaders are assembled in one fully enabled ADA102 GPU. Half of which compute entirely is FP32 and the other half calculate in either FP32 or INT32. The configuration of the units relative to one another is identical to that of the Ampere; Nvidia has not altered this quantity. The raster operations pipeline of 16 units per raster engine also remains the same seen from Ampere. The Ada SM is equipped with 128 KB of Level 1 cache. Depending on the workload, this cache has a unified architecture that may be configured to operate as either an L1 data cache or shared memory. The complete AD102 GPU includes 18432 KB of L1 cache memory (compared to 10752 KB in GA102). Ada's Level 2 cache has been substantially redesigned relative to Ampere. AD102 is equipped with 98304 KB of L2 cache, a 16-fold increase over GA102's 6144 KB of L2 cache. All programs will benefit from the availability of such a vast cache memory pool, but sophisticated procedures such as ray tracing (especially path tracing) will gain the most.

A full AD103 GPU includes:

  • 9728 CUDA Cores
  • 76 RT Cores
  • 304 Tensor Cores
  • 304 Texture Units
  • 112 ROPs
 

Geforce RTX 4000 block diagram by Ada Lovelace

Based on that alone the GeForce RTX 4080 is substantially quicker than the RTX 3080 Ti.  The RTX 4080 16GB has 9,728 shaders, and the RTX 4080 12GB has 7,680 shaders, but all announced variants have clock rates between 2.5 and 2.6 GHz boost.

NVIDIA GeForce RTX 4090

The new flagship GPU is the AD102-300, with 16384 CUDA cores and a boost clock of up to 2520 MHz and 23-power phases. This translates to a single-precision performance of 82.6 TFLOPS, 2.3x higher than its predecessor, the RTX 3090. The flagship Ada-based SKU will include 24GB of GDDR6X memory, with a peak bandwidth of 1 TB/s. This new card will require at least 100W more power than the 3090 Ti. NVIDIA confirms that this model will be available on October 12 for $1599.

 

NVIDIA GeForce RTX 4080 

The RTX 4080 16GB will have an AD103 GPU with 9728 CUDA cores, 16GB of GDDR6X memory clocked at 22.4 Gbps, and a 320W TDP. This model will be available in November for at least $1199. 

DLSS3

The NVIDIA Applied Deep Learning Research team has spent the past four years developing a frame generation technique that blends optical flow estimates with DLSS to enhance the gaming experience. The insertion of synthesized frames between existing frames enhances the frame rate and delivers a more fluid gaming experience. Optical flow estimation is frequently used in computer vision applications to measure the direction and amplitude of pixels' apparent motion between successively generated graphics frames or video frames. In the realms of 3D graphics and video, typical use cases have included minimizing latency in augmented and virtual reality, enhancing the smoothness of video playback, improving video compression efficiency, and stabilizing video cameras. Typical applications of deep learning include automobile and robotic navigation, video analysis and comprehension. Optical flow is comparable to the motion estimation component of video encoding, but its requirements for precision and consistency are significantly more demanding. As a result, many algorithms are employed. Since the Ampere GPU architecture, NVIDIA's GPUs have supported an optical flow engine (OFA) that employs cutting-edge algorithms to produce high-quality outputs. Ada's OFA unit delivers 300 TeraOPS (TOPS) of optical flow work (over 2x quicker than the Ampere generation OFA) and supplies essential data to the DLSS 3 network. The Ada OFA unit and new motion vector analysis algorithms are essential components that enable accurate and efficient frame production inside the new DLSS 3 technology architecture. This new DL-based frame generation algorithm increases frame rates by a factor of two in comparison to DLSS 2. When DLSS 3 is paired with the new RT Core and other Ada architecture improvements, Ada GPUs are up to four times quicker than their predecessors. DLSS 3 can also enhance performance when the CPU is the GPU's performance barrier. Microsoft Flight Simulator is a typical example of a CPU-limited game because of its physics and enormous draw distances. This reduces the performance advantages of conventional super-resolution systems. In this instance, though, DLSS 3's capacity to produce frames still delivers a performance boost of up to double.




32 pages « < 4 5 6 7 next »



Related Articles
Palit GeForce RTX 4080 GamingPRO OC review
The Palit GeForce RTX 4080 GamingPRO OC is a powerful graphics card from the ADA Lovelace generation. It has been improved with a higher TGP, 16 GB of graphics memory, and luxurious triple fan cooling...

Palit GeForce RTX 4080 GameRock OC review
We will glance at the Palit GeForce RTX 4080 GameRock OC, a robust graphics card from the ADA Lovelace generation. It has been upgraded with higher TGP, more memory for its graphics processor (16 GB),...

Palit GeForce GTX 1630 4GB Dual review
NVIDIA has released a budget series graphics card, don't expect flying framerates, but a fun little card for entry-level gaming. Meet the GeForce GTX 1630 4GB from Palit, in a DUAL fan version....

Palit GeForce RTX 3090 Ti GameRock OC review
Palit offers the GeForce RTX 3090 GameRock OC edition graphics card, which we review today. The 'Ti' edition of the GeForce RTX 3090 has been in the works for quite some time. The flagship series h...

© 2023