Guru3D.com

GeForce GTX 1050 3GB review - Pascal GPU Architecture

by Hilbert Hagedoorn on: 07/06/2018 12:17 PM


The Pascal GP107 GPU

The GPU is based on a DX12 compatible architecture called Pascal. Much like in past designs, you will see SM clusters that each hold 2 x 64 shader processors. Pascal GPUs are composed of different configurations of Graphics Processing Clusters (GPCs), Streaming Multiprocessors (SMs), and memory controllers. Each SM is paired with a PolyMorph Engine that handles vertex fetch, tessellation, viewport transformation, vertex attribute setup, and perspective correction. The GP107 PolyMorph Engine also includes new Simultaneous Multi-Projection units. A fully enabled Pascal GP107 GPU has 6 active SM clusters. The GeForce GTX 1050 Ti and the GTX 1050 3GB have all six enabled; the regular (2GB) GTX 1050 has one SM disabled and thus holds 5 SM clusters.

  • The GeForce GTX 1050 (GP107-300) has 5 x 128 shader processors which make a total of 640 shader processors.
  • The GeForce GTX 1050 3GB (GP107-301) has 6 x 128 shader processors which make a total of 768 shader processors.
  • The GeForce GTX 1050 Ti (GP107-400) has 6 x 128 shader processors which make a total of 768 shader processors.
  • The GeForce GTX 1060 (3GB) (GP106-300) has 9 x 128 shader processors which make a total of 1,152 shader processors.
  • The GeForce GTX 1060 (6GB) (GP106-400) has 10 x 128 shader processors which make a total of 1,280 shader processors.
  • The GeForce GTX 1070 (GP104-200) has 15 x 128 shader processors which make a total of 1,920 shader processors.
  • The GeForce GTX 1080 (GP104-400) has 20 x 128 shader processors which make a total of 2,560 shader processors.
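
The totals in this list are simply the number of active SMs multiplied by 128. As a quick sanity check, here is a minimal Python sketch of that arithmetic; the names `ACTIVE_SMS` and `shader_count` are made up for illustration and do not come from any NVIDIA tool.

```python
# Shader processors per SM on these Pascal GPUs, as stated in the text.
CORES_PER_SM = 128

# Active SM counts per card, copied from the list above.
ACTIVE_SMS = {
    "GTX 1050": 5,
    "GTX 1050 3GB": 6,
    "GTX 1050 Ti": 6,
    "GTX 1060 (3GB)": 9,
    "GTX 1060 (6GB)": 10,
    "GTX 1070": 15,
    "GTX 1080": 20,
}


def shader_count(sm_count: int) -> int:
    """Return the total shader processor count for a given number of SMs."""
    return sm_count * CORES_PER_SM


for card, sms in ACTIVE_SMS.items():
    print(f"{card}: {sms} x {CORES_PER_SM} = {shader_count(sms):,}")
```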

Each SM has a cluster of 64 shader / stream / CUDA processors, doubled up. Don't let that confuse you: it is 128 shader units per SM. On the bigger GP104, each GPC ships with a dedicated raster engine and five SMs; GP107 spreads its six SMs over two GPCs. Each SM contains 128 CUDA cores, 256 KB of register file capacity, a 96 KB shared memory unit, 48 KB of total L1 cache storage, and eight texture units.

As far as the memory specs of the GP107 GPU are concerned, these boards feature a 96-bit or 128-bit memory bus connected to 2, 3 or 4 GB of GDDR5 video memory, AKA VRAM, AKA framebuffer, AKA graphics memory. The GeForce GTX 1000 series is DirectX 12 ready, and in our testing we'll address some Async compute tests as well. The latest revision of DX12 is a Windows 10 only feature, but it can bring in significant optimizations. With 4 GB of graphics memory available, the GTX 1050 Ti is very attractive for entry-level modern and future games; for 1080P gaming, 4 GB is fine. For your reference, below is a quick overview of the Pascal-based GeForce cards.

Pascal Graphics Architecture

Let's place the more important GPU data into a chart to get a better overview of the architectural changes, such as shaders and ROPs, and of where we are frequency-wise:

 

 

  
| GeForce | Titan X (2016 edition) | GTX 1080 | GTX 1070 | GTX 1060 | GTX 1050 Ti | GTX 1050 (3 GB / 2 GB) |
|---|---|---|---|---|---|---|
| GPU | GP102-400-A1 | GP104-400-A1 | GP104-200-A1 | GP106-400-A1 | GP107-400-A1 | GP107-30x-A1 |
| Architecture | Pascal | Pascal | Pascal | Pascal | Pascal | Pascal |
| Transistor count | 12 Billion | 7.2 Billion | 7.2 Billion | 4.4 Billion | 3.3 Billion | 3.3 Billion |
| Fabrication Node | 16 nm | 16 nm | 16 nm | 16 nm | 14 nm | 14 nm |
| CUDA Cores | 3,584 | 2,560 | 1,920 | 1,280 | 768 | 768/640 |
| SMMs / SMXs | 28 | 20 | 15 | 10 | 6 | 6/5 |
| ROPs | 96 | 64 | 64 | 48 | 32 | 24/32 |
| GPU Clock Core | 1,417 MHz | 1,607 MHz | 1,506 MHz | 1,506 MHz | 1,290 MHz | 1,354 MHz |
| GPU Boost clock | 1,531 MHz | 1,733 MHz | 1,683 MHz | 1,709 MHz | 1,392 MHz | 1,455 MHz |
| Memory Clock | 2,500 MHz | 1,250 MHz | 2,000 MHz | 2,000 MHz | 1,752 MHz | 1,752 MHz |
| Memory Size | 12 GB | 8 GB | 8 GB | 6 GB | 4 GB | 3/2 GB |
| Memory Bus | 384-bit | 256-bit | 256-bit | 192-bit | 128-bit | 96/128-bit |
| Memory Bandwidth | 480 GB/s | 320 GB/s | 256 GB/s | 192 GB/s | 112 GB/s | 84/112 GB/s |
| FP Performance | 11.0 TFLOPS | 9.0 TFLOPS | 6.45 TFLOPS | 4.61 TFLOPS | 2.2 TFLOPS | 1.9 TFLOPS |
| GPU Thermal Threshold | 94 Degrees C | 94 Degrees C | 94 Degrees C | 94 Degrees C | 97 Degrees C | 97 Degrees C |
| TDP | 250 Watts | 180 Watts | 150 Watts | 120 Watts | 75 Watts | 75 Watts |
| Launch MSRP ref | $1200 | $599/$699 | $379/$449 | $249/$299 | $139 | $109 |
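
The memory bandwidth figures for the GDDR5 cards in this chart follow from the bus width and the memory clock. Below is a rough Python sketch of that calculation. It assumes GDDR5 moves four bits per pin per listed clock cycle (so 1,752 MHz works out to roughly 7 Gbps per pin); the function name and the `transfers_per_clock` parameter are illustrative only. The GDDR5X cards are left out, since their listed clocks do not follow the same convention.

```python
def memory_bandwidth_gb_s(bus_width_bits: int,
                          memory_clock_mhz: float,
                          transfers_per_clock: int = 4) -> float:
    """Peak memory bandwidth in GB/s (1 GB = 10**9 bytes).

    transfers_per_clock=4 is an assumption for GDDR5 relative to the
    memory clock listed in the chart.
    """
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_second = memory_clock_mhz * 1e6 * transfers_per_clock
    return bytes_per_transfer * transfers_per_second / 1e9


# Bus width and memory clock taken from the chart.
print(round(memory_bandwidth_gb_s(128, 1752)))  # GTX 1050 Ti, chart lists 112 GB/s
print(round(memory_bandwidth_gb_s(96, 1752)))   # GTX 1050 3GB, chart lists 84 GB/s
print(round(memory_bandwidth_gb_s(192, 2000)))  # GTX 1060, chart lists 192 GB/s
```

The narrower 96-bit bus is the reason the 3GB model ends up with less bandwidth than the 2GB card, despite running the same memory clock.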

  

So we talked about the core clocks, specifications, and memory partitions. However, to better understand a graphics processor you simply need to break it down into tiny pieces. Let's first look at the raw data that most of you can understand and grasp. This bit will be about the architecture. NVIDIA’s “Pascal” GPU architecture implements a number of architectural enhancements designed to extract even more performance per watt consumed. The block diagram of the GPU visualizes the architecture; Nvidia started developing Pascal around 2013/2014 already. The GPCs hold 6 SMs (streaming multiprocessors) in total. You'll spot four 32-bit memory interfaces, bringing in a 128-bit path to the GDDR5 graphics memory. Tied to each 32-bit memory controller are eight ROP units and 256 KB of L2 cache. The full GP107 chip used in the GTX 1050 Ti thus has a total of 32 ROPs and 1,024 KB of L2 cache.
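
Since ROPs and L2 cache are tied to the 32-bit memory controllers, both can be derived from the bus width: a 128-bit bus corresponds to four controllers. The Python sketch below works that out; `BackEnd` and `back_end_for_bus` are hypothetical names chosen for this example. Note that the 768 KB L2 figure it produces for a 96-bit bus is derived from the per-controller ratio and is not a number stated in this article.

```python
from dataclasses import dataclass

# Per-controller resources as described in the text above.
CONTROLLER_WIDTH_BITS = 32
ROPS_PER_CONTROLLER = 8
L2_KB_PER_CONTROLLER = 256


@dataclass(frozen=True)
class BackEnd:
    controllers: int
    rops: int
    l2_kb: int


def back_end_for_bus(bus_width_bits: int) -> BackEnd:
    """Derive controller, ROP and L2 totals from the memory bus width."""
    if bus_width_bits <= 0 or bus_width_bits % CONTROLLER_WIDTH_BITS != 0:
        raise ValueError("bus width must be a positive multiple of 32 bits")
    controllers = bus_width_bits // CONTROLLER_WIDTH_BITS
    return BackEnd(
        controllers=controllers,
        rops=controllers * ROPS_PER_CONTROLLER,
        l2_kb=controllers * L2_KB_PER_CONTROLLER,
    )


print(back_end_for_bus(128))  # GTX 1050 Ti / GTX 1050 2GB
print(back_end_for_bus(96))   # GTX 1050 3GB
```

This also explains why the 3GB model, with one memory controller disabled, ends up at 24 ROPs where the Ti has 32.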

A fully enabled GP104 GPU will have (GTX 1080):

  • 2,560 CUDA/Shader/Stream processors
  • There are 128 CUDA cores (shader processors) per cluster (SM)
  • 7.2 Billion Transistors (FinFet at 16 nm)
  • 160 Texture units
  • 64 ROP units
  • 2 MB L2 cache
  • 256-bit GDDR5X

A partially disabled GP104 GPU will have (GTX 1070):

  • 1,920 CUDA/Shader/Stream processors
  • There are 128 CUDA cores (shader processors) per cluster (SM)
  • 7.2 Billion Transistors (FinFet at 16 nm)
  • 120 Texture units
  • 64 ROP units
  • 2 MB L2 cache
  • 256-bit GDDR5

A fully enabled GP106 GPU will have (GTX 1060 6GB):

  • 1,280 CUDA/Shader/Stream processors
  • There are 128 CUDA cores (shader processors) per cluster (SM)
  • 4.4 Billion Transistors (FinFet at 16 nm)
  • 80 Texture units
  • 48 ROP units
  • 1.5 MB L2 cache
  • 192-bit GDDR5

A fully enabled GP107 GPU will have (GTX 1050 Ti):

  • 768 CUDA/Shader/Stream processors
  • There are 128 CUDA cores (shader processors) per cluster (SM)
  • 3.3 Billion Transistors (FinFet at 14 nm)
  • 48 Texture units
  • 32 ROP units
  • 1 MB L2 cache
  • 128-bit GDDR5

In the pipeline we run into the ROP (Raster Operation) engine, and the GP107 has 32 of them for features like pixel blending and AA. Each SM has 48 KB of L1 cache, which also serves as texture cache. The GPU’s texture units are a valuable resource for compute programs that need to sample or filter image data. As for texture throughput, each SM contains 8 texture filtering units:

  • GeForce GTX 960 has 8 SM x 8 Texture units = 64
  • GeForce GTX 970 has 13 SM x 8 Texture units = 104
  • GeForce GTX 980 has 16 SM x 8 Texture units = 128
  • GeForce GTX Titan X (Maxwell) has 24 SM x 8 Texture units = 192
  • GeForce GTX 1050 (2GB) has 5 SM x 8 Texture units = 40
  • GeForce GTX 1050 (3GB) has 6 SM x 8 Texture units = 48
  • GeForce GTX 1050 Ti (4GB) has 6 SM x 8 Texture units = 48
  • GeForce GTX 1060 (3GB) has 9 SM x 8 Texture units = 72
  • GeForce GTX 1060 (6GB) has 10 SM x 8 Texture units = 80
  • GeForce GTX 1070 has 15 SM x 8 Texture units = 120
  • GeForce GTX 1080 has 20 SM x 8 Texture units = 160

So there's a total of 6 SM x 8 TU = 48 texture filtering units available on the GTX 1050 Ti and the GTX 1050 3GB.
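
Because every SM carries 128 CUDA cores and 8 texture units, the texture unit count can also be cross-checked straight from a card's CUDA core count. The following Python sketch does so for the Pascal cards; the function name `texture_units_from_cores` is made up for this example.

```python
# Per-SM resources on these Pascal GPUs, as stated in the text.
CORES_PER_SM = 128
TEXTURE_UNITS_PER_SM = 8


def texture_units_from_cores(cuda_cores: int) -> int:
    """Derive the texture unit count from a card's CUDA core count."""
    sms, remainder = divmod(cuda_cores, CORES_PER_SM)
    if cuda_cores <= 0 or remainder != 0:
        raise ValueError("core count must be a positive multiple of 128")
    return sms * TEXTURE_UNITS_PER_SM


# CUDA core counts taken from the specification lists above.
for card, cores in [("GTX 1080", 2560), ("GTX 1070", 1920),
                    ("GTX 1060 (6GB)", 1280), ("GTX 1050 Ti", 768),
                    ("GTX 1050 (2GB)", 640)]:
    print(f"{card}: {texture_units_from_cores(cores)} texture units")
```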



