Template:NvidiaDgxAccelerators

From Wikipedia, the free encyclopedia

Comparison of the accelerators used in DGX systems, from P100[1][2][3] through V100,[4] A100,[5] H100[6] and B200[7][8] to R100:[9]

General & Architecture

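| Model | Architecture | Socket | GPU | Fabrication process | Transistor count (billion) | Die size (mm²) | Launched |
|---|---|---|---|---|---|---|---|
| P100 | Pascal | SXM/SXM2 | GP100 | TSMC 16FF+ | 15.3 | 610 | Q2 2016 |
| V100 16GB | Volta | SXM2 | GV100 | TSMC 12FFN | 21.1 | 815 | Q3 2017 |
| V100 32GB | | SXM3 | | | | | |
| A100 40GB | Ampere | SXM4 | GA100 | TSMC N7 | 54.2 | 826 | Q1 2020 |
| A100 80GB | | | | | | | Q4 2020 |
| H100 | Hopper | SXM5 | GH100 | TSMC 4N | 80 | 814 | Q3 2022 |
| H200 | | | | | | | Q3 2023 |
| B100 | Blackwell | SXM6 | GB100 | TSMC 4NP | 208 | N/A | Q4 2024 |
| B200 | | | | | | | |
| R100 | Rubin | SXM7 | N/A | TSMC 3N | 338 | N/A | H2 2026 |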

Cores, Clock & Power

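| Model | Boost clock (MHz) | SM count | FP32 CUDA cores | FP64 cores (excl. tensor) | Mixed INT32/FP32 cores | INT32 cores | TDP (W) |
|---|---|---|---|---|---|---|---|
| P100 | 1480 | 56 | 3584 | 1792 | N/A | N/A | 300 |
| V100 16GB | 1530 | 80 | 5120 | 2560 | N/A | 5120 | 300 |
| V100 32GB | | | | | | | 350 |
| A100 40GB | 1410 | 108 | 6912 | 3456 | 6912 | N/A | 400 |
| A100 80GB | | | | | | | |
| H100 | 1980 | 132 | 16896 | 4608 | 16896 | N/A | 700 |
| H200 | | | | | | | 1000 |
| B100 | N/A | N/A | N/A | N/A | N/A | N/A | 700 |
| B200 | N/A | N/A | N/A | N/A | N/A | N/A | 1000 |
| R100 | N/A | N/A | N/A | N/A | N/A | N/A | 2300 |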

Memory & Cache

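| Model | Memory type (HBM) | VRAM size (GB) | Memory speed (Gb/s) | Bus width (bits) | Bandwidth (TB/s) | L1 cache per SM (KB) | L1 cache total (KB) | L2 cache (KB) |
|---|---|---|---|---|---|---|---|---|
| P100 | HBM2 | 16 | 1.4 | 4096 | 0.72 | 24 | 1344 | 4096 |
| V100 16GB | HBM2 | 16 | 1.75 | 4096 | 0.9 | 128 | 10240 | 6144 |
| V100 32GB | | 32 | | | | | | |
| A100 40GB | HBM2 | 40 | 2.4 | 5120 | 1.52 | 192 | 20736 | 40960 |
| A100 80GB | HBM2e | 80 | 3.2 | | | | | |
| H100 | HBM3 | 80 | 5.2 | 5120 | 3.35 | 192 | 25344 | 51200 |
| H200 | HBM3e | 141 | 6.3 | 6144 | 4.8 | | | |
| B100 | HBM3e | 192 | 8 | 8192 | 8 | N/A | N/A | N/A |
| B200 | | | | | | | | |
| R100 | HBM4 | N/A | N/A | N/A | N/A | N/A | N/A | N/A |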
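
The bandwidth figures can be approximated from the memory speed and bus width columns: per-pin data rate times bus width, divided by 8 bits per byte. Likewise, total L1 cache is the per-SM figure multiplied by the SM count from the Cores, Clock & Power table. The following is a minimal Python sketch of that arithmetic, assuming the memory speed column is a per-pin rate in Gb/s and that TB/s is decimal; the function names are illustrative and not taken from any NVIDIA tool. Because the listed speeds are rounded, the estimates only land within a few percent of the listed bandwidths.

```python
def memory_bandwidth_tbps(speed_gbps_per_pin: float, bus_width_bits: int) -> float:
    """Estimate peak memory bandwidth in TB/s from per-pin rate and bus width."""
    # Gb/s per pin * number of pins = Gb/s total; divide by 8 for GB/s.
    gigabytes_per_second = speed_gbps_per_pin * bus_width_bits / 8
    return gigabytes_per_second / 1000  # decimal TB/s (assumption)


def l1_total_kb(l1_per_sm_kb: int, sm_count: int) -> int:
    """Total L1 cache in KB: per-SM L1 multiplied by the number of SMs."""
    return l1_per_sm_kb * sm_count


# (memory speed in Gb/s, bus width in bits, listed bandwidth in TB/s),
# copied from the table above.
MEMORY_ROWS = {
    "P100": (1.4, 4096, 0.72),
    "V100 16GB": (1.75, 4096, 0.9),
    "A100 40GB": (2.4, 5120, 1.52),
    "H100": (5.2, 5120, 3.35),
    "H200": (6.3, 6144, 4.8),
}

for model, (speed, width, listed) in MEMORY_ROWS.items():
    estimate = memory_bandwidth_tbps(speed, width)
    print(f"{model}: estimated {estimate:.2f} TB/s, listed {listed} TB/s")

# H100: 192 KB per SM across 132 SMs, compared with the listed 25344 KB.
print(l1_total_kb(192, 132))
```

The small gaps between estimate and listing (for example on A100 40GB) come from rounding in the published per-pin speeds, not from a different formula.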

Compute Performance, Interconnect & Networking

| Model | FP32 (TFLOPS) | FP64 (TFLOPS) | INT8 dense tensor | FP16 dense tensor | bfloat16 dense tensor | TF32 dense tensor | FP64 dense tensor | Interconnect (NVLink; TB/s) | Networking |
|---|---|---|---|---|---|---|---|---|---|
| P100 | 10.6 | 5.3 | N/A | 21.2 TFLOPS | N/A | N/A | N/A | 0.16 | ConnectX-4 (100 Gb/s) |
| V100 16GB | 15.7 | 7.8 | N/A | 125 TFLOPS | N/A | N/A | N/A | 0.3 | ConnectX-5 (100 Gb/s) |
| V100 32GB | | | | | | | | | |
| A100 40GB | 19.5 | 9.7 | 624 TOPS | 312 TFLOPS | 312 TFLOPS | 156 TFLOPS | 19.5 TFLOPS | 0.6 | ConnectX-6 (200 Gb/s) |
| A100 80GB | | | | | | | | | |
| H100 | 67 | 34 | 1.98 POPS | 990 TFLOPS | 990 TFLOPS | 495 TFLOPS | 67 TFLOPS | 0.9 | ConnectX-7 (400 Gb/s) |
| H200 | | | | | | | | | |
| B100 | N/A | N/A | 3.5 POPS | 1.98 PFLOPS | 1.98 PFLOPS | 989 TFLOPS | 30 TFLOPS | 1.8 | ConnectX-7 (400 Gb/s) |
| B200 | N/A | N/A | 4.5 POPS | 2.25 PFLOPS | 2.25 PFLOPS | 1.2 PFLOPS | 40 TFLOPS | | |
| R100 | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | ConnectX-9 (1600 Gb/s) |
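
The FP32 (TFLOPS) figures for P100 through H100 are consistent with the usual peak-throughput estimate: each FP32 CUDA core completes one fused multiply-add, counted as 2 floating-point operations, per cycle at the boost clock. The sketch below checks this against the core counts and boost clocks from the Cores, Clock & Power table. It assumes that one-FMA-per-core-per-clock model; `peak_fp32_tflops` is an illustrative name, and the result is a theoretical peak rather than a measured value.

```python
FLOPS_PER_FMA = 2  # one fused multiply-add counts as two floating-point ops


def peak_fp32_tflops(fp32_cores: int, boost_clock_mhz: float) -> float:
    """Theoretical peak FP32 throughput in TFLOPS (assumes 1 FMA/core/clock)."""
    flops = FLOPS_PER_FMA * fp32_cores * boost_clock_mhz * 1e6
    return flops / 1e12


# (FP32 CUDA cores, boost clock in MHz, listed FP32 TFLOPS) from the tables.
COMPUTE_ROWS = {
    "P100": (3584, 1480, 10.6),
    "V100": (5120, 1530, 15.7),
    "A100": (6912, 1410, 19.5),
    "H100": (16896, 1980, 67),
}

for model, (cores, clock, listed) in COMPUTE_ROWS.items():
    computed = peak_fp32_tflops(cores, clock)
    print(f"{model}: computed {computed:.1f} TFLOPS, listed {listed} TFLOPS")
```

The same formula cannot be checked for B100, B200, or R100, since their core counts and clocks are listed as N/A.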

  1. "NVIDIA Tesla P100". Nvidia.
  2. "NVIDIA Tesla P100 SXM2". TechPowerUp.
  3. "NVIDIA Tesla P100 PCIe 16 GB". TechPowerUp.
  4. Garreffa, Anthony (September 17, 2017). "NVIDIA Tesla V100 Tested: Near Unbelievable GPU Power". TweakTown.com. Retrieved December 30, 2025.
  5. Smith, Ryan (May 14, 2020). "NVIDIA Ampere Unleashed: NVIDIA Announces New GPU Architecture, A100 GPU, and Accelerator". AnandTech. Archived from the original on July 29, 2024.
  6. Smith, Ryan (March 22, 2022). "NVIDIA Hopper GPU Architecture and H100 Accelerator Announced: Working Smarter and Harder". AnandTech. Archived from the original on September 23, 2023.
  7. "B100 vs B200: Which NVIDIA Blackwell GPU is right for your AI workloads?". Northflank. Retrieved June 15, 2026.
  8. "Comparing Blackwell vs Hopper | B200 & B100 vs H200 & H100". Exxact Blog. Retrieved June 15, 2026.
  9. Mitrasish. "NVIDIA Rubin R100 GPU Chip Specs: Architecture, VRAM, and Cloud Availability (2026)". Spheron Blog. Retrieved June 13, 2026.