NVIDIA H200 NVL
Hopper
- Memory: 141 GB HBM3e
- Bandwidth: 4.8 TB/s
- FP4 inference: Not supported (FP8 instead: 3,958 TFLOPS sparse)
- NVLink: NVLink 4 (900 GB/s)
- Shipping: Volume, 2024 onward
H200 (Hopper), B200 (Blackwell), B300 (Blackwell Ultra). Three price points, three shipping windows, three sweet spots. This is how we frame the choice for clients.
H200: You need 70B-class inference on a budget, you want to ship in 4–8 weeks, and your workload doesn't push the GPU memory ceiling. Best price/performance per token in 2025.
B200: You're refreshing in 2025–2026 and want the FP4 economics. Allocation is easier than for B300, and you can step up to B300 later in the same DGX SuperPOD fabric.
B300: Reasoning workloads with long context dominate your roadmap, or you specifically need GB300 NVL72 for rack-scale deployment. Start the allocation conversation now; expect 2026 ship windows.
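As a rough illustration, the framing above can be sketched as a small decision helper. This is a simplified sketch, not a sizing tool: the `Workload` fields, the `recommend` function, and the 80% memory headroom are hypothetical names and values chosen for illustration. The fallback to B200 when the weights crowd a single H200 is also an assumption of this sketch rather than something stated above, and the weights-only estimate ignores KV cache and activations.

```python
from dataclasses import dataclass

H200_MEMORY_GB = 141     # H200 NVL memory, from the spec list above
HEADROOM_FRACTION = 0.8  # assumption: leave ~20% for KV cache and activations


@dataclass
class Workload:
    params_billions: float                 # model size, e.g. 70 for a 70B model
    bytes_per_param: float                 # 2 for FP16/BF16, 1 for FP8, 0.5 for FP4
    needs_fp4: bool = False                # wants FP4 inference economics
    long_context_reasoning: bool = False   # long-context reasoning dominates
    needs_rack_scale: bool = False         # specifically needs GB300 NVL72


def weights_gb(w: Workload) -> float:
    """Weights-only footprint in GB: billions of params times bytes per param."""
    return w.params_billions * w.bytes_per_param


def recommend(w: Workload) -> str:
    """Map a workload onto the three options, checked from most to least demanding."""
    if w.long_context_reasoning or w.needs_rack_scale:
        return "B300"
    if w.needs_fp4:
        return "B200"
    if weights_gb(w) <= HEADROOM_FRACTION * H200_MEMORY_GB:
        return "H200"
    # Assumption of this sketch: weights that crowd one H200 point to B200.
    return "B200"


if __name__ == "__main__":
    fp8_70b = Workload(params_billions=70, bytes_per_param=1)
    print(weights_gb(fp8_70b), recommend(fp8_70b))
```

For a 70B model, the weights alone take about 70 GB at FP8 and about 140 GB at FP16. The FP16 case nearly fills the 141 GB on a single H200 before any KV cache is allocated, which is the sense in which a workload can push the memory ceiling.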
Tell us about your workload. We reply within one business day with a quote sized to fit.