Maybe they should have called it DeepFake, or DeepState, or better still Deep Selloff. Or maybe the other obvious deep thing ...
Against the H100, the B200 delivered 2.2x the performance when fine-tuning Llama 2 70B and twice the performance when pre-training GPT-3 175B. But raw FLOPS aren't the whole story here.