
CUDA and ROCm both work under PyTorch. If ROCm does not work well, PyTorch does not work well.
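
For context, ROCm builds of PyTorch reuse the same torch.cuda API as CUDA builds, which is why the same PyTorch code can target either vendor. A minimal sketch of what device-portable code looks like:

  import torch

  # ROCm builds of PyTorch reuse the torch.cuda namespace, so the same
  # device-selection code runs on Nvidia (CUDA) and AMD (ROCm/HIP) alike.
  device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

  # torch.version.hip is set on ROCm builds and is None on CUDA builds.
  backend = "ROCm/HIP" if torch.version.hip else "CUDA"
  print(f"running on {device} via {backend}")

  x = torch.randn(1024, 1024, device=device)
  y = x @ x  # same call site; the installed backend dispatches the kernel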

Nvidia has multiple advantages.

1. The same software written in PyTorch works across all relatively modern Nvidia chips. With AMD and ROCm, that's not the case.

2. M̶I̶3̶0̶0̶x̶ h̶a̶s̶ n̶o̶ F̶P̶8̶ s̶u̶p̶p̶o̶r̶t̶.

3. Comparisons against H100 are always like this:

  8x AMD MI300X (192GB, 750W) GPU  
  8x H100 SXM5 (80GB, 700W) GPU
The fair comparison would be against

  8x H100 NVL (188GB, <800W) GPU 
And that they never do.

4. H100 is a 21-month-old architecture; MI300X is 7 months old. Nvidia has moved to a new-architecture-every-year cadence. AMD is a generation behind and must step up the pace. B100 comes out this year.

AMD is getting closer, but don't expect them to catch Nvidia in no time.



It has FP8 support. Not sure whether FP8 on MI300X is supported by vLLM yet.
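
As a quick sanity check (just a sketch: dtype availability alone doesn't tell you whether a given GPU has fast FP8 kernels), recent PyTorch builds do expose FP8 dtypes:

  import torch

  # PyTorch 2.1+ ships FP8 dtypes; whether matmuls in them are
  # hardware-accelerated depends on the GPU and the backend build.
  for dt in (torch.float8_e4m3fn, torch.float8_e5m2):
      t = torch.randn(4, 4).to(dt)
      print(dt, t.element_size(), "byte(s)/element")  # 1 byte each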

Also, many of these comparisons use vLLM for both setups, but for Nvidia you can and should use TensorRT-LLM, which tends to do quite a bit better than vLLM at high loads.
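
For reference, a minimal sketch of the vLLM side of such a benchmark (the model name is a placeholder, and quantization="fp8" requires a vLLM version and GPU that actually support it):

  from vllm import LLM, SamplingParams

  # Hypothetical 8-GPU serving setup; flags are illustrative only.
  llm = LLM(
      model="meta-llama/Meta-Llama-3-70B-Instruct",  # placeholder model
      tensor_parallel_size=8,   # shard the model across 8 GPUs
      quantization="fp8",       # only if the build/hardware support FP8
  )
  outputs = llm.generate(["Hello"], SamplingParams(max_tokens=32))
  print(outputs[0].outputs[0].text)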


Elio, the person who did the testing, confirmed to me that he has FP8 working.


I've never seen a server with 8x H100 NVL 188GB. The H100 NVL has 94GB of VRAM, but they sell them in pairs connected with NVLink, so I guess they sometimes market them as 188GB. In fact it's two cards, and a server usually has 4 pairs.


> MI300X is 7 months.

Less than that: we paid for ours in January and received them in March. The first batch had problems and we had to send them back, which took another 3 weeks. So let's consider the start date closer to April.

~3 months.


I thought the H100 NVL had 96GB.



