Executive TL;DR:
- AMD MI355X achieves 2626 tok/s/node at less than half the cost of Nvidia Blackwell
- Performance per watt is a key metric for data center decisions outside the US
- Accuracy degradation is a concern with quantization to FP4
The Buzz Score
The Internet’s Verdict: 70% Hyped, 30% Skeptical
Forum Insights
On the AMD versus Nvidia comparison, one user asks:
Can you folks add performance per watt as a metric to these comparisons, I honestly want to understand where AMD fits in the stack in terms of actual performance to dollars.
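The two ratios this commenter asks for are simple to compute once throughput, power draw, and cost are measured for the same run. The sketch below is illustrative only: the `NodeBenchmark` type and both helper functions are hypothetical names, and the power and cost figures are made-up placeholders, not measurements. Only the 2626 tok/s/node throughput comes from the summary above.

```python
from dataclasses import dataclass


@dataclass
class NodeBenchmark:
    """One benchmark row; every field should come from a real measurement."""
    name: str
    tokens_per_second: float  # sustained throughput for the whole node
    node_power_watts: float   # average wall power drawn during the run
    cost_per_hour_usd: float  # rental or amortized hourly cost of the node


def tokens_per_joule(b: NodeBenchmark) -> float:
    # 1 watt = 1 joule per second, so tok/s divided by watts is tokens per joule.
    return b.tokens_per_second / b.node_power_watts


def tokens_per_dollar(b: NodeBenchmark) -> float:
    # Tokens generated in one hour divided by the cost of that hour.
    return b.tokens_per_second * 3600 / b.cost_per_hour_usd


# Throughput is the quoted figure; power (10 kW) and cost ($20/h) are PLACEHOLDERS.
example = NodeBenchmark("MI355X node", 2626.0, 10_000.0, 20.0)
print(f"{tokens_per_joule(example):.4f} tok/J")
print(f"{tokens_per_dollar(example):,.0f} tok/$")
```

Reporting both ratios side by side matters because a node can win on tokens per dollar while losing on tokens per joule, and power-constrained data centers care more about the latter.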
Not everyone is convinced. One user states:
quantization to FP4 is practically never lossless in actual use. A lot of providers are advertising high TPS on Kimi and GLM, but the models are functionally lobotomized and no longer close to frontier quality.
Another user highlights the need for transparency:
I think we should make it illegal to not specify the quantization in the headline for these types of posts.
Lastly, one user shares their experience:
There’s noticeable accuracy degradation when they switched from fp8 to mxfp4
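To see why FP4 is hard to make lossless, it helps to look at how few values the format can represent. The sketch below simulates an MXFP4-style round trip in plain Python, assuming the layout in the OCP Microscaling specification: E2M1 elements in blocks of 32 that share one power-of-two scale. The function names, the scale-selection rule, and the synthetic input are illustrative assumptions, not any provider's actual quantization pipeline.

```python
import math

# Magnitudes an FP4 E2M1 element can hold (1 sign, 2 exponent, 1 mantissa bit).
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
BLOCK_SIZE = 32  # MXFP4 shares one scale across 32 elements


def quantize_block(block):
    """Round-trip one block through a simulated MXFP4 encoding."""
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return list(block)
    # Shared power-of-two scale: put the largest value near the top of the
    # grid (2 is the largest E2M1 exponent). One common choice, assumed here.
    scale = 2.0 ** (math.floor(math.log2(max_abs)) - 2)
    out = []
    for x in block:
        # Pick the nearest grid magnitude; anything above 6.0 saturates at 6.0.
        mag = min(E2M1_GRID, key=lambda g: abs(abs(x) / scale - g))
        out.append(math.copysign(mag * scale, x))
    return out


def mean_relative_error(original, restored):
    # Skip exact zeros, where relative error is undefined.
    pairs = [(o, r) for o, r in zip(original, restored) if o != 0.0]
    return sum(abs(o - r) / abs(o) for o, r in pairs) / len(pairs)


# Synthetic "weights": a deterministic stand-in, NOT real model data.
weights = [math.sin(0.37 * i) * 0.05 for i in range(BLOCK_SIZE)]
restored = quantize_block(weights)
print(f"mean relative error: {mean_relative_error(weights, restored):.3%}")
```

Each element can take only 15 distinct values per block, so small weights collapse to zero and the rest snap to a coarse grid. How much that hurts end-task accuracy depends on the model and calibration, which this toy example does not measure.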