Executive TL;DR
- GLM-5.2 requires 24GB of VRAM and 256GB of RAM for MoE offloading.
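- Running GLM-5.2 on local hardware is possible, but a machine with 192GB RAM and an RTX 3090 24GB still falls short of the stated RAM requirement.
- The quantization chart shows 97.5% top-1 token agreement for the dynamic 4-bit and 5-bit quants, a 2.5% gap despite the "generally lossless" label.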
The Internet’s Verdict: 70% Hyped, 30% Skeptical
Introduction to GLM-5.2
GLM-5.2 can be run on local hardware, but the stated requirements are steep: 24GB of VRAM plus 256GB of system RAM for MoE offloading.
Hardware Requirements
The stated requirement is 24GB of VRAM and 256GB of RAM for MoE offloading. My machine, with 192GB RAM and an RTX 3090 24GB, meets the VRAM figure but is 64GB short on RAM, so it can almost run this.
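To make the "almost" concrete, here is a minimal sketch that compares a machine against the requirement figures quoted above. The `Spec` class and `shortfall` helper are hypothetical names invented for this illustration; the numbers come straight from the text, and the check ignores real-world factors such as OS overhead or context length.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spec:
    """Memory figures in GB (hypothetical helper type for this sketch)."""
    vram_gb: int
    ram_gb: int


def shortfall(machine: Spec, required: Spec) -> Spec:
    """Return how many GB the machine is missing on each axis (0 if it is enough)."""
    return Spec(
        vram_gb=max(0, required.vram_gb - machine.vram_gb),
        ram_gb=max(0, required.ram_gb - machine.ram_gb),
    )


# Figures quoted in the article, not independently verified.
REQUIRED = Spec(vram_gb=24, ram_gb=256)
MACHINE = Spec(vram_gb=24, ram_gb=192)  # RTX 3090 24GB + 192GB RAM

gap = shortfall(MACHINE, REQUIRED)
print(f"VRAM short by {gap.vram_gb} GB, RAM short by {gap.ram_gb} GB")
```

With these inputs the VRAM requirement is met exactly and the RAM gap is 64GB, which is why the machine only "almost" qualifies.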
Quantization Analysis
It says ‘dynamic 4-bit UD-Q4_K_XL and dynamic 5-bit UD-Q5_K_XL are generally lossless’, yet the chart shows a top-1 token agreement of 97.5%. That means the quantized model picks a different top token about 2.5% of the time, which is not what I would consider ‘generally lossless’.
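The article does not define how the chart computes token agreement. A common reading, assumed here, is the fraction of positions where the quantized model's most likely token matches the reference model's. The sketch below illustrates that reading with made-up token IDs; `top1_agreement` is a hypothetical helper, not a function from any quantization toolkit.

```python
from typing import Sequence


def top1_agreement(reference: Sequence[int], candidate: Sequence[int]) -> float:
    """Fraction of positions where both sequences hold the same top-1 token ID."""
    if len(reference) != len(candidate):
        raise ValueError("sequences must have the same length")
    if not reference:
        raise ValueError("sequences must not be empty")
    matches = sum(r == c for r, c in zip(reference, candidate))
    return matches / len(reference)


# Illustrative data only: 40 positions, one of which disagrees.
reference_tokens = list(range(40))
quantized_tokens = list(reference_tokens)
quantized_tokens[7] = -1  # a single mismatch

agreement = top1_agreement(reference_tokens, quantized_tokens)
print(f"agreement: {agreement:.1%}, disagreement: {1 - agreement:.1%}")
```

One mismatch in 40 positions already gives 97.5% agreement, so at that rate a long generation would contain many positions where the quantized model's top choice differs.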
Conclusion
Running GLM-5.2 on local hardware is possible, but it calls for 24GB of VRAM and 256GB of RAM, and the dynamic 4-bit and 5-bit quants show about 2.5% top-1 token disagreement rather than being strictly lossless.