GLM-130B reaches INT4 quantization w/ no perf degradation, allowing effective inference on 4*3090 or 8*2080 Ti GPUs, the most affordable GPUs ever required for using 100B-scale models!
— Tsinghua KEG (@thukeg) October 10, 2022
Paper: https://t.co/f2bj1N8JTN
Model weights & code & demo & lessons: https://t.co/aKZNGEDmks
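The enabler here is weight-only INT4 quantization: weights are stored in 4 bits and dequantized on the fly, while activations stay in full precision. Below is a minimal PyTorch sketch of one common scheme, symmetric per-row absmax quantization; the function names and the exact scheme are illustrative assumptions, not GLM-130B's actual implementation.

```python
import torch

def quantize_int4(weight: torch.Tensor):
    # Per-row absmax scale so quantized values span the symmetric
    # INT4 range [-7, 7]. (Illustrative scheme, not GLM-130B's code.)
    scale = weight.abs().max(dim=1, keepdim=True).values.clamp(min=1e-8) / 7.0
    q = torch.clamp(torch.round(weight / scale), -7, 7).to(torch.int8)
    return q, scale

def dequantize_int4(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    # Weight-only quantization: recover an approximate float weight
    # just before the matmul; activations are never quantized.
    return q.to(scale.dtype) * scale

# Quantize once, then reuse the compact weights at every forward pass.
w = torch.randn(4096, 4096)
q, s = quantize_int4(w)
w_hat = dequantize_int4(q, s)
print("max abs error:", (w - w_hat).abs().max().item())
```

The arithmetic behind the hardware claim: 130B parameters at 4 bits is roughly 65 GB of weights, versus ~260 GB in FP16, which is what brings the model within reach of 4*3090 (4 x 24 GB = 96 GB) or 8*2080 Ti (8 x 11 GB = 88 GB).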