We release LLM.int8(), the first 8-bit inference method that saves 2x memory and does not degrade performance for 175B models by exploiting emergent properties. Read More:
— Tim Dettmers (@Tim_Dettmers) August 17, 2022
Paper: https://t.co/eNpinXS0Z5
Software: https://t.co/hBuVyQhLqS
Emergence: https://t.co/oPGRhACNEe
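The memory saving comes from storing weights as 8-bit integers instead of 16/32-bit floats. A minimal sketch of the basic absmax int8 quantization idea is below; this is illustrative only and omits LLM.int8()'s vector-wise scheme and mixed-precision outlier decomposition described in the paper. All function names here are hypothetical.

```python
import numpy as np

def quantize_absmax_int8(x: np.ndarray):
    """Hypothetical helper: row-wise absmax quantization.
    Scale each row so its largest magnitude maps to 127, then round.
    Assumes no row is all zeros (avoids division by zero)."""
    scale = np.abs(x).max(axis=1, keepdims=True) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    """Recover approximate float values from int8 codes and scales."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 8)).astype(np.float32)

q, s = quantize_absmax_int8(w)
w_hat = dequantize_int8(q, s)

# int8 storage is 4x smaller than float32 (2x smaller than float16),
# and the round-trip error is bounded by half a quantization step.
print(q.dtype)
print(np.all(np.abs(w - w_hat) <= 0.5 * s + 1e-6))
```

The rounding error per element is at most half a scale step (`0.5 * scale`), which is why plain absmax quantization works well until a few large-magnitude outlier features dominate the scale; handling those outliers separately is the key contribution the tweet refers to.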