Tweeted By @rasbt
You are likely using DistilBert vs Bert due to the smaller memory footprint.
But check this out:
🦾 7x speed-up via quantization (32 bit floats -> 8 bit ints) alone.
Cost? A meager <1% F1 score decrease 🤷‍♂️
If you are not already quantizing... https://t.co/xoXE0BbUW4 pic.twitter.com/GLRff3jALE
— Sebastian Raschka (@rasbt) August 5, 2022
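
To make the "32 bit floats -> 8 bit ints" step concrete, here is a minimal sketch of symmetric int8 quantization in plain Python. It is not the method behind the tweet's benchmark, which is not described here. The function names `quantize_int8` and `dequantize` and the sample weights are hypothetical and chosen only for illustration. Real toolkits quantize whole weight tensors and run integer kernels, which is where a speed-up would come from. This sketch only shows the number mapping and the rounding error it introduces.

```python
def quantize_int8(values):
    """Map floats to integer codes in [-127, 127] using one shared scale.

    Hypothetical helper for illustration; assumes a symmetric scheme
    (zero maps to zero, no offset).
    """
    max_abs = max(abs(v) for v in values)
    # The largest magnitude maps to 127; guard against an all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate floats from the integer codes."""
    return [c * scale for c in codes]


# Made-up example weights, not taken from any real model.
weights = [0.42, -1.27, 0.003, 0.9, -0.55]

codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# Rounding to the nearest code bounds the error by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("codes:", codes)
print("scale:", scale)
print("max round-trip error:", max_error)
```

Each weight is stored as one signed byte plus a shared scale, and the round-trip error per value stays within half a quantization step. Whether that error costs less than 1% F1, as in the tweet, depends on the model and the task.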