Tweeted By @huggingface
Looking to scale BERT inference on CPU? 🧐
Wonder how we maximize hardware utilization while optimizing latency/throughput? 📈
Find out on our latest blog post: Scaling up BERT model inference on CPU. 👀 https://t.co/48JLMFj984
— Hugging Face (@huggingface) April 20, 2021