Tweeted By @srchvrs
Bing uses a 3-layer BERT-like transformer for every query. In addition to model simplification and possibly distillation, they implement more efficient GPU code. https://t.co/9irfYz1DCw
— Leonid Boytsov (@srchvrs) November 19, 2019