Tweeted by @stanfordnlp
“Nvidia was able to train BERT-Large using optimized PyTorch software and a DGX-SuperPOD of more than 1,000 GPUs that is able to train BERT in 53 minutes.” – @kharijohnson, @VentureBeat https://t.co/9gT3aZTsBs
— Stanford NLP Group (@stanfordnlp) August 14, 2019