Tweeted By @PyTorch
Cloud TPUs are powerful hardware for building large models & have enabled many research successes. We enable scaling up PyTorch models to 10B+ parameters on Cloud TPU with a new Fully Sharded Data Parallel (FSDP) interface in PyTorch/XLA.
— PyTorch (@PyTorch) October 13, 2022
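For context, below is a minimal sketch of what a training step with the new interface can look like, using the XlaFullyShardedDataParallel wrapper from torch_xla.distributed.fsdp. The model, batch shapes, and hyperparameters are illustrative placeholders, and details may differ between PyTorch/XLA releases:

```python
import torch
import torch.nn as nn
import torch_xla.core.xla_model as xm
from torch_xla.distributed.fsdp import XlaFullyShardedDataParallel as FSDP

device = xm.xla_device()

# Illustrative toy model; a real 10B+ parameter model would also use
# nested/auto wrapping so inner layers are sharded individually.
model = FSDP(nn.Linear(1024, 1024).to(device))
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

inputs = torch.randn(8, 1024, device=device)
targets = torch.randn(8, 1024, device=device)

loss = nn.functional.mse_loss(model(inputs), targets)
loss.backward()
# FSDP already reduce-scatters gradients across devices, so call
# optimizer.step() directly rather than xm.optimizer_step(optimizer).
optimizer.step()
xm.mark_step()  # materialize the lazily recorded XLA graph
```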