Tweeted By @PyTorch

on 2021-02-25 (UTC)
pytorch tool

FairScale, a PyTorch extension for efficient large scale training, is releasing FullyShardedDataParallel, which shards model params across GPUs (+offload to CPU). Details: https://t.co/xshPfLeXyr. Inspired by DeepSpeed/@MSFTResearch, and made by @myleott @m1nxu @sam_shleifer pic.twitter.com/1ICMsJwtUP
— PyTorch (@PyTorch) February 25, 2021
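
To make "shards model params across GPUs" concrete, here is a minimal conceptual sketch in plain Python. It is not FairScale's implementation and does not use the FullyShardedDataParallel API. The helper names `shard_params` and `all_gather` are hypothetical, and lists of floats stand in for GPU tensors. The assumption illustrated is that each rank keeps only an equal-sized slice of a flattened parameter vector, and the full vector is rebuilt by gathering all slices when it is needed.

```python
from typing import List


def shard_params(flat_params: List[float], world_size: int) -> List[List[float]]:
    """Split a flat parameter list into world_size equal shards.

    The tail is zero-padded so every rank holds a shard of the same length.
    (Hypothetical helper for illustration only.)
    """
    shard_len = -(-len(flat_params) // world_size)  # ceiling division
    pad = shard_len * world_size - len(flat_params)
    padded = flat_params + [0.0] * pad
    return [padded[r * shard_len:(r + 1) * shard_len] for r in range(world_size)]


def all_gather(shards: List[List[float]], numel: int) -> List[float]:
    """Concatenate every rank's shard and drop the padding.

    Stands in for the collective that rebuilds the full parameters.
    """
    full = [x for shard in shards for x in shard]
    return full[:numel]


params = [float(i) for i in range(10)]   # 10 parameters
shards = shard_params(params, world_size=4)
restored = all_gather(shards, numel=len(params))
```

In this toy setup each of the 4 ranks stores 3 values instead of all 10, which is the memory saving that sharding is after; the cost is the extra communication needed to gather the full parameters.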

