Currently toying around with *gradient checkpointing* to fit some of my larger DL models into VRAM. Such a simple and neat trick for more memory-efficient backpropagation. Great article here: https://t.co/yeHWZiSYYw. There's also a PyTorch implementation: https://t.co/uXz7C5vCdM pic.twitter.com/KM7wIOeEvb
— Sebastian Raschka (@rasbt) December 22, 2020
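The trick the tweet refers to is implemented in PyTorch as `torch.utils.checkpoint`: instead of storing every intermediate activation for the backward pass, a checkpointed segment discards them and recomputes them during backpropagation, trading extra compute for lower memory. A minimal sketch (the layer sizes and model here are illustrative, not from the tweet):

```python
import torch
from torch.utils.checkpoint import checkpoint

# A small stack of layers whose intermediate activations we'd
# rather not keep around during the forward pass.
layers = torch.nn.Sequential(
    torch.nn.Linear(64, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 64),
    torch.nn.ReLU(),
)

x = torch.randn(8, 64, requires_grad=True)

# Regular forward: stores all intermediate activations for backward.
y_regular = layers(x)

# Checkpointed forward: drops the intermediates and recomputes them
# on the fly during backward, saving activation memory.
y_ckpt = checkpoint(layers, x, use_reentrant=False)

# Both paths compute the same function.
print(torch.allclose(y_regular, y_ckpt))
```

For deep models, wrapping each block of layers in `checkpoint` like this cuts peak activation memory roughly in proportion to the number of checkpointed segments, at the cost of one extra forward pass per segment during backward.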