Tip: when running on TPU, you can significantly speed up your model by running *multiple steps of gradient descent in a single graph execution*. This helps get the device to 100% utilization (which, for a TPU, is huge).
Just pass `experimental_steps_per_execution` to `compile`. pic.twitter.com/cwzk27z5Lo

— François Chollet (@fchollet) August 18, 2020
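For context, here is a minimal sketch of what that looks like with TF 2.3-era Keras on a TPU. The model, the value of 50, and the commented-out `dataset` are placeholders, and in later TensorFlow releases the argument was renamed to `steps_per_execution`:

```python
# Sketch only: the compile-time knob from the tweet, shown on a TPU-backed Keras model.
# API names follow the TF 2.3 era; newer releases use `steps_per_execution` instead.
import tensorflow as tf

# Connect to the TPU (on Colab / Cloud TPU the resolver picks up the address).
resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)
strategy = tf.distribute.TPUStrategy(resolver)

with strategy.scope():
    # Placeholder model; any Keras model built under the strategy scope works.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(256, activation="relu", input_shape=(784,)),
        tf.keras.layers.Dense(10),
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
        # Run 50 training steps per graph execution instead of 1, cutting
        # host<->TPU round trips and keeping the device busy.
        experimental_steps_per_execution=50,
    )

# model.fit(dataset, epochs=10)  # dataset: a tf.data.Dataset of (features, labels)
```

The trade-off is that callbacks and progress reporting only run once per execution block, so larger values mean coarser-grained logging in exchange for higher device utilization.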