Optimization

AdamP: Slowing Down the Slowdown for Momentum Optimizers on Scale-invariant Weights

Projects out the radial component of each update to mitigate the decay of effective step sizes that occurs when scale-invariant weights are updated with momentum-based optimizers - ___[ICLR 2021](https://iclr.cc/Conferences/2021)___
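
The projection idea can be illustrated with a minimal NumPy sketch. It covers only the geometric step of removing the part of an update that is parallel to the weight vector, and it is not the full optimizer. The function name `project_out_radial` and the `eps` stabilizer are hypothetical names chosen for this example, and the published method may include further details not shown here.

```python
import numpy as np


def project_out_radial(weight, update, eps=1e-8):
    """Return `update` with its component along `weight` removed.

    Illustrative helper (not the paper's API): treats the whole array as one
    vector and subtracts the projection of `update` onto `weight`.
    """
    w = weight.ravel()
    u = update.ravel()
    # Scalar coefficient of the radial (parallel-to-weight) component.
    radial_coeff = np.dot(w, u) / (np.dot(w, w) + eps)
    return update - radial_coeff * weight


rng = np.random.default_rng(0)
weight = rng.normal(size=(4, 3))
update = rng.normal(size=(4, 3))

tangent = project_out_radial(weight, update)

# The projected update is (numerically) orthogonal to the weight.
print(abs(np.sum(tangent * weight)) < 1e-6)
```

Because the projected update is orthogonal to the weight, a step along it changes the weight norm only at second order in the step size, and the weight norm is what sets the effective step size of a scale-invariant weight.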