Memory

Large Product Key Memory for Pretrained Language Models

Improving the accuracy-speed trade-off when fine-tuning pretrained language models by using a large product key memory, and mitigating catastrophic drift with initialization and residual memory - ___[Findings of EMNLP 2020](https://2020.emnlp.org/)___