Knowledge distillation from a text BERT teacher to a speech model by matching sequence-level contextualized representations during pretraining and predicted logits during finetuning (a sketch of the two losses follows) - ___[ICASSP 2021](https://2021.ieeeicassp.org)___
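
A minimal PyTorch sketch of the two distillation objectives described above; the function name, tensor shapes, and the choice of MSE for representation matching and temperature-scaled KL for logit matching are assumptions for illustration, not the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def distillation_losses(speech_seq_repr, bert_seq_repr,
                        student_logits, teacher_logits, temperature=2.0):
    """Hypothetical losses for distilling a text BERT into a speech model.

    Pretraining: match sequence-level contextualized representations
    (e.g. mean-pooled or [CLS]-style vectors) from teacher and student.
    Finetuning: match the teacher's predicted task logits.
    """
    # Pretraining objective: pull the speech model's sequence embedding
    # toward the frozen BERT teacher's sequence embedding (assumed MSE).
    repr_loss = F.mse_loss(speech_seq_repr, bert_seq_repr)

    # Finetuning objective: soft-label distillation on the task logits
    # via temperature-softened KL divergence (standard KD recipe).
    logit_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2

    return repr_loss, logit_loss
```

In this reading, only `repr_loss` is active while pretraining the speech encoder, and only `logit_loss` while finetuning on the downstream task.
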
Speech-text pretraining with two cross-modal objectives: cross-modal masked language modeling (CM-MLM) and cross-modal conditioned language modeling (CM-CLM); a sketch of the CM-MLM objective follows - ___[ICASSP 2021](https://2021.ieeeicassp.org)___
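
A minimal PyTorch sketch of a CM-MLM-style objective, assuming masked text tokens are predicted from a Transformer run over concatenated speech and text inputs; the class name, 80-dim log-mel input, layer sizes, and masking convention (`-100` labels on unmasked positions) are all illustrative assumptions:

```python
import torch
import torch.nn as nn

class CrossModalMLM(nn.Module):
    """Hypothetical cross-modal masked LM: text tokens are masked and
    recovered from an encoder that attends jointly over speech frames
    and text tokens, forcing the model to use both modalities."""

    def __init__(self, vocab_size=30522, dim=768, n_layers=4, n_heads=12):
        super().__init__()
        self.text_embed = nn.Embedding(vocab_size, dim)
        self.speech_proj = nn.Linear(80, dim)  # assumed 80-dim log-mel frames
        layer = nn.TransformerEncoderLayer(dim, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.lm_head = nn.Linear(dim, vocab_size)

    def forward(self, speech_feats, text_ids, mlm_labels):
        # Concatenate projected speech frames and (partially masked) text
        # embeddings into one sequence; self-attention spans both modalities.
        x = torch.cat([self.speech_proj(speech_feats),
                       self.text_embed(text_ids)], dim=1)
        h = self.encoder(x)
        # Predict tokens only at the text positions.
        logits = self.lm_head(h[:, speech_feats.size(1):])
        # CM-MLM loss: cross-entropy on masked positions only
        # (labels set to -100 elsewhere are ignored).
        return nn.functional.cross_entropy(
            logits.reshape(-1, logits.size(-1)), mlm_labels.reshape(-1),
            ignore_index=-100)
```

Under this reading, CM-CLM would differ by leaving one modality fully visible and generating the other autoregressively rather than via masking.
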