General Cross-Architecture Distillation of Pretrained Language Models into Matrix Embeddings

Date:

I presented the results of the paper of the same title in a 20-minute talk.