How to Distill your BERT: An Empirical Study on the Impact of Weight Initialisation and Distillation Objectives
Xinpeng Wang, Leonie Weissweiler, Hinrich Schütze, Barbara Plank
ACL, 2023
arxiv /
code /
We showed that using lower teacher layers to pre-load the student model gives a significant performance improvement compared to using higher layers.
We also studied the robustness of different distillation objectives under various initialisation choices.
SceneFormer: Indoor Scene Generation with Transformers
Xinpeng Wang, Chandan Yeshwanth, Matthias Nießner
3DV, 2021
oral
arxiv /
video /
code /
We proposed a transformer model for scene generation conditioned on the room layout and a text description.
Projects
These include projects from coursework and practical courses.
Domain Specific Multi-Lingually Aligned Word Embeddings
Machine Learning for Natural Language Processing Applications
2021-07
report /
Curiosity-Driven Learning
Advanced Deep Learning in Robotics
2021-03
report /
Evaluated and compared count-based and prediction-based curiosity-driven learning across different Atari game environments.
Introduction to Deep Learning (IN2346)
SS 2020, WS 2020/2021
Teaching Assistant
website /