Multi-Strategy Knowledge Distillation Based Teacher-Student Framework for Machine Reading Comprehension

Yu Xiaoyan; Liu Qingbin; He Shizhu (世柱 何); Liu Kang (康 刘); Liu Shengping; Zhao Jun (军 赵); Zhou Yongbin

Abstract

Irrelevant information in documents poses a great challenge for machine reading comprehension (MRC). To deal with this challenge, current MRC models generally consist of two separate parts: evidence extraction and answer prediction, where the former extracts the key evidence corresponding to the question and the latter predicts the answer based on that evidence. However, such pipeline paradigms tend to accumulate errors, i.e., extracting incorrect evidence results in predicting the wrong answer. To address this problem, we propose a Multi-Strategy Knowledge Distillation based Teacher-Student framework (MSKDTS) for machine reading comprehension. In our approach, we first take the evidence and the document, respectively, as the input reference information to build a teacher model and a student model. Then the multi-strategy knowledge distillation method transfers knowledge from the teacher model to the student model at both the feature level and the prediction level. Therefore, in the testing phase, the enhanced student model can predict answers similar to those of the teacher model without being aware of which sentence in the document is the corresponding evidence. Experimental results on the ReCO dataset demonstrate the effectiveness of our approach, and further ablation studies confirm the effectiveness of both knowledge distillation strategies.
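
The two distillation levels named in the abstract can be illustrated with a minimal sketch. The abstract does not give the loss functions, so everything below is an assumption chosen for illustration: a temperature-softened KL divergence stands in for the prediction-level strategy, a mean squared error between hidden vectors stands in for the feature-level strategy, and the function names and the weights `alpha`, `beta`, and `temperature` are hypothetical rather than taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def prediction_distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Assumed prediction-level loss: KL(teacher || student) on softened outputs."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return sum(
        t * (math.log(t) - math.log(s))
        for t, s in zip(p_teacher, p_student)
        if t > 0.0
    )


def feature_distillation_loss(student_features, teacher_features):
    """Assumed feature-level loss: mean squared error between hidden vectors."""
    n = len(teacher_features)
    return sum((s - t) ** 2 for s, t in zip(student_features, teacher_features)) / n


def multi_strategy_loss(task_loss, student_logits, teacher_logits,
                        student_features, teacher_features,
                        alpha=0.5, beta=0.5, temperature=2.0):
    """Hypothetical combined objective: task loss plus weighted distillation terms."""
    pred = prediction_distillation_loss(student_logits, teacher_logits, temperature)
    feat = feature_distillation_loss(student_features, teacher_features)
    return task_loss + alpha * pred + beta * feat
```

In this sketch the teacher's logits and features would come from a model that reads the evidence, and the student's from a model that reads the full document; both distillation terms vanish when the student matches the teacher exactly, leaving only the task loss.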

Anthology ID: 2021.ccl-1.91
Volume: Proceedings of the 20th Chinese National Conference on Computational Linguistics
Month: August
Year: 2021
Address: Huhhot, China
Editors: Sheng Li (李生), Maosong Sun (孙茂松), Yang Liu (刘洋), Hua Wu (吴华), Kang Liu (刘康), Wanxiang Che (车万翔), Shizhu He (何世柱), Gaoqi Rao (饶高琦)
Venue: CCL
Publisher: Chinese Information Processing Society of China
Pages: 1024–1036
Language: English
URL: https://aclanthology.org/2021.ccl-1.91
Cite (ACL): Yu Xiaoyan, Liu Qingbin, He Shizhu, Liu Kang, Liu Shengping, Zhao Jun, and Zhou Yongbin. 2021. Multi-Strategy Knowledge Distillation Based Teacher-Student Framework for Machine Reading Comprehension. In Proceedings of the 20th Chinese National Conference on Computational Linguistics, pages 1024–1036, Huhhot, China. Chinese Information Processing Society of China.
Cite (Informal): Multi-Strategy Knowledge Distillation Based Teacher-Student Framework for Machine Reading Comprehension (Xiaoyan et al., CCL 2021)
PDF: https://aclanthology.org/2021.ccl-1.91.pdf
Data: ReCO
