Shumin Shi


2023

System Report for CCL23-Eval Task 9: Improving MRC Robustness with Overlapping Segments Generation for GCRC_advRobust
Suzhe He (何苏哲) | Chongsheng Yang (杨崇盛) | Shumin Shi (史树敏)
Proceedings of the 22nd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations)

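Machine reading comprehension currently faces many challenges in extracting semantically complete evidence for answer options. Existing work on unsupervised evidence extraction falls mainly into two categories: one uses static word embeddings and iteratively extracts relevant sentences with beam search; the other uses instance-level supervision, including standalone evidence extraction and end-to-end evidence extraction. The former involves a cumbersome processing pipeline, while the latter is unstable during joint training, which directly makes it difficult to improve model performance steadily. For CCL23-Eval Task 9, this paper proposes an adaptive end-to-end evidence extraction method based on overlapping segment generation. To address the unclear boundaries of evidence sentences, the method divides the document into multiple overlapping sentence segments and extracts the key parts as evidence, thereby capturing the overall semantics. In addition, the evidence extraction embedding module is optimized so that the confidence of evidence segments is adjusted automatically. Experimental results show that the proposed method greatly reduces interference from redundant content and stably improves the performance of reading comprehension models with only a single hyperparameter, strengthening model robustness.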

2017

QLUT at SemEval-2017 Task 2: Word Similarity Based on Word Embedding and Knowledge Base
Fanqing Meng | Wenpeng Lu | Yuteng Zhang | Ping Jian | Shumin Shi | Heyan Huang
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

This paper describes our system submissions for Task 2 of SemEval-2017. We take part in Subtask 1 of this task, an English monolingual subtask. The task is designed to evaluate the semantic similarity of two linguistic items. The runs are assessed by standard Pearson and Spearman correlation against the official gold standard set. The best performance of our runs is 0.781 (Final). Our runs mainly make use of word embeddings and a knowledge-based method. The results demonstrate that the combined method is effective for computing word similarity, while the word embeddings and the knowledge-based technique each need further improvement in their details.

A Parallel Recurrent Neural Network for Language Modeling with POS Tags
Chao Su | Heyan Huang | Shumin Shi | Yuhang Guo | Hao Wu
Proceedings of the 31st Pacific Asia Conference on Language, Information and Computation