Jie Shen


2023

基于FLAT的农业病虫害命名实体识别(Named Entity Recognition of Agricultural Pests and Diseases based on FLAT)
Yi Ren (任义) | Jie Shen (沈洁) | Shuai Yuan (袁帅)
Proceedings of the 22nd Chinese National Conference on Computational Linguistics

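“Traditional named entity recognition methods suffer from two problems: word embeddings cannot represent polysemy, and models that fuse characters and words extract features inaccurately. To address these, this paper proposes an interactive feature fusion model based on FLAT. The model first obtains character and word vectors by matching against an external lexicon; after BERT pre-training, a purpose-designed interactive feature fusion module fully mines the dependencies between characters and words. Adversarial training is also introduced to improve the model's robustness. Next, a special relative position encoding is used to feed the data into the self-attention mechanism, and finally a CRF produces the globally optimal sequence. On an agricultural pest and disease dataset, the model achieves precision, recall, and F1 of 93.76%, 92.14%, and 92.94%, respectively.”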
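
As a quick consistency check on the figures reported in this abstract, the F1 value can be recomputed as the harmonic mean of precision and recall. This is a minimal sketch, not the authors' evaluation code; the function name `f1_score` is a hypothetical helper, and it assumes the first reported figure (93.76%) is precision in the usual NER sense.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Percentages as reported in the abstract (assumed to be precision and recall).
precision, recall = 93.76, 92.14
print(round(f1_score(precision, recall), 2))  # 92.94
```

The recomputed value agrees with the reported F1 of 92.94%.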

2016

Improved Word Embeddings with Implicit Structure Information
Jie Shen | Cong Liu
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Distributed word representation is an efficient method for capturing semantic and syntactic word relations. In this work, we introduce an extension to the continuous bag-of-words model for learning word representations efficiently by using implicit structure information. Instead of relying on a syntactic parser which might be noisy and slow to build, we compute weights representing probabilities of syntactic relations based on the Huffman softmax tree in an efficient heuristic. The constructed “implicit graphs” from these weights show that these weights contain useful implicit structure information. Extensive experiments performed on several word similarity and word analogy tasks show gains compared to the basic continuous bag-of-words model.