Sen Yang


2023

Once Upon a Time in Graph: Relative-Time Pretraining for Complex Temporal Reasoning
Sen Yang | Xin Li | Lidong Bing | Wai Lam
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Our physical world is constantly evolving over time, posing challenges for pre-trained language models to understand and reason over the temporal contexts of texts. Existing work focuses on strengthening the direct association between a piece of text and its time-stamp. However, the knowledge-time association is usually insufficient for the downstream tasks that require reasoning over temporal dependencies between knowledge. In this work, we make use of the underlying nature of time, namely that all temporally-scoped sentences are strung together through a one-dimensional time axis, and suggest creating a graph structure based on the relative placements of events along the time axis. Inspired by the graph view, we propose RemeMo (Relative Time Modeling), which explicitly connects all temporally-scoped facts by modeling the time relations between any two sentences. Experimental results show that RemeMo outperforms the baseline T5 on multiple temporal question answering datasets under various settings. Further analysis suggests that RemeMo is especially good at modeling long-range complex temporal dependencies.

Rutgers Multimedia Image Processing Lab at SemEval-2023 Task-1: Text-Augmentation-based Approach for Visual Word Sense Disambiguation
Keyi Li | Sen Yang | Chenyang Gao | Ivan Marsic
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

This paper describes our system used in SemEval-2023 Task-1: Visual Word Sense Disambiguation (VWSD). The VWSD task is to identify the correct image that corresponds to an ambiguous target word given limited textual context. To reduce word ambiguity and enhance image selection, we proposed several text augmentation techniques, such as prompting, WordNet synonyms, and text generation. We experimented with different vision-language pre-trained models to capture the joint features of the augmented text and image. Our approach achieved the best performance using a combination of GPT-3 text generation and the CLIP model. On the multilingual test sets, our system achieved an average hit rate (at top-1) of 51.11 and a mean reciprocal rank of 65.69.

Enhancing Grammatical Error Correction Systems with Explanations
Yuejiao Fei | Leyang Cui | Sen Yang | Wai Lam | Zhenzhong Lan | Shuming Shi
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Grammatical error correction systems improve written communication by detecting and correcting language mistakes. To help language learners better understand why the GEC system makes a certain correction, the causes of errors (evidence words) and the corresponding error types are two key factors. To enhance GEC systems with explanations, we introduce EXPECT, a large dataset annotated with evidence words and grammatical error types. We propose several baselines and analyses to understand this task. Furthermore, human evaluation verifies that our explainable GEC system’s explanations can assist second-language learners in determining whether to accept a correction suggestion and in understanding the associated grammar rule.

Local Interpretation of Transformer Based on Linear Decomposition
Sen Yang | Shujian Huang | Wei Zou | Jianbing Zhang | Xinyu Dai | Jiajun Chen
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

In recent years, deep neural networks (DNNs) have achieved state-of-the-art performance on a wide range of tasks. However, limitations in interpretability have hindered their applications in the real world. This work proposes to interpret neural networks by linear decomposition and finds that the ReLU-activated Transformer can be considered as a linear model on a single input. We further leverage the linearity of the model and propose a linear decomposition of the model output to generate local explanations. Our evaluation of sentiment classification and machine translation shows that our method achieves competitive performance in efficiency and fidelity of explanation. In addition, we demonstrate the potential of our approach in applications with examples of error analysis on multiple tasks.

2022

Challenges to Open-Domain Constituency Parsing
Sen Yang | Leyang Cui | Ruoxi Ning | Di Wu | Yue Zhang
Findings of the Association for Computational Linguistics: ACL 2022

Neural constituency parsers have reached practical performance on news-domain benchmarks. However, their generalization ability to other domains remains weak. Existing findings on cross-domain constituency parsing are only made on a limited number of domains. Tackling this, we manually annotate a high-quality constituency treebank containing five domains. We analyze challenges to open-domain constituency parsing using a set of linguistic features on various strong constituency parsers. Primarily, we find that 1) BERT significantly increases parsers’ cross-domain performance by reducing their sensitivity to the domain-variant features; 2) compared with single metrics such as unigram distribution and OOV rate, challenges to open-domain constituency parsing arise from complex features, including cross-domain lexical and constituent structure variations.

Investigating Non-local Features for Neural Constituency Parsing
Leyang Cui | Sen Yang | Yue Zhang
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Thanks to the strong representation power of neural encoders, neural chart-based parsers have achieved highly competitive performance by using local features. Recently, it has been shown that non-local features in CRF structures lead to improvements. In this paper, we investigate injecting non-local features into the training process of a local span-based parser, by predicting constituent n-gram non-local patterns and ensuring consistency between non-local patterns and local constituents. Results show that our simple method gives better results than the self-attentive parser on both PTB and CTB. Besides, our method achieves state-of-the-art BERT-based performance on PTB (95.92 F1) and strong performance on CTB (92.31 F1). Our parser also outperforms the self-attentive parser in multi-lingual and zero-shot cross-domain settings.

Cross-domain Generalization for AMR Parsing
Xuefeng Bai | Sen Yang | Leyang Cui | Linfeng Song | Yue Zhang
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Abstract Meaning Representation (AMR) parsing aims to predict an AMR graph from textual input. Recently, there has been notable growth in AMR parsing performance. However, most existing work focuses on improving the performance in the specific domain, ignoring the potential domain dependence of AMR parsing systems. To address this, we extensively evaluate five representative AMR parsers on five domains and analyze challenges to cross-domain AMR parsing. We observe that challenges to cross-domain AMR parsing mainly arise from the distribution shift of words and AMR concepts. Based on our observation, we investigate two approaches to reduce the domain distribution divergence of text and AMR features, respectively. Experimental results on two out-of-domain test sets show the superiority of our method.

2021

Template-Based Named Entity Recognition Using BART
Leyang Cui | Yu Wu | Jian Liu | Sen Yang | Yue Zhang
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

Diversity and Consistency: Exploring Visual Question-Answer Pair Generation
Sen Yang | Qingyu Zhou | Dawei Feng | Yang Liu | Chao Li | Yunbo Cao | Dongsheng Li
Findings of the Association for Computational Linguistics: EMNLP 2021

Although it shows promising value for downstream applications, generating questions and answers together is under-explored. In this paper, we introduce a novel task that targets question-answer pair generation from visual images. It requires not only generating diverse question-answer pairs but also keeping them consistent. We study different generation paradigms for this task and propose three models: the pipeline model, the joint model, and the sequential model. We integrate variational inference into these models to achieve diversity and consistency. We also propose region representation scaling and attention alignment to improve the consistency further. We finally devise an evaluator as a quantitative metric for consistency. We validate our approach on two benchmarks, VQA2.0 and Visual-7w, by automatically and manually evaluating diversity and consistency. Experimental results show the effectiveness of our models: they can generate diverse or consistent pairs. Moreover, this task can be used to improve visual question generation and visual question answering.

2020

What Have We Achieved on Text Summarization?
Dandan Huang | Leyang Cui | Sen Yang | Guangsheng Bao | Kun Wang | Jun Xie | Yue Zhang
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Deep learning has led to significant improvement in text summarization with various methods investigated and improved ROUGE scores reported over the years. However, gaps still exist between summaries produced by automatic summarizers and human professionals. Aiming to gain more understanding of summarization systems with respect to their strengths and limits on a fine-grained syntactic and semantic level, we consult the Multidimensional Quality Metric (MQM) and quantify 8 major sources of errors on 10 representative summarization models manually. Primarily, we find that 1) under similar settings, extractive summarizers are in general better than their abstractive counterparts thanks to strength in faithfulness and factual-consistency; 2) milestone techniques such as copy, coverage and hybrid extractive/abstractive methods do bring specific improvements but also demonstrate limitations; 3) pre-training techniques, and in particular sequence-to-sequence pre-training, are highly effective for improving text summarization, with BART giving the best results.

Making the Best Use of Review Summary for Sentiment Analysis
Sen Yang | Leyang Cui | Jun Xie | Yue Zhang
Proceedings of the 28th International Conference on Computational Linguistics

Sentiment analysis provides a useful overview of customer review contents. Many review websites allow a user to enter a summary in addition to a full review. Intuitively, summary information may give additional benefit for review sentiment analysis. In this paper, we conduct a study to explore methods for better use of summary information. We start by finding that the sentiment signal distribution of a review and that of its corresponding summary are in fact complementary to each other. We thus explore various architectures to better guide the interactions between the two and propose a hierarchically-refined review-centric attention model. Empirical results show that our review-centric model can make better use of user-written summaries for review sentiment analysis, and is also more effective than existing methods when the user summary is replaced with a summary generated by an automatic summarization system.

2019

Exploring Pre-trained Language Models for Event Extraction and Generation
Sen Yang | Dawei Feng | Linbo Qiao | Zhigang Kan | Dongsheng Li
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Traditional approaches to the task of ACE event extraction usually depend on manually annotated data, which is often laborious to create and limited in size. Therefore, in addition to the difficulty of event extraction itself, insufficient training data hinders the learning process as well. To promote event extraction, we first propose an event extraction model to overcome the roles overlap problem by separating the argument prediction in terms of roles. Moreover, to address the problem of insufficient training data, we propose a method to automatically generate labeled data by editing prototypes and screen out generated samples by ranking the quality. Experiments on the ACE2005 dataset demonstrate that our extraction model can surpass most existing extraction methods. Besides, incorporating our generation method exhibits further significant improvement. It obtains new state-of-the-art results on the event extraction task, including pushing the F1 score of trigger classification to 81.1%, and the F1 score of argument classification to 58.9%.

2015

Topic-Based Chinese Message Polarity Classification System at SIGHAN8-Task2
Chun Liao | Chong Feng | Sen Yang | Heyan Huang
Proceedings of the Eighth SIGHAN Workshop on Chinese Language Processing