Lung-Hao Lee

2023

NCUEE-NLP at WASSA 2023 Shared Task 1: Empathy and Emotion Prediction Using Sentiment-Enhanced RoBERTa Transformers
Tzu-Mi Lin | Jung-Ying Chang | Lung-Hao Lee
Proceedings of the 13th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis

This paper describes our proposed system design for the WASSA 2023 shared task 1. We propose a unified architecture of ensemble neural networks to integrate the original RoBERTa transformer with two sentiment-enhanced models, RoBERTa-Twitter and EmoBERTa. For Track 1 at the speech-turn level, our best submission achieved an average Pearson correlation score of 0.7236, ranking fourth for empathy, emotion polarity and emotion intensity prediction. For Track 2 at the essay level, our best submission obtained an average Pearson correlation score of 0.4178 for predicting empathy and distress scores, ranking first among all nine submissions.

NCUEE-NLP at SemEval-2023 Task 8: Identifying Medical Causal Claims and Extracting PIO Frames Using the Transformer Models
Lung-Hao Lee | Yuan-Hao Cheng | Jen-Hao Yang | Kao-Yuan Tien
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

This study describes the model design of the NCUEE-NLP system for the SemEval-2023 Task 8. We fine-tune pre-trained transformer models on the task datasets to identify medical causal claims and extract population, intervention, and outcome elements in a Reddit post when a claim is given. Our best system submission for the causal claim identification subtask achieved an F1-score of 70.15%. Our best submission for the PIO frame extraction subtask achieved F1-scores of 37.78% for the Population class, 43.58% for the Intervention class, and 30.67% for the Outcome class, resulting in a macro-averaged F1-score of 37.34%. Our system ranked second among all participating teams.
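
The macro-averaged figure quoted in the abstract above is the unweighted mean of the three per-class F1-scores. A minimal Python check, using only the numbers given in the abstract (the names `pio_f1` and `macro_f1` are illustrative and not taken from the paper):

```python
# Per-class F1-scores (in percent) as quoted in the abstract.
pio_f1 = {"Population": 37.78, "Intervention": 43.58, "Outcome": 30.67}


def macro_f1(scores):
    """Unweighted mean of the per-class F1-scores."""
    return sum(scores.values()) / len(scores)


print(f"{macro_f1(pio_f1):.2f}")  # 37.34
```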

NCUEE-NLP at SemEval-2023 Task 7: Ensemble Biomedical LinkBERT Transformers in Multi-evidence Natural Language Inference for Clinical Trial Data
Chao-Yi Chen | Kao-Yuan Tien | Yuan-Hao Cheng | Lung-Hao Lee
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

This study describes the model design of the NCUEE-NLP system for the SemEval-2023 NLI4CT task that focuses on multi-evidence natural language inference for clinical trial data. We use the LinkBERT transformer in the biomedical domain (denoted as BioLinkBERT) as our main system architecture. First, a set of sentences in clinical trial reports is extracted as evidence for premise-statement inference. This identified evidence is then used to determine the inference relation (i.e., entailment or contradiction). Finally, a soft voting ensemble mechanism is applied to enhance the system performance. For Subtask 1 on textual entailment, our best submission had an F1-score of 0.7091, ranking sixth among all 30 participating teams. For Subtask 2 on evidence retrieval, our best result obtained an F1-score of 0.7940, ranking ninth of 19 submissions.
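
Soft voting, as named in the abstract above, is commonly implemented by averaging the class probabilities of several models and picking the class with the highest mean. The sketch below illustrates that general idea only; the probabilities are invented for illustration, and the paper's actual models and weighting are not reproduced here.

```python
LABELS = ["entailment", "contradiction"]


def soft_vote(prob_lists):
    """Average class probabilities across models; return (argmax index, averages)."""
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    avg = [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]
    best = max(range(n_classes), key=lambda c: avg[c])
    return best, avg


# Hypothetical per-model probabilities for one premise-statement pair.
model_probs = [[0.90, 0.10], [0.40, 0.60], [0.45, 0.55]]
idx, avg = soft_vote(model_probs)
print(LABELS[idx])  # entailment
```

In this made-up case two of the three models lean towards contradiction, but the confident first model tips the averaged probabilities towards entailment, which is how soft voting differs from a plain majority vote.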

Overview of the ROCLING 2023 Shared Task for Chinese Multi-genre Named Entity Recognition in the Healthcare Domain
Lung-Hao Lee | Tzu-Mi Lin | Chao-Yi Chen
Proceedings of the 35th Conference on Computational Linguistics and Speech Processing (ROCLING 2023)

NCUEE-NLP at BioLaySumm Task 2: Readability-Controlled Summarization of Biomedical Articles Using the PRIMERA Models
Chao-Yi Chen | Jen-Hao Yang | Lung-Hao Lee
The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks

This study describes the model design of the NCUEE-NLP system for BioLaySumm Task 2 at the BioNLP 2023 workshop. We separately fine-tune pretrained PRIMERA models to independently generate technical abstracts and lay summaries of biomedical articles. A total of seven evaluation metrics across three criteria were used to compare system performance. Our best submission was ranked first for relevance, second for readability, and fourth for factuality, tying for first in overall performance.

2022

NCUEE-NLP@SMM4H’22: Classification of Self-reported Chronic Stress on Twitter Using Ensemble Pre-trained Transformer Models
Tzu-Mi Lin | Chao-Yi Chen | Yu-Wen Tzeng | Lung-Hao Lee
Proceedings of The Seventh Workshop on Social Media Mining for Health Applications, Workshop & Shared Task

This study describes our proposed system design for the SMM4H 2022 Task 8. We fine-tune the BERT, RoBERTa, ALBERT, XLNet and ELECTRA transformers and their connecting classifiers. Each transformer model is regarded as a standalone method to detect tweets self-reporting chronic stress. The outputs of the individual models are then combined using a majority voting ensemble mechanism to produce the final classification. Experimental results indicate that our approach achieved a best F1-score of 0.73 over the positive class.
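
Majority voting over standalone classifiers, as described in the abstract above, can be sketched in a few lines of Python. The per-model predictions below are hypothetical and only illustrate the mechanism; they are not results from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the largest number of models."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical binary predictions for one tweet (1 = self-reported chronic stress).
votes = {"BERT": 1, "RoBERTa": 1, "ALBERT": 0, "XLNet": 1, "ELECTRA": 0}
final = majority_vote(list(votes.values()))
print(final)  # 1
```

With five models and two classes a tie cannot occur, so no tie-breaking rule is needed in this setting.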

Enhancing Chinese Multi-Label Text Classification Performance with Response-based Knowledge Distillation
Szu-Chi Huang | Cheng-Fu Cao | Po-Hsun Liao | Lung-Hao Lee | Po-Lei Lee | Kuo-Kai Shyu
Proceedings of the 34th Conference on Computational Linguistics and Speech Processing (ROCLING 2022)

It is difficult to optimize individual label performance in multi-label text classification, especially on imbalanced data containing long-tailed labels. Therefore, this study proposes a response-based knowledge distillation mechanism comprising a teacher model that optimizes binary classifiers of the corresponding labels and a student model that is a standalone multi-label classifier learning from distilled knowledge passed by the teacher model. A total of 2,724 Chinese healthcare texts were collected and manually annotated across nine defined labels, resulting in 8,731 labels, with each text containing an average of 3.2 labels. We used 5-fold cross-validation to compare the performance of several multi-label models, including TextRNN, TextCNN, HAN, and GRU-att. Experimental results indicate that the proposed knowledge distillation mechanism effectively improved performance regardless of which model was used, by about 2-3% in micro-F1, 4-6% in macro-F1, 3-4% in weighted-F1 and 1-2% in subset accuracy.

Overview of the ROCLING 2022 Shared Task for Chinese Healthcare Named Entity Recognition
Lung-Hao Lee | Chao-Yi Chen | Liang-Chih Yu | Yuen-Hsien Tseng
Proceedings of the 34th Conference on Computational Linguistics and Speech Processing (ROCLING 2022)

This paper describes the ROCLING-2022 shared task for Chinese healthcare named entity recognition, including task description, data preparation, performance metrics, and evaluation results. Among ten registered teams, seven participating teams submitted a total of 20 runs. This shared task reveals present NLP techniques for dealing with Chinese named entity recognition in the healthcare domain. All data sets with gold standards and evaluation scripts used in this shared task are publicly available for future research.

NCUEE-NLP at SemEval-2022 Task 11: Chinese Named Entity Recognition Using the BERT-BiLSTM-CRF Model
Lung-Hao Lee | Chien-Huan Lu | Tzu-Mi Lin
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)

This study describes the model design of the NCUEE-NLP system for the Chinese track of the SemEval-2022 MultiCoNER task. We use the BERT embedding for character representation and train the BiLSTM-CRF model to recognize complex named entities. A total of 21 teams participated in this track, with each team allowed a maximum of six submissions. Our best submission, with a macro-averaged F1-score of 0.7418, ranked seventh out of 21 teams.

2021

Proceedings of the 33rd Conference on Computational Linguistics and Speech Processing (ROCLING 2021)
Lung-Hao Lee | Chia-Hui Chang | Kuan-Yu Chen

Multi-Label Classification of Chinese Humor Texts Using Hypergraph Attention Networks
Hao-Chuan Kao | Man-Chen Hung | Lung-Hao Lee | Yuen-Hsien Tseng
Proceedings of the 33rd Conference on Computational Linguistics and Speech Processing (ROCLING 2021)

We use Hypergraph Attention Networks (HyperGAT) to recognize multiple labels of Chinese humor texts. We first represent a joke as a hypergraph. The sequential hyperedge and semantic hyperedge structures are used to construct hyperedges. Then, attention mechanisms are adopted to aggregate context information embedded in nodes and hyperedges. Finally, we use the trained HyperGAT to complete the multi-label classification task. Experimental results on the Chinese humor multi-label dataset showed that the HyperGAT model outperforms previous sequence-based (CNN, BiLSTM, FastText) and graph-based (Graph-CNN, TextGCN, Text Level GNN) deep learning models.

Incorporating Domain Knowledge into Language Transformers for Multi-Label Classification of Chinese Medical Questions
Po-Han Chen | Yu-Xiang Zeng | Lung-Hao Lee
Proceedings of the 33rd Conference on Computational Linguistics and Speech Processing (ROCLING 2021)

In this paper, we propose a knowledge infusion mechanism to incorporate domain knowledge into language transformers. Weakly supervised data is regarded as the main source for knowledge acquisition. We pre-train the language models to capture masked knowledge of focuses and aspects and then fine-tune them to obtain better performance on the downstream tasks. Due to the lack of publicly available datasets for multi-label classification of Chinese medical questions, we crawled questions from medical question/answer forums and manually annotated them using eight predefined classes: persons and organizations, symptom, cause, examination, disease, information, ingredient, and treatment. In total, we obtained 1,814 questions with 2,340 labels, with each question containing an average of 1.29 labels. We used Baidu Medical Encyclopedia as the knowledge resource. Two transformers, BERT and RoBERTa, were implemented to compare performance on our constructed datasets. Experimental results showed that our proposed model with the knowledge infusion mechanism achieves better performance regardless of which evaluation metric (Macro F1, Micro F1, Weighted F1 or Subset Accuracy) is considered.

Generative Adversarial Networks based on Mixed-Attentions for Citation Intent Classification in Scientific Publications
Yuh-Shyang Wang | Chao-Yi Chen | Lung-Hao Lee
Proceedings of the 33rd Conference on Computational Linguistics and Speech Processing (ROCLING 2021)

We propose the mixed-attention-based Generative Adversarial Network (named maGAN), and apply it to citation intent classification in scientific publications. We select domain-specific training data, propose a mixed-attention mechanism, and employ a generative adversarial network architecture for pre-training the language model and fine-tuning it on the downstream multi-class classification task. Experiments were conducted on the SciCite datasets to compare model performance. Our proposed maGAN model achieved the best Macro-F1 of 0.8532.

NCU-NLP at ROCLING-2021 Shared Task: Using MacBERT Transformers for Dimensional Sentiment Analysis
Man-Chen Hung | Chao-Yi Chen | Pin-Jung Chen | Lung-Hao Lee
Proceedings of the 33rd Conference on Computational Linguistics and Speech Processing (ROCLING 2021)

We use the MacBERT transformers and fine-tune them for the ROCLING-2021 shared task using the CVAT and CVAS data. We compare the performance of MacBERT with two other transformers, BERT and RoBERTa, in the valence and arousal dimensions, respectively. MAE and correlation coefficient (r) were used as evaluation metrics. On the ROCLING-2021 test set, our MacBERT model achieves an MAE of 0.611 and an r of 0.904 in the valence dimension, and an MAE of 0.938 and an r of 0.549 in the arousal dimension.

Classification of Tweets Self-reporting Adverse Pregnancy Outcomes and Potential COVID-19 Cases Using RoBERTa Transformers
Lung-Hao Lee | Man-Chen Hung | Chien-Huan Lu | Chang-Hao Chen | Po-Lei Lee | Kuo-Kai Shyu
Proceedings of the Sixth Social Media Mining for Health (#SMM4H) Workshop and Shared Task

This study describes our proposed model design for the SMM4H 2021 shared tasks. We fine-tune the language model of RoBERTa transformers and their connecting classifier to complete the classification tasks of tweets for adverse pregnancy outcomes (Task 4) and potential COVID-19 cases (Task 5). The evaluation metric is the F1-score of the positive class for both tasks. For Task 4, our best score of 0.93 exceeded the mean score of 0.925. For Task 5, our best score of 0.75 exceeded the mean score of 0.745.

NCUEE-NLP at MEDIQA 2021: Health Question Summarization Using PEGASUS Transformers
Lung-Hao Lee | Po-Han Chen | Yu-Xiang Zeng | Po-Lei Lee | Kuo-Kai Shyu
Proceedings of the 20th Workshop on Biomedical Language Processing

This study describes the model design of the NCUEE-NLP system for the MEDIQA challenge at the BioNLP 2021 workshop. We use the PEGASUS transformers and fine-tune them on the downstream summarization task using our collected and processed datasets. A total of 22 teams participated in the consumer health question summarization task of MEDIQA 2021. Each participating team was allowed to submit a maximum of ten runs. Our best submission, achieving a ROUGE2-F1 score of 0.1597, ranked third among all 128 submissions.

2020

Medication Mention Detection in Tweets Using ELECTRA Transformers and Decision Trees
Lung-Hao Lee | Po-Han Chen | Hao-Chuan Kao | Ting-Chun Hung | Po-Lei Lee | Kuo-Kai Shyu
Proceedings of the Fifth Social Media Mining for Health Applications Workshop & Shared Task

This study describes our proposed model design for the SMM4H 2020 Task 1. We fine-tune ELECTRA transformers using our trained SVM filter for data augmentation, along with decision trees to detect medication mentions in tweets. Our best F1-score of 0.7578 exceeded the mean score of 0.6646 across all 15 submitting teams.

International Journal of Computational Linguistics & Chinese Language Processing, Volume 25, Number 2, December 2020
Lung-Hao Lee | Kuan-Yu Chen

基於圖神經網路之中文健康照護命名實體辨識 (Chinese Healthcare Named Entity Recognition Based on Graph Neural Networks)
Yi Lu | Lung-Hao Lee
International Journal of Computational Linguistics & Chinese Language Processing, Volume 25, Number 2, December 2020

Gated Graph Sequence Neural Networks for Chinese Healthcare Named Entity Recognition
Yi Lu | Lung-Hao Lee
Proceedings of the 32nd Conference on Computational Linguistics and Speech Processing (ROCLING 2020)

Scientific Writing Evaluation Using Ensemble Multi-channel Neural Networks
Yuh-Shyang Wang | Lung-Hao Lee | Bo-Lin Lin | Liang-Chih Yu
Proceedings of the 32nd Conference on Computational Linguistics and Speech Processing (ROCLING 2020)

2019

NCUEE at MEDIQA 2019: Medical Text Inference Using Ensemble BERT-BiLSTM-Attention Model
Lung-Hao Lee | Yi Lu | Po-Han Chen | Po-Lei Lee | Kuo-Kai Shyu
Proceedings of the 18th BioNLP Workshop and Shared Task

This study describes the model design of the NCUEE system for the MEDIQA challenge at the ACL-BioNLP 2019 workshop. We use BERT (Bidirectional Encoder Representations from Transformers) as the word embedding method to integrate the BiLSTM (Bidirectional Long Short-Term Memory) network with an attention mechanism for medical text inferences. A total of 42 teams participated in the natural language inference task at MEDIQA 2019. Our best accuracy score of 0.84 ranked the top-third among all submissions on the leaderboard.

2018

Multilingual Short Text Responses Clustering for Mobile Educational Activities: a Preliminary Exploration
Yuen-Hsien Tseng | Lung-Hao Lee | Yu-Ta Chien | Chun-Yen Chang | Tsung-Yen Li
Proceedings of the 5th Workshop on Natural Language Processing Techniques for Educational Applications

Text clustering is a powerful technique to detect topics from document corpora, so as to provide information browsing, analysis, and organization. On the other hand, the Instant Response System (IRS) has been widely used in recent years to enhance student engagement in class and thus improve their learning effectiveness. However, the lack of functions to process short text responses from the IRS prevents the further application of IRS in classes. Therefore, this study aims to propose a proper short text clustering module for the IRS, and demonstrate our implemented techniques through real-world examples, so as to provide experiences and insights for further study. In particular, we have compared three clustering methods and the result shows that theoretically better methods need not lead to better results, as there are various factors that may affect the final performance.

Building a TOCFL Learner Corpus for Chinese Grammatical Error Diagnosis
Lung-Hao Lee | Yuen-Hsien Tseng | Li-Ping Chang
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2017

Proceedings of the 4th Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA 2017)
Yuen-Hsien Tseng | Hsin-Hsi Chen | Lung-Hao Lee | Liang-Chih Yu

The NTNU System at SemEval-2017 Task 10: Extracting Keyphrases and Relations from Scientific Publications Using Multiple Conditional Random Fields
Lung-Hao Lee | Kuei-Ching Lee | Yuen-Hsien Tseng
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

This study describes the design of the NTNU system for the ScienceIE task at the SemEval 2017 workshop. We use self-defined feature templates and multiple conditional random fields with extracted features to identify keyphrases along with categorized labels and their relations from scientific publications. A total of 16 teams participated in evaluation scenario 1 (subtasks A, B, and C), with only 7 teams competing in all subtasks. Our best micro-averaged F1 across the three subtasks is 0.23, ranking in the middle among all 16 submissions.

IJCNLP-2017 Task 1: Chinese Grammatical Error Diagnosis
Gaoqi Rao | Baolin Zhang | Endong Xun | Lung-Hao Lee
Proceedings of the IJCNLP 2017, Shared Tasks

This paper presents the IJCNLP 2017 shared task for Chinese grammatical error diagnosis (CGED), which seeks to identify grammatical error types and their range of occurrence within sentences written by learners of Chinese as a foreign language. We describe the task definition, data preparation, performance metrics, and evaluation results. Of the 13 teams registered for this shared task, 5 teams developed systems and submitted a total of 13 runs. We expected that this evaluation campaign could lead to the development of more advanced NLP techniques for educational applications, especially for Chinese error detection. All data sets with gold standards and scoring scripts are made publicly available to researchers.

IJCNLP-2017 Task 2: Dimensional Sentiment Analysis for Chinese Phrases
Liang-Chih Yu | Lung-Hao Lee | Jin Wang | Kam-Fai Wong
Proceedings of the IJCNLP 2017, Shared Tasks

This paper presents the IJCNLP 2017 shared task on Dimensional Sentiment Analysis for Chinese Phrases (DSAP), which seeks to identify a real-valued sentiment score of Chinese single words and multi-word phrases in both the valence and arousal dimensions. Valence represents the degree of pleasant and unpleasant (or positive and negative) feelings, and arousal represents the degree of excitement and calm. Of the 19 teams registered for this shared task for two-dimensional sentiment analysis, 13 submitted results. We expected that this evaluation campaign could produce more advanced dimensional sentiment analysis techniques, especially for Chinese affective computing. All data sets with gold standards and scoring scripts are made publicly available to researchers.

2016

The NTNU-YZU System in the AESW Shared Task: Automated Evaluation of Scientific Writing Using a Convolutional Neural Network
Lung-Hao Lee | Bo-Lin Lin | Liang-Chih Yu | Yuen-Hsien Tseng
Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications

Overview of NLP-TEA 2016 Shared Task for Chinese Grammatical Error Diagnosis
Lung-Hao Lee | Gaoqi Rao | Liang-Chih Yu | Endong Xun | Baolin Zhang | Li-Ping Chang
Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)

This paper presents the NLP-TEA 2016 shared task for Chinese grammatical error diagnosis, which seeks to identify grammatical error types and their range of occurrence within sentences written by learners of Chinese as a foreign language. We describe the task definition, data preparation, performance metrics, and evaluation results. Of the 15 teams registered for this shared task, 9 teams developed systems and submitted a total of 36 runs. We expected that this evaluation campaign could lead to the development of more advanced NLP techniques for educational applications, especially for Chinese error detection. All data sets with gold standards and scoring scripts are made publicly available to researchers.

This paper proposes a method to construct an evaluation dataset from microblogs for the development of recommendation systems. We extract the relationships among three main entities in a recommendation event, i.e., who recommends what to whom. User-to-user friend relationships and user-to-resource interest relationships in social media and resource-to-metadata descriptions in an external ontology are employed. In the experiments, the resources are restricted to visual entertainment media, movies in particular. A sequence of ground truths varying with time is generated, which reflects the dynamics of the real world.

Traditional Chinese Parsing Evaluation at SIGHAN Bake-offs 2012
Yuen-Hsien Tseng | Lung-Hao Lee | Liang-Chih Yu
Proceedings of the Second CIPS-SIGHAN Joint Conference on Chinese Language Processing

2009

CWN-LMF: Chinese WordNet in the Lexical Markup Framework
Lung-Hao Lee | Shu-Kai Hsieh | Chu-Ren Huang
Proceedings of the 7th Workshop on Asian Language Resources (ALR7)

Chinese WordNet Domains: Bootstrapping Chinese WordNet with Semantic Domain Labels
Lung-Hao Lee | Yu-Ting Yu | Chu-Ren Huang
Proceedings of the 23rd Pacific Asia Conference on Language, Information and Computation, Volume 1

2008

Quality Assurance of Automatic Annotation of Very Large Corpora: a Study based on heterogeneous Tagging System
Chu-Ren Huang | Lung-Hao Lee | Wei-guang Qu | Jia-Fei Hong | Shiwen Yu
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

We propose a set of heuristics for efficiently improving the annotation quality of very large corpora. The Xinhua News portion of the Chinese Gigaword Corpus was tagged independently with both the Peking University ICL tagset and the Academia Sinica CKIP tagset. The corpus-based POS tag mapping will serve as the basis for a possible contrast between the grammatical systems of the PRC and Taiwan. It can also serve as the basic model for mapping between the CKIP and ICL tagging systems for any data.

Contrastive Approach towards Text Source Classification based on Top-Bag-of-Word Similarity
Chu-Ren Huang | Lung-Hao Lee
Proceedings of the 22nd Pacific Asia Conference on Language, Information and Computation