Aysu Ezen-Can


2021

pdf bib
Error-Sensitive Evaluation for Ordinal Target Variables
David Chen | Maury Courtland | Adam Faulkner | Aysu Ezen-Can
Proceedings of the 2nd Workshop on Evaluation and Comparison of NLP Systems

Product reviews and satisfaction surveys seek customer feedback in the form of ranked scales. In these settings, widely used evaluation metrics including F1 and accuracy ignore the rank in the responses (e.g., ‘very likely’ is closer to ‘likely’ than ‘not at all’). In this paper, we hypothesize that the order of class values is important for evaluating classifiers on ordinal target variables and should not be disregarded. To test this hypothesis, we compared Multi-class Classification (MC) and Ordinal Regression (OR) by applying OR and MC to benchmark tasks involving ordinal target variables using the same underlying model architecture. Experimental results show that while MC outperformed OR for some datasets in accuracy and F1, OR is significantly better than MC for minimizing the error between prediction and target for all benchmarks, as revealed by error-sensitive metrics, e.g. mean-squared error (MSE) and Spearman correlation. Our findings motivate the need to establish consistent, error-sensitive metrics for evaluating benchmarks with ordinal target variables, and we hope that it stimulates interest in exploring alternative losses for ordinal problems.

2019

pdf bib
Hybrid RNN at SemEval-2019 Task 9: Blending Information Sources for Domain-Independent Suggestion Mining
Aysu Ezen-Can | Ethem F. Can
Proceedings of the 13th International Workshop on Semantic Evaluation

Social media has an increasing amount of information that both customers and companies can benefit from. These social media posts can include Tweets or be in the form of vocalization of complements and complaints (e.g., reviews) of a product or service. Researchers have been actively mining this invaluable information source to automatically generate insights. Mining sentiments of customer reviews is an example that has gained momentum due to its potential to gather information that customers are not happy about. Instead of reading millions of reviews, companies prefer sentiment analysis to obtain feedback and to improve their products or services. In this work, we aim to identify information that companies can act on, or other customers can utilize for making their own experience better. This is different from identifying if reviews of a product or service is negative, positive, or neutral. To that end, we classify sentences of a given review as suggestion or not suggestion so that readers of the reviews do not have to go through thousands of reviews but instead can focus on actionable items and applicable suggestions. To identify suggestions within reviews, we employ a hybrid approach that utilizes a recurrent neural network (RNN) along with rule-based features to build a domain-independent suggestion mining model. In this way, a model trained on electronics reviews is used to extract suggestions from hotel reviews.

2018

pdf bib
RNN for Affects at SemEval-2018 Task 1: Formulating Affect Identification as a Binary Classification Problem
Aysu Ezen-Can | Ethem F. Can
Proceedings of the 12th International Workshop on Semantic Evaluation

Written communication lacks the multimodal features such as posture, gesture and gaze that make it easy to model affective states. Especially in social media such as Twitter, due to the space constraints, the sources of information that can be mined are even more limited due to character limitations. These limitations constitute a challenge for understanding short social media posts. In this paper, we present an approach that utilizes multiple binary classifiers that represent different affective categories to model Twitter posts (e.g., tweets). We train domain-independent recurrent neural network models without any outside information such as affect lexicons. We then use these domain independent binary ranking models to evaluate the applicability of such deep learning models on the affect identification task. This approach allows different model architectures and parameter settings for each affect category instead of building one single multi-label classifier. The contributions of this paper are two-folds: we show that modeling tweets with a small training set is possible with the use of RNNs and we also prove that formulating affect identification as a binary classification task is highly effective.

2014

pdf bib
Combining Task and Dialogue Streams in Unsupervised Dialogue Act Models
Aysu Ezen-Can | Kristy Boyer
Proceedings of the 15th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL)

2013

pdf bib
In-Context Evaluation of Unsupervised Dialogue Act Models for Tutorial Dialogue
Aysu Ezen-Can | Kristy Boyer
Proceedings of the SIGDIAL 2013 Conference