Stefan Conrad


2024

Using Discourse Connectives to Test Genre Bias in Masked Language Models
Heidrun Dorgeloh | Lea Kawaletz | Simon Stein | Regina Stodden | Stefan Conrad
Proceedings of the 5th Workshop on Computational Approaches to Discourse (CODI 2024)

This paper presents evidence for an effect of genre on the use of discourse connectives in argumentation. Drawing from discourse processing research on reasoning-based structures, we use fill-mask computation to measure genre-induced expectations of argument realisation, and beta regression to model the probabilities of these realisations against a set of predictors. Contrasting fill-mask probabilities for the presence or absence of a discourse connective in baseline and finetuned language models reveals that genre introduces biases for the realisation of argument structure. These outcomes suggest that cross-domain discourse processing, but also argument mining, should take into account generalisations about specific features, such as connectives, and their probability in relation to the genre context.

2023

HHU at SemEval-2023 Task 3: An Adapter-based Approach for News Genre Classification
Fabian Billert | Stefan Conrad
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

This paper describes our approach for Subtask 1 of Task 3 at SemEval-2023. In this subtask, participants were asked to classify multilingual news articles into one of three classes: Reporting, Opinion Piece or Satire. By training an AdapterFusion layer composing the task-adapters from different languages, we successfully combine the language-exclusive knowledge and show that this improves the results in nearly all cases, including in zero-shot scenarios.

Team HHU at the FinNLP-2023 ML-ESG Task: A Multi-Model Approach to ESG-Key-Issue Classification
Fabian Billert | Stefan Conrad
Proceedings of the Fifth Workshop on Financial Technology and Natural Language Processing and the Second Multimodal AI For Financial Forecasting

Exploring Knowledge Composition for ESG Impact Type Determination
Fabian Billert | Stefan Conrad
Proceedings of the Sixth Workshop on Financial Technology and Natural Language Processing

In this paper, we discuss our (Team HHU’s) submission to the Multi-Lingual ESG Impact Type Identification task (ML-ESG-2). The goal of this task is to determine whether an ESG-related news article represents an opportunity or a risk. We use an adapter-based framework to train multiple adapter modules which capture different parts of the knowledge present in the training data. Experimenting with various Adapter Fusion setups, we focus both on combining the ESG-aspect-specific knowledge and on combining the language-specific knowledge. Our results show that in both cases, it is possible to effectively compose the knowledge in order to improve the impact type determination.

2022

Developing an argument annotation scheme based on a semantic classification of arguments
Lea Kawaletz | Heidrun Dorgeloh | Stefan Conrad | Zeljko Bekcic
Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue

Corpora of argumentative discourse are commonly analyzed in terms of argumentative units, consisting of claims and premises. Both argument detection and classification are complex discourse processing tasks. Our paper introduces a semantic classification of arguments that can help to facilitate argument detection. We report on our experiences with corpus annotations using a function-based classification of arguments and a procedure for operationalizing the scheme by using semantic templates.

2021

Citizen Involvement in Urban Planning - How Can Municipalities Be Supported in Evaluating Public Participation Processes for Mobility Transitions?
Julia Romberg | Stefan Conrad
Proceedings of the 8th Workshop on Argument Mining

Public participation processes allow citizens to engage in municipal decision-making processes by expressing their opinions on specific issues. Municipalities often only have limited resources to analyze a possibly large amount of textual contributions that need to be evaluated in a timely and detailed manner. Automated support for the evaluation is therefore essential, e.g. to analyze arguments. In this paper, we address (A) the identification of argumentative discourse units and (B) their classification as major position or premise in German public participation processes. The objective of our work is to make argument mining viable for use in municipalities. We compare different argument mining approaches and develop a generic model that can successfully detect argument structures in different datasets of mobility-related urban planning. We introduce a new data corpus comprising five public participation processes. In our evaluation, we achieve high macro F1 scores (0.76 - 0.80 for the identification of argumentative units; 0.86 - 0.93 for their classification) on all datasets. Additionally, we improve previous results for the classification of argumentative units on a similar German online participation dataset.

Combining text and vision in compound semantics: Towards a cognitively plausible multimodal model
Abhijeet Gupta | Fritz Günther | Ingo Plag | Laura Kallmeyer | Stefan Conrad
Proceedings of the 17th Conference on Natural Language Processing (KONVENS 2021)

2020

Annotating Patient Information Needs in Online Diabetes Forums
Julia Romberg | Jan Dyczmons | Sandra Olivia Borgmann | Jana Sommer | Markus Vomhof | Cecilia Brunoni | Ismael Bruck-Ramisch | Luis Enders | Andrea Icks | Stefan Conrad
Proceedings of the Fifth Social Media Mining for Health Applications Workshop & Shared Task

Identifying patient information needs is an important issue for health care services and implementation of patient-centered care. A relevant number of people with diabetes mellitus experience a need for information during the course of the disease. Health-related online forums are a promising option for researching relevant information needs closely related to everyday life. In this paper, we present a novel data corpus comprising 4,664 contributions from a German-language online diabetes forum. Two annotation tasks were implemented. First, the contributions were categorised according to whether they contain a diabetes-specific information need or not, where the latter covers both a non-diabetes-specific information need and no information need at all, resulting in an agreement of 0.89 (Krippendorff’s α). Moreover, the textual content of diabetes-specific information needs was segmented and labeled using a well-founded definition of health-related information needs, which achieved a promising agreement of 0.82 (Krippendorff’s αu). We further report a baseline for two sub-tasks of the information extraction system planned for the long term: contribution categorization and segment classification.

2019

HHU at SemEval-2019 Task 6: Context Does Matter - Tackling Offensive Language Identification and Categorization with ELMo
Alexander Oberstrass | Julia Romberg | Anke Stoll | Stefan Conrad
Proceedings of the 13th International Workshop on Semantic Evaluation

We present our results for OffensEval: Identifying and Categorizing Offensive Language in Social Media (SemEval 2019 - Task 6). Our results show that context embeddings are important features for the three different sub-tasks, in connection with both classical machine learning and deep learning. Our best model ranked 3rd of 75 in sub-task B with a macro F1 of 0.719. Our approaches for sub-tasks A and C perform less well but still deliver promising results.

2018

HHU at SemEval-2018 Task 12: Analyzing an Ensemble-based Deep Learning Approach for the Argument Mining Task of Choosing the Correct Warrant
Matthias Liebeck | Andreas Funke | Stefan Conrad
Proceedings of the 12th International Workshop on Semantic Evaluation

This paper describes our participation in SemEval-2018 Task 12, the Argument Reasoning Comprehension Task, which calls for systems that, given a reason and a claim, predict the correct warrant from two opposing options. We decided to use a deep learning architecture and combined 623 models with different hyperparameters into an ensemble. Our extensive analysis of our architecture and ensemble reveals that the decision to use an ensemble was suboptimal. Additionally, we benchmark a support vector machine as a baseline. Furthermore, we experimented with an alternative data split and achieved more stable results.

2017

HHU at SemEval-2017 Task 5: Fine-Grained Sentiment Analysis on Financial Data using Machine Learning Methods
Tobias Cabanski | Julia Romberg | Stefan Conrad
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

In this paper, a system for solving SemEval-2017 Task 5 is presented. This task is divided into two tracks in which the sentiment of microblog messages and news headlines has to be predicted. Since two submissions were allowed, two different machine learning methods were developed to solve this task: a support vector machine approach and a recurrent neural network approach. To feed data into these approaches, different feature extraction methods are used, mainly word representations and lexica. The best submissions for both tracks are provided by the recurrent neural network, which achieves an F1-score of 0.729 in track 1 and 0.702 in track 2.

2016

What to Do with an Airport? Mining Arguments in the German Online Participation Project Tempelhofer Feld
Matthias Liebeck | Katharina Esau | Stefan Conrad
Proceedings of the Third Workshop on Argument Mining (ArgMining2016)

HHU at SemEval-2016 Task 1: Multiple Approaches to Measuring Semantic Textual Similarity
Matthias Liebeck | Philipp Pollack | Pashutan Modaresi | Stefan Conrad
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

2015

IWNLP: Inverse Wiktionary for Natural Language Processing
Matthias Liebeck | Stefan Conrad
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

2013

Opinion Mining in Newspaper Articles by Entropy-Based Word Connections
Thomas Scholz | Stefan Conrad
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing