Guido Zuccon


2023

pdf bib
Open-source Large Language Models are Strong Zero-shot Query Likelihood Models for Document Ranking
Shengyao Zhuang | Bing Liu | Bevan Koopman | Guido Zuccon
Findings of the Association for Computational Linguistics: EMNLP 2023

In the field of information retrieval, Query Likelihood Models (QLMs) rank documents based on the probability of generating the query given the content of a document. Recently, advanced large language models (LLMs) have emerged as effective QLMs, showcasing promising ranking capabilities. This paper focuses on investigating the genuine zero-shot ranking effectiveness of recent LLMs, which are solely pre-trained on unstructured text data without supervised instruction fine-tuning. Our findings reveal the robust zero-shot ranking ability of such LLMs, highlighting that additional instruction fine-tuning may hinder effectiveness unless a question generation task is present in the fine-tuning dataset. Furthermore, we introduce a novel state-of-the-art ranking system that integrates LLM-based QLMs with a hybrid zero-shot retriever, demonstrating exceptional effectiveness in both zero-shot and few-shot scenarios. We make our codebase publicly available at https://github.com/ielab/llm-qlm.

pdf bib
Dr ChatGPT tell me what I want to hear: How different prompts impact health answer correctness
Bevan Koopman | Guido Zuccon
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

This paper investigates the significant impact different prompts have on the behaviour of ChatGPT when used for health information seeking. As people more and more depend on generative large language models (LLMs) like ChatGPT, it is critical to understand model behaviour under different conditions, especially for domains where incorrect answers can have serious consequences such as health. Using the TREC Misinformation dataset, we empirically evaluate ChatGPT to show not just its effectiveness but reveal that knowledge passed in the prompt can bias the model to the detriment of answer correctness. We show this occurs both for retrieve-then-generate pipelines and based on how a user phrases their question as well as the question type. This work has important implications for the development of more robust and transparent question-answering systems based on generative large language models. Prompts, raw result files and manual analysis are made publicly available at https://github.com/ielab/drchatgpt-health_prompting.

2022

pdf bib
Guiding Neural Entity Alignment with Compatibility
Bing Liu | Harrisen Scells | Wen Hua | Guido Zuccon | Genghong Zhao | Xia Zhang
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Entity Alignment (EA) aims to find equivalent entities between two Knowledge Graphs (KGs). While numerous neural EA models have been devised, they are mainly learned using labelled data only. In this work, we argue that different entities within one KG should have compatible counterparts in the other KG due to the potential dependencies among the entities. Making compatible predictions thus should be one of the goals of training an EA model along with fitting the labelled data: this aspect however is neglected in current methods. To power neural EA models with compatibility, we devise a training framework by addressing three problems: (1) how to measure the compatibility of an EA model; (2) how to inject the property of being compatible into an EA model; (3) how to optimise parameters of the compatibility model. Extensive experiments on widely-used datasets demonstrate the advantages of integrating compatibility within EA models. In fact, state-of-the-art neural EA models trained within our framework using just 5% of the labelled data can achieve comparable effectiveness with supervised training using 20% of the labelled data.

2021

pdf bib
Dealing with Typos for BERT-based Passage Retrieval and Ranking
Shengyao Zhuang | Guido Zuccon
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Passage retrieval and ranking is a key task in open-domain question answering and information retrieval. Current effective approaches mostly rely on pre-trained deep language model-based retrievers and rankers. These methods have been shown to effectively model the semantic matching between queries and passages, also in presence of keyword mismatch, i.e. passages that are relevant to a query but do not contain important query keywords. In this paper we consider the Dense Retriever (DR), a passage retrieval method, and the BERT re-ranker, a popular passage re-ranking method. In this context, we formally investigate how these models respond and adapt to a specific type of keyword mismatch – that caused by keyword typos occurring in queries. Through empirical investigation, we find that typos can lead to a significant drop in retrieval and ranking effectiveness. We then propose a simple typos-aware training framework for DR and BERT re-ranker to address this issue. Our experimental results on the MS MARCO passage ranking dataset show that, with our proposed typos-aware training, DR and BERT re-ranker can become robust to typos in queries, resulting in significantly improved effectiveness compared to models trained without appropriately accounting for typos.

pdf bib
ActiveEA: Active Learning for Neural Entity Alignment
Bing Liu | Harrisen Scells | Guido Zuccon | Wen Hua | Genghong Zhao
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Entity Alignment (EA) aims to match equivalent entities across different Knowledge Graphs (KGs) and is an essential step of KG fusion. Current mainstream methods – neural EA models – rely on training with seed alignment, i.e., a set of pre-aligned entity pairs which are very costly to annotate. In this paper, we devise a novel Active Learning (AL) framework for neural EA, aiming to create highly informative seed alignment to obtain more effective EA models with less annotation cost. Our framework tackles two main challenges encountered when applying AL to EA: (1) How to exploit dependencies between entities within the AL strategy. Most AL strategies assume that the data instances to sample are independent and identically distributed. However, entities in KGs are related. To address this challenge, we propose a structure-aware uncertainty sampling strategy that can measure the uncertainty of each entity as well as its impact on its neighbour entities in the KG. (2) How to recognise entities that appear in one KG but not in the other KG (i.e., bachelors). Identifying bachelors would likely save annotation budget. To address this challenge, we devise a bachelor recognizer paying attention to alleviate the effect of sampling bias. Empirical results show that our proposed AL strategy can significantly improve sampling quality with good generality across different datasets, EA models and amount of bachelors.

pdf bib
Proceedings of the 19th Annual Workshop of the Australasian Language Technology Association
Afshin Rahimi | William Lane | Guido Zuccon
Proceedings of the 19th Annual Workshop of the Australasian Language Technology Association

2016

pdf bib
The Benefits of Word Embeddings Features for Active Learning in Clinical Information Extraction
Mahnoosh Kholghi | Lance De Vine | Laurianne Sitbon | Guido Zuccon | Anthony Nguyen
Proceedings of the Australasian Language Technology Association Workshop 2016

pdf bib
Building Evaluation Datasets for Consumer-Oriented Information Retrieval
Lorraine Goeuriot | Liadh Kelly | Guido Zuccon | Joao Palotti
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Common people often experience difficulties in accessing relevant, correct, accurate and understandable health information online. Developing search techniques that aid these information needs is challenging. In this paper we present the datasets created by CLEF eHealth Lab from 2013-2015 for evaluation of search solutions to support common people finding health information online. Specifically, the CLEF eHealth information retrieval (IR) task of this Lab has provided the research community with benchmarks for evaluating consumer-centered health information retrieval, thus fostering research and development aimed to address this challenging problem. Given consumer queries, the goal of the task is to retrieve relevant documents from the provided collection of web pages. The shared datasets provide a large health web crawl, queries representing people’s real world information needs, and relevance assessment judgements for the queries.

2015

pdf bib
Analysis of Word Embeddings and Sequence Features for Clinical Information Extraction
Lance De Vine | Mahnoosh Kholghi | Guido Zuccon | Laurianne Sitbon | Anthony Nguyen
Proceedings of the Australasian Language Technology Association Workshop 2015

2012

pdf bib
Semantic Judgement of Medical Concepts: Combining Syntagmatic and Paradigmatic Information with the Tensor Encoding Model
Michael Symonds | Guido Zuccon | Bevan Koopman | Peter Bruza | Anthony Nguyen
Proceedings of the Australasian Language Technology Association Workshop 2012