News Signals: An NLP Library for Text and Time Series

Chris Hokamp, Demian Ghalandari, Parsa Ghaffari


Abstract
We present an open-source Python library for building and using datasets where inputs are clusters of textual data, and outputs are sequences of real values representing one or more time series signals. The news-signals library supports diverse data science and NLP problem settings related to the prediction of time series behaviour using textual data feeds. For example, in the news domain, inputs are document clusters corresponding to daily news articles about a particular entity, and targets are explicitly associated real-valued time series, such as the volume of news about a particular person or company, or the number of pageviews of specific Wikimedia pages. Despite many industry and research use cases for this class of problem settings, to the best of our knowledge, News Signals is the only open-source library designed specifically to facilitate data science and research settings with natural language inputs and time series targets. In addition to the core codebase for building and interacting with datasets, we also conduct a suite of experiments using several popular machine learning libraries, which are used to establish baselines for time series anomaly prediction using textual inputs.
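To make the problem setting in the abstract concrete, the following is a minimal sketch of a dataset where each day pairs a cluster of documents (input) with a real-valued target (output), plus a simple anomaly labelling rule. It uses only the Python standard library and does not use or reproduce the news-signals API. The names `DailyRecord` and `label_anomalies`, the trailing-window z-score rule, and the toy data are all assumptions made for illustration; the paper's own anomaly definition may differ.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from statistics import mean, pstdev
from typing import List


@dataclass
class DailyRecord:
    """Hypothetical container: one day's text cluster and its numeric target."""
    day: date
    documents: List[str] = field(default_factory=list)  # input: the day's article texts
    target: float = 0.0  # output: e.g. news volume or pageviews for the entity


def label_anomalies(records, window=7, threshold=2.0):
    """Flag days whose target is more than `threshold` standard deviations
    above the mean of the preceding `window` days (an assumed rule)."""
    labels = []
    for i, rec in enumerate(records):
        history = [r.target for r in records[max(0, i - window):i]]
        if len(history) < window:
            labels.append(False)  # not enough history to judge
            continue
        mu, sigma = mean(history), pstdev(history)
        z = (rec.target - mu) / sigma if sigma > 0 else 0.0
        labels.append(z > threshold)
    return labels


# Toy data: ten days of made-up volumes with one spike on the eighth day.
volumes = [10, 12, 11, 9, 10, 11, 10, 50, 10, 11]
start = date(2023, 1, 1)
records = [
    DailyRecord(
        day=start + timedelta(days=i),
        documents=[f"placeholder article {j} for day {i}" for j in range(2)],
        target=float(v),
    )
    for i, v in enumerate(volumes)
]

labels = label_anomalies(records)
anomalous_days = [r.day for r, flag in zip(records, labels) if flag]
print(anomalous_days)
```

In a prediction setting of the kind the abstract describes, a model would read the `documents` available up to a given day and try to predict the anomaly label of a later day, rather than computing the label from the target directly as done here.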
Anthology ID:
2023.nlposs-1.21
Volume:
Proceedings of the 3rd Workshop for Natural Language Processing Open Source Software (NLP-OSS 2023)
Month:
December
Year:
2023
Address:
Singapore
Editors:
Liling Tan, Dmitrijs Milajevs, Geeticka Chauhan, Jeremy Gwinnup, Elijah Rippeth
Venues:
NLPOSS | WS
Publisher:
Association for Computational Linguistics
Pages:
179–189
URL:
https://aclanthology.org/2023.nlposs-1.21
DOI:
10.18653/v1/2023.nlposs-1.21
Cite (ACL):
Chris Hokamp, Demian Ghalandari, and Parsa Ghaffari. 2023. News Signals: An NLP Library for Text and Time Series. In Proceedings of the 3rd Workshop for Natural Language Processing Open Source Software (NLP-OSS 2023), pages 179–189, Singapore. Association for Computational Linguistics.
Cite (Informal):
News Signals: An NLP Library for Text and Time Series (Hokamp et al., NLPOSS-WS 2023)
PDF:
https://aclanthology.org/2023.nlposs-1.21.pdf