Matthias Richter


2008

Tapping Huge Temporally Indexed Textual Resources with WCTAnalyze
Sebastian Gottwald | Matthias Richter | Gerhard Heyer | Gerik Scheuermann
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

WCTAnalyze is a tool for storing, accessing and visually analyzing huge collections of temporally indexed data. It is motivated by applications such as media analysis and business intelligence, where higher-level analysis is performed on top of linguistically and statistically processed unstructured textual data. WCTAnalyze combines fast access with economical storage behaviour and provides many built-in visualization options for presenting results in detail as well as in contrast. It thus enables an efficient and effective way to explore chronological text patterns of word forms, their co-occurrence sets and co-occurrence set intersections. Digging deeper into the co-occurrences of word forms that share a semantic or syntactic description, some entities can be recognized as temporally related, whereas others differ significantly. This behaviour motivates approaches to interactively discovering events based on co-occurrence subsets.

2007

Íslenskur Orðasjóður – Building a Large Icelandic Corpus
Erla Hallsteinsdóttir | Thomas Eckart | Chris Biemann | Uwe Quasthoff | Matthias Richter
Proceedings of the 16th Nordic Conference of Computational Linguistics (NODALIDA 2007)

2006

Corpus Portal for Search in Monolingual Corpora
Uwe Quasthoff | Matthias Richter | Christian Biemann
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

A simple and flexible schema for storing and presenting monolingual language resources is proposed. Data sets of various sizes are already available in this format for 18 different languages. The data is provided free of charge for online use and download. The main goal is to ease the application of algorithms for monolingual and interlingual studies.