Michael Cysouw


2016

Concepticon: A Resource for the Linking of Concept Lists
Johann-Mattis List | Michael Cysouw | Robert Forkel
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

We present an attempt to link the large number of different concept lists that are used in the linguistic literature, ranging from Swadesh lists in historical linguistics to naming tests in clinical studies and psycholinguistics. This resource, our Concepticon, links 30,222 concept labels from 160 concept lists to 2,495 concept sets. Each concept set is given a unique identifier, a unique label, and a human-readable definition. Concept sets are further structured by defining different relations between the concepts. The resource can be used for various purposes. Serving as a rich reference for new and existing databases in diachronic and synchronic linguistics, it gives researchers quick access to studies on semantic change, cross-linguistic polysemies, and semantic associations.
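The linking described in this abstract can be pictured with a small data-structure sketch. This is only an illustration under stated assumptions, not the Concepticon's actual schema or API: the class name `ConceptSet`, the helper `labels_for`, the list names, and every identifier, label, and definition below are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptSet:
    """One concept set: unique identifier, unique label, readable definition."""
    identifier: int
    label: str
    definition: str


# Hypothetical concept sets (placeholder IDs and definitions, not real data).
concept_sets = {
    1: ConceptSet(1, "HAND", "Placeholder definition for the concept 'hand'."),
    2: ConceptSet(2, "ARM", "Placeholder definition for the concept 'arm'."),
}

# Each (concept list, concept label) pair is linked to one concept set ID.
links = {
    ("ListA", "hand"): 1,
    ("ListB", "main"): 1,
    ("ListA", "arm"): 2,
}


def labels_for(concept_set_id):
    """Return all (list, label) pairs linked to the given concept set."""
    return sorted(key for key, cid in links.items() if cid == concept_set_id)


print(labels_for(1))
```

Because labels from different lists point to the same concept set ID, a lookup like `labels_for(1)` collects every list entry that was judged to express that concept.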

2014

Creating a massively parallel Bible corpus
Thomas Mayer | Michael Cysouw
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

We present our ongoing effort to create a massively parallel Bible corpus. While an ever-increasing number of Bible translations is available in electronic form on the internet, there is no large-scale parallel Bible corpus that gives language researchers easy access to the texts and their parallel structure for a large variety of different languages. We report on the current status of the corpus, with over 900 translations in more than 830 language varieties. All translations are tokenized (e.g., separating punctuation marks) and Unicode-normalized. Mainly due to copyright restrictions, only portions of the texts are made publicly available. However, we provide co-occurrence information for each translation in a (sparse) matrix format. All word forms in the translation are given together with their frequency and the verses in which they occur.
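The word-by-verse co-occurrence information mentioned in this abstract could look roughly like the following sketch. It is a minimal illustration, not the corpus's actual file format or tooling: the function names, the verse IDs, the sample sentences, the choice of NFC as the normalization form, and the dict-of-dicts layout are all assumptions made for the example.

```python
import re
import unicodedata
from collections import defaultdict


def tokenize(text):
    """Normalize to NFC (assumed form) and split punctuation into own tokens."""
    text = unicodedata.normalize("NFC", text)
    return re.findall(r"\w+|[^\w\s]", text)


def build_cooccurrence(verses):
    """Return {word form: {verse ID: count}}, a sparse word-by-verse matrix."""
    matrix = defaultdict(dict)
    for verse_id, text in verses.items():
        for token in tokenize(text):
            matrix[token][verse_id] = matrix[token].get(verse_id, 0) + 1
    return dict(matrix)


def frequency(matrix, word):
    """Total frequency of a word form across all verses."""
    return sum(matrix.get(word, {}).values())


# Hypothetical toy input: placeholder verse IDs and invented sentences.
verses = {"v1": "The light, the dark.", "v2": "The light!"}
matrix = build_cooccurrence(verses)

print(matrix["light"], frequency(matrix, "light"))
```

Only non-zero cells are stored, so each word form carries both its frequency and the list of verses in which it occurs, without exposing the running text itself.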

2012

Proceedings of the EACL 2012 Joint Workshop of LINGVIS & UNCLH
Miriam Butt | Sheelagh Carpendale | Gerald Penn | Jelena Prokić | Michael Cysouw
Proceedings of the EACL 2012 Joint Workshop of LINGVIS & UNCLH

Introduction
Miriam Butt | Jelena Prokić | Thomas Mayer | Michael Cysouw
Proceedings of the EACL 2012 Joint Workshop of LINGVIS & UNCLH

Language comparison through sparse multilingual word alignment
Thomas Mayer | Michael Cysouw
Proceedings of the EACL 2012 Joint Workshop of LINGVIS & UNCLH

2007

Cognate Identification and Alignment Using Practical Orthographies
Michael Cysouw | Hagen Jung
Proceedings of Ninth Meeting of the ACL Special Interest Group in Computational Morphology and Phonology