Klara Ceberio


2010

A Morphological Processor Based on Foma for Biscayan (a Basque dialect)
Iñaki Alegria | Garbiñe Aranbarri | Klara Ceberio | Gorka Labaka | Bittor Laskurain | Ruben Urizar
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

We present a new morphological processor for Biscayan, a dialect of Basque, built on the description of the morphology of standard Basque. The database for the standard morphology has been extended to cover dialects, and foma, an open-source tool for morphological description, is used to build the processor. Biscayan is spoken mainly in Biscay, a province in the west of the Basque Country. The lexicon and the morphotactics (or word grammar) of standard Basque were described using a relational database, which has been extended to include dialectal variants linked to the standard entries. XuxenB, a spelling checker/corrector for this dialect, is the first application of this work. In addition to the basic analyzer used for spelling, a new transducer is included: an enhanced analyzer that links dialectal forms with the corresponding standard ones. It is used during correction to generate proposals when the input text contains standard forms that we want to replace with dialectal forms.

2008

Spelling Correction: from Two-Level Morphology to Open Source
Iñaki Alegria | Klara Ceberio | Nerea Ezeiza | Aitor Soroa | Gregorio Hernandez
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

Basque is a highly inflected and agglutinative language (Alegria et al., 1996). Two-level morphology has been applied successfully to this kind of language, and two-level-based descriptions exist for very different languages. Once the morphological description of a language is available, it is easy to develop a spelling checker/corrector for that language. However, what happens if we want to use the speller in the “free world” (OpenOffice, Mozilla, emacs, LaTeX, etc.)? Ispell and similar tools (aspell, hunspell, myspell) are the usual mechanisms for these purposes, but they do not fit the two-level model. In the absence of mechanisms based on two-level morphology, this paper describes an automatic conversion from a two-level description to hunspell.