Svetlana Sheremetyeva

2020

Towards Creating Interoperable Resources for Conceptual Annotation of Multilingual Domain Corpora
Svetlana Sheremetyeva
Proceedings of the 16th Joint ACL-ISO Workshop on Interoperable Semantic Annotation

In this paper we focus on the creation of interoperable annotation resources that make up a significant proportion of an ongoing project on the development of conceptually annotated multilingual corpora for the domain of terrorist attacks in three languages (English, French and Russian) that can be used for comparative linguistic research, intelligent content and trend analysis, summarization, machine translation, etc. Conceptual annotation is understood as a type of task-oriented domain-specific semantic annotation. The annotation process in our project relies on ontological analysis. The paper details the development of both static and dynamic resources, such as a universal conceptual annotation scheme, a multilingual domain ontology and a multipurpose annotation platform with flexible settings, which can be used to automate conceptual resource acquisition and the annotation process, as well as to document the specificities of the annotated corpora. The resources constructed in the course of the research are also to be used for developing concept disambiguation metrics by means of qualitative and quantitative analysis of the gold-standard portion of the conceptually annotated multilingual corpora and of the linguistic knowledge in the annotation platform.

In this paper, we present a methodology for the development of interactive domain-tuned patent tools for generating patent claims in English from non-English interfaces. The methodology is based on merging an interactive English-to-English patent claim generator, AutoPat, with any external MT engine that might be appropriate for a given language. The translation procedure is reduced to translating words and phrases rather than a complex claim sentence. The approach has been successfully used in the J-E patent system, a patent claim generator in English from a Japanese-only interface, and in Dan-Pat, a similar tool for the Danish-English language pair. The two systems use different MT engines but share a similar overall architecture. The methodology is portable to other languages and MT engines.

“Less, Easier and Quicker” in Language Acquisition for Patent MT
Svetlana Sheremetyeva
Workshop on patent translation

The paper describes some ways to save on knowledge acquisition when developing MT systems for patents: reducing the size of the resources to be acquired, and creating intelligent software that handles knowledge and speeds up access to it. The approach is illustrated by knowledge acquisition and maintenance in the APTrans system for translating patent claims. Domain-tuned resources are based on contrastive studies of multilingual patent documents and are handled by an electronic dictionary with a powerful user-friendly environment for acquisition, editing, browsing, defaulting and coherence proofing.

2004

A Flexible Language Acquisition Tool Kit for Natural Language Processing
Svetlana Sheremetyeva
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

Application Adaptive Electronic Dictionary with Intelligent Interface
Svetlana Sheremetyeva
Proceedings of the Workshop on Enhancing and Using Electronic Dictionaries

2003

Natural Language Analysis of Patent Claims
Svetlana Sheremetyeva
Proceedings of the ACL-2003 Workshop on Patent Corpus Processing

2002

An MT learning environment for computational linguistics students
Svetlana Sheremetyeva
Proceedings of the 6th EAMT Workshop: Teaching Machine Translation

2000

Acquisition of a Language Computational Model for NLP
Svetlana Sheremetyeva | Sergei Nirenburg
COLING 2000 Volume 2: The 18th International Conference on Computational Linguistics

Towards A Universal Tool For NLP Resource Acquisition
Svetlana Sheremetyeva | Sergei Nirenburg
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)

1999

Interactive MT as support for non-native language authoring
Svetlana Sheremetyeva | Sergei Nirenburg
Proceedings of Machine Translation Summit VII

The paper describes an approach to developing an interactive MT system for translating technical texts, using the translation of patent claims between Russian and English as an example. The approach conforms to the human-aided machine translation paradigm. The system is meant for a source language (SL) speaker who does not know the target language (TL). It consists of i) an analysis module, which includes a submodule for interactive syntactic analysis of the SL text and a submodule for fully automated morphological analysis, ii) an automatic module for transferring the lexical and partially syntactic content of the SL text into corresponding content of the TL text, and iii) a fully automated TL text generation module, which relies on knowledge about the legal format of TL patent claims. The interactive analysis module guides the user through a sequence of SL analysis procedures, as a result of which the system produces a set of internal knowledge structures that serve as input to TL text generation. Both analysis and generation rely heavily on the analysis of the sublanguage of patent claims. The model has been developed for English and Russian as both SLs and TLs but is readily extensible to other languages.