An analysis of translation divergence patterns using PanLex translation pairs
Abstract
This analysis examines the patterns of translation divergence that occur in high- and low-frequency verbs, and tests the hypothesis that high-frequency verbs are more prone to translation divergences than low-frequency ones. Four types of divergence were considered: Thematic, Conflational, Categorial, and Structural (Dorr, 1990), with samples from three language pairs: Italian to French, Italian to English, and English to Thai. The analysis also evaluates whether the online multilingual dictionary PanLex (Baldwin et al., 2010) can be used to derive transfer rules automatically. This is part of a larger effort to create a machine translation system based on customizable language-specific grammars for both source and target languages, using semantic representations in the format of Minimal Recursion Semantics (MRS; Copestake et al., 2005) as the input and output of the transfer stage. Based on the samples analyzed, the evaluation suggests that manual creation of transfer rules and tweaking of automatically derived rules would be most needed for high-frequency verbs, while low-frequency verbs seem likely to have a lower translation divergence error rate.