A Rich Morphological Tagger for English: Exploring the Cross-Linguistic Tradeoff Between Morphology and Syntax

Christo Kirov, John Sylak-Glassman, Rebecca Knowles, Ryan Cotterell, Matt Post

April 2017

Abstract

A traditional claim in linguistics is that all human languages are equally expressive—able to convey the same wide range of meanings. Morphologically rich languages, such as Czech, rely on overt inflectional and derivational morphology to convey many semantic distinctions. Languages with comparatively limited morphology, such as English, should be able to accomplish the same using a combination of syntactic and contextual cues. We capitalize on this idea by training a tagger for English that uses syntactic features obtained by automatic parsing to recover complex morphological tags projected from Czech. The high accuracy of the resulting model provides quantitative confirmation of the underlying linguistic hypothesis of equal expressivity, and bodes well for future improvements in downstream HLT tasks including machine translation.

Type

Conference paper

Publication

Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics
