Neural Morphological Analysis: Encoding-Decoding Canonical Segments

Abstract

Canonical morphological segmentation aims to divide words into a sequence of standardized segments. In this work, we propose a character-based neural encoder-decoder model for this task. Additionally, we extend our model to include morpheme-level and lexical information through a neural reranker. We set a new state of the art for the task, improving on previous results by up to 21% accuracy. Our experiments cover three languages: English, German, and Indonesian.
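
To make the task concrete, the following minimal sketch shows one way a word and its canonical segmentation could be framed as a character-level source/target pair for an encoder-decoder. It is an illustration only, not the paper's implementation: the boundary symbol `|`, the helper names, and the example word with its segmentation are assumptions chosen for clarity.

```python
from typing import List

# Hypothetical boundary symbol separating canonical segments in the
# target sequence; the actual symbol used by the model is not stated here.
BOUNDARY = "|"


def to_source(word: str) -> List[str]:
    """Split a word into the character sequence fed to the encoder."""
    return list(word)


def to_target(segments: List[str]) -> List[str]:
    """Join canonical segments with the boundary symbol and split the
    result into the character sequence the decoder should produce."""
    return list(BOUNDARY.join(segments))


def from_target(symbols: List[str]) -> List[str]:
    """Recover the list of segments from a decoded character sequence."""
    return "".join(symbols).split(BOUNDARY)


# Illustrative example: the canonical segments restore standardized
# spellings, so they need not be substrings of the surface word.
word = "achievability"
canonical = ["achieve", "able", "ity"]

source = to_source(word)
target = to_target(canonical)

print("".join(source), "->", "".join(target))
```

Because the target restores standardized forms ("achieve" and "able" rather than "achiev" and "abil"), the output is not a simple split of the input string, which is why a sequence-to-sequence formulation fits the task.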

Publication
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing