Incremental Learning for Fully Unsupervised Word Segmentation Using Penalized Likelihood and Model Selection

07/20/2016

We present a novel incremental learning approach to unsupervised word segmentation that combines features of probabilistic modeling and model selection. The approach includes super-additive penalties that address the cognitive burden imposed by long word formation, and new model selection criteria based on higher-order generative assumptions. It is fully unsupervised: it relies on a small number of parameters that permit flexible modeling, and on a mechanism that automatically learns those parameters from the data. Through experimentation, we show that this design achieves top-tier performance in both phonemic and orthographic word segmentation.
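To make the idea of a super-additive length penalty concrete, here is a minimal sketch. It is not the paper's method: the abstract gives neither the penalty's functional form nor the learning procedure. The sketch assumes a fixed toy lexicon of word log-probabilities, a power penalty `lam * len(word) ** alpha` with `alpha > 1` (super-additive, because one long word costs more than its parts combined), and a standard dynamic-programming search. The names `segment`, `lam`, `alpha`, and `max_len` are hypothetical.

```python
import math


def segment(text, word_logprob, lam=0.1, alpha=2.0, max_len=8):
    """Return the segmentation maximizing log-likelihood minus a length penalty.

    Assumed penalty (illustrative only): lam * len(word) ** alpha.
    With alpha > 1 the penalty is super-additive in word length.
    Returns None if the text cannot be covered by lexicon words.
    """
    n = len(text)
    best = [0.0] + [-math.inf] * n   # best[i]: best score for text[:i]
    back = [0] * (n + 1)             # back[i]: start index of the last word
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            if word not in word_logprob:
                continue
            penalty = lam * len(word) ** alpha
            score = best[start] + word_logprob[word] - penalty
            if score > best[end]:
                best[end] = score
                back[end] = start
    if best[n] == -math.inf:
        return None
    # Follow the back-pointers from the end of the text to recover the words.
    words = []
    end = n
    while end > 0:
        start = back[end]
        words.append(text[start:end])
        end = start
    return words[::-1]


# Toy lexicon (made-up probabilities, for illustration only).
lexicon = {
    "the": math.log(0.3),
    "dog": math.log(0.3),
    "thedog": math.log(0.2),
}

unpenalized = segment("thedog", lexicon, lam=0.0)
penalized = segment("thedog", lexicon, lam=0.1, alpha=2.0)
```

In this toy setting the unpenalized likelihood prefers the single long word `thedog`, while the penalized score prefers `the` followed by `dog`. That is the kind of bias against long word formation the abstract describes. Learning the lexicon incrementally and choosing parameters by model selection are outside the scope of this sketch.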
