William Bryce


2016

pdf bib
Memory-Bounded Left-Corner Unsupervised Grammar Induction on Child-Directed Input
Cory Shain | William Bryce | Lifeng Jin | Victoria Krakovna | Finale Doshi-Velez | Timothy Miller | William Schuler | Lane Schwartz
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

This paper presents a new memory-bounded left-corner parsing model for unsupervised raw-text syntax induction, using unsupervised hierarchical hidden Markov models (UHHMM). We deploy this algorithm to shed light on the extent to which human language learners can discover hierarchical syntax through distributional statistics alone, by modeling two widely-accepted features of human language acquisition and sentence processing that have not been simultaneously modeled by any existing grammar induction algorithm: (1) a left-corner parsing strategy and (2) limited working memory capacity. To model realistic input to human language learners, we evaluate our system on a corpus of child-directed speech rather than typical newswire corpora. Results beat or closely match those of three competing systems.