Information Theory in Linguistics: Methods and Applications

ESSLLI 2021: Week 2 (August 2-6)

Course Description

Since Shannon proposed his mathematical theory of communication in the mid-20th century, information theory has been an important way of viewing and investigating problems at the interfaces between linguistics, cognitive science, and computation. With the surge in applications of machine learning to linguistic questions, information-theoretic methods are becoming an ever more important tool in the linguist’s toolbox. The course emphasizes interdisciplinary connections between linguistics and natural language processing. We plan to do this by first establishing a firm mathematical basis and then showing that it can be fruitfully applied to several areas of linguistics, including semantics, typology, morphology, phonotactics, and the interface between cognitive science and linguistics.


Syllabus

Lecture 1: Introduction and Overview (Slides)
Lecture 2: Estimating Information-Theoretic Quantities (Slides, iPython Notebook)
Lecture 3: Case Studies in Complexity (Slides)
Lecture 4: Case Studies in Correlation (Slides)
Lecture 5: Case Studies in Communication (Slides)


Literature

Information Theory Background

Thomas M. Cover and Joy A. Thomas. Elements of Information Theory. 2006. Wiley-Interscience, USA.

Statistics Background

Peter J. Bickel and Kjell A. Doksum. Mathematical Statistics. 2001. Prentice Hall, USA.

Recent Papers (by topic)

Entropy Estimation

Sebastian Nowozin. Estimating Discrete Entropy Part 1.
Sebastian Nowozin. Estimating Discrete Entropy Part 2.
Sebastian Nowozin. Estimating Discrete Entropy Part 3.
Samuel Zahl. Jackknifing An Index of Diversity.
David H. Wolpert and David R. Wolf. Estimating functions of probability distributions from a finite set of samples.
Marcus Hutter. Distribution of Mutual Information.
Ilya Nemenman, F. Shafee, and William Bialek. Entropy and Inference, Revisited.
Liam Paninski. Estimation of Entropy and Mutual Information.
Evan Archer, Il Memming Park, and Jonathan W. Pillow. Bayesian Entropy Estimation for Countable Discrete Distributions.

Arbitrariness of the Sign

Tiago Pimentel, Arya D. McCarthy, Damián Blasi, Brian Roark, and Ryan Cotterell. Meaning to Form: Measuring Systematicity as Information.
Tiago Pimentel, Brian Roark, Søren Wichmann, Ryan Cotterell, and Damián Blasi. Finding Concept-specific Biases in Form–Meaning Associations.

Morphology

Adina Williams, Tiago Pimentel, Hagen Blix, Arya D. McCarthy, Eleanor Chodroff, and Ryan Cotterell. Predicting Declension Class from Form and Meaning.
Adina Williams, Damián Blasi, Lawrence Wolf-Sonkin, Hanna Wallach, and Ryan Cotterell. Quantifying the Semantic Core of Gender Systems.
Arya D. McCarthy, Adina Williams, Shijia Liu, David Yarowsky, and Ryan Cotterell. Measuring the Similarity of Grammatical Gender Systems by Comparing Partitions.
Shijie Wu, Ryan Cotterell, and Timothy O'Donnell. Morphological Irregularity Correlates with Frequency.
Ryan Cotterell, Christo Kirov, Mans Hulden, and Jason Eisner. On the Complexity and Typology of Inflectional Morphological Systems.

Human Language Processing

Adam Goodkind and Klinton Bicknell. Predictive power of word surprisal for reading times is a linear function of language model quality.
Christoph Aurnhammer and Stefan L. Frank. Evaluating information-theoretic measures of word prediction in naturalistic sentence reading.
Jason Wei, Clara Meister, and Ryan Cotterell. A Cognitive Regularizer for Language Modeling.
Tatsuki Kuribayashi, Yohei Oseki, Takumi Ito, Ryo Yoshida, Masayuki Asahara, and Kentaro Inui. Lower Perplexity is Not Always Human-Like.
Danny Merkx and Stefan L. Frank. Human Sentence Processing: Recurrence or Attention?
Richard Futrell, Edward Gibson, and Roger P. Levy. Lossy-context surprisal: An information-theoretic model of memory effects in sentence processing.

Lexicon

G. K. Zipf. The Psycho-biology of Language.
Steven T. Piantadosi, Harry Tily, and Edward Gibson. Word lengths are optimized for efficient communication.
Kyle Mahowald, Evelina Fedorenko, Steven T. Piantadosi, and Edward Gibson. Info/information theory: speakers choose shorter words in predictive contexts.
Christian Bentz, Dimitrios Alikaniotis, Michael Cysouw, and Ramon Ferrer-i-Cancho. The Entropy of Words—Learnability and Expressivity across More than 1000 Languages.
Tiago Pimentel, Irene Nikkarinen, Kyle Mahowald, Ryan Cotterell, and Damián Blasi. How (Non-)Optimal is the Lexicon?
Tiago Pimentel, Ryan Cotterell, and Brian Roark. Disambiguatory Signals are Stronger in Word-initial Positions.
Tiago Pimentel, Rowan Hall Maudslay, Damián Blasi, and Ryan Cotterell. Speakers Fill Lexical Semantic Gaps with Context.

Language Generation

Clara Meister, Tim Vieira, and Ryan Cotterell. If Beam Search is the Answer, What was the Question?
Clara Meister and Ryan Cotterell. Language Model Evaluation Beyond Perplexity.

Parsing

Michael D. Resnik. Mathematics as a Science of Patterns.
Richard Futrell, Peng Qian, Edward Gibson, Evelina Fedorenko, and Idan Blank. Syntactic dependencies correspond to word pairs with high mutual information.

Color Systems

Noga Zaslavsky, Charles Kemp, Terry Regier, and Naftali Tishby. Efficient compression in color naming and its evolution.
Edward Gibson, Richard Futrell, Julian Jara-Ettinger, Kyle Mahowald, Leon Bergen, Sivalogeswaran Ratnasingam, Mitchell Gibson, Steven T. Piantadosi, and Bevil R. Conway. Color naming across languages reflects color use.
Rahma Chaabouni, Eugene Kharitonov, Emmanuel Dupoux, and Marco Baroni. Communicating artificial neural networks develop efficient color-naming systems.

Interpretability of Neural Networks

Tiago Pimentel, Josef Valvoda, Rowan Hall Maudslay, Ran Zmigrod, Adina Williams, and Ryan Cotterell. Information-Theoretic Probing for Linguistic Structure.
Elena Voita and Ivan Titov. Information-Theoretic Probing with Minimum Description Length.