We are a collocation of collaborators working on a diverse range of topics in computational linguistics, natural language processing and machine learning.
Credits to Afra for the lab logo and to Tim for the logo idea.
ETH Zürich Fall 2024 This course presents topics in natural language processing with an emphasis on modern techniques, primarily focusing on statistical and deep learning approaches. The course provides an overview of the primary areas of research in language processing as well as a detailed exploration of the models and techniques used both in research and in commercial natural language systems.
ETH Zürich Fall 2024 This Bachelor’s seminar delves into the fascinating world of modern large language models (LLMs), which have revolutionized natural language processing. As these models continue to evolve and impact various domains, we will explore their potential, limitations, and underlying mechanisms through a theoretical lens. Throughout the seminar, we will address the following key questions: What are the real capabilities of large language models? What are their inherent limitations? How do these models function at a fundamental level? Under what circumstances are they likely to fail? Can we develop a comprehensive “science of LLMs” to address these inquiries? We will leverage formal language theory to provide a rigorous framework for understanding the representational capacity of neural language models.
ETH Zürich Spring 2024 In recent years, NLP has become a part of our daily lives. Many of us use tools like Google Translate to understand sentences in languages we don’t know, and chatbots like ChatGPT to help draft essays and answer basic questions. However, even though most people recognize the utility of such tools, there are still many questions to be answered about their reliability and their impact on society. For example, to what extent can we or should we trust what ChatGPT says? Should chatbots ever be used in legal decision-making? What is the role that NLP should play in the education system? In this open-ended seminar, we will read and discuss opinions on the proper use of NLP in the real world, or as we term it, NLP in the wild!
ETH Zürich Spring 2024 Large language models have become one of the most commonly deployed NLP inventions. In the past half-decade, their integration into core natural language processing tools has dramatically increased the performance of such tools, and they have entered the public discourse surrounding artificial intelligence. In this course, we start with the probabilistic foundations of language models, i.e., covering what constitutes a language model from a formal, theoretical perspective. We then discuss how to construct and curate training corpora, and introduce many of the neural-network architectures often used to instantiate language models at scale. The course covers aspects of systems programming, discussion of privacy and harms, as well as applications of language models in NLP and beyond.
ETH Zürich Spring 2024 This course serves as an introduction to various advanced topics in formal language theory. The primary focus of the course is on weighted formalisms, which can easily be applied in machine learning. Topics include finite-state machines as well as the algorithms that are commonly used for their manipulation. We will also cover weighted context-free grammars, weighted pushdown automata, weighted tree automata, and weighted mildly context-sensitive formalisms.
ETH Zürich Spring 2024 This graduate class, partly taught like a seminar, is designed to help you understand the philosophical underpinnings of modern work in natural language processing (NLP), most of which is centered around statistical machine learning applied to natural language data.
ACL (Toronto) July 2023 In this tutorial, we will provide a centralized and cohesive discussion of critical considerations when choosing how to generate text from a language model. We will cover a wide range of empirically observed problems (such as degradation, hallucination, and repetition) and the corresponding algorithmic solutions proposed in recent research (such as top-p sampling and its successors). We will then cover methods in controlled generation that go beyond merely ensuring coherence to guarantee that text exhibits specific desired properties.
ESSLLI (Ljubljana, Slovenia) Spring 2023
If you are a BSc or MSc student at ETH Zurich interested in writing your thesis with us, we would be delighted to hear from you! Unfortunately, we do not have the capacity to consider students from outside ETH for thesis projects. To obtain a better understanding of what currently interests us, we invite you to check our most recent publications. However, feel free to express interest in any topic you think our group might be well suited to advise you on.
Please send an email to ryan.cotterell@inf.ethz.ch with CC to afra.amini@inf.ethz.ch, anej.svete@inf.ethz.ch, and niklas.stoehr@inf.ethz.ch. State either [bachelor’s thesis] or [master’s thesis] at the start of the subject. For us to get to know you a little, please write a paragraph introducing your interests and attach your CV as well as your transcript of grades. It helps us a lot with finding a matching project if you are able to state more concrete topics that you are interested in. We look forward to receiving your inquiry!
Thank you very much for your interest in joining our group – unfortunately, we are not accepting PhD students anymore!
If you are interested in working with us as a Master’s student, please see here. If co-advising is an option you would like to pursue, note that Ryan has previously co-advised Master’s students on NLP topics with Mrinmaya Sachan and others.