Advanced Formal Language Theory, Spring 2022

ETH Zürich: Course catalog

Course Description

This course serves as an introduction to weighted formal language theory. The lectures cover in detail the theory and algorithms used to manipulate and compute with weighted automata and grammars. The emphasis is on rigor and depth rather than broad coverage. To motivate the theory, the course will also cover various applications of formal language theory to modern-day machine learning. Students should expect a healthy dose of proof-writing, so mathematical maturity is required. In terms of background, the class will draw on techniques from discrete math, analysis, and linear algebra. While there are no hard prerequisites, a class covering basic graph algorithms will be helpful, as will familiarity with basic real analysis and linear algebra.


The course is structured around a companion software library, called rayuela. Most of the homework exercises, which comprise 50% of the course grade, will involve implementing bits of the theory discussed in class, providing additional analysis, or devising new algorithms with the tools introduced in lecture. The homework will be released every week in batches of 3–4 questions, but will be submitted jointly on two occasions.
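To give a feel for the kind of object the homeworks manipulate, here is a minimal sketch of a weighted finite-state acceptor over the real (probability) semiring, where the weight of a string is the sum over accepting paths of the product of arc weights. This is a hypothetical illustration only; it does not show rayuela's actual API.

```python
from collections import defaultdict

class WFSA:
    """A toy weighted finite-state acceptor over the real semiring.

    NOTE: a hypothetical sketch for illustration -- not rayuela's interface.
    """

    def __init__(self):
        # transitions[state][symbol] -> list of (next_state, weight)
        self.transitions = defaultdict(lambda: defaultdict(list))
        self.initial = {}  # state -> initial weight
        self.final = {}    # state -> final weight

    def add_arc(self, q, a, r, w):
        self.transitions[q][a].append((r, w))

    def accept_weight(self, string):
        """Forward algorithm: sum over all paths of the product of weights."""
        # beta[q] = total weight of all prefix paths ending in state q
        beta = dict(self.initial)
        for a in string:
            new_beta = defaultdict(float)
            for q, w in beta.items():
                for r, v in self.transitions[q][a]:
                    new_beta[r] += w * v  # semiring: + is sum, x is product
            beta = new_beta
        return sum(w * self.final.get(q, 0.0) for q, w in beta.items())

# A two-state acceptor over the alphabet {a, b}: loop on "a", then one "b".
fsa = WFSA()
fsa.initial = {0: 1.0}
fsa.final = {1: 1.0}
fsa.add_arc(0, "a", 0, 0.5)
fsa.add_arc(0, "b", 1, 0.5)
fsa.add_arc(1, "b", 1, 1.0)

print(fsa.accept_weight("aab"))  # 0.5 * 0.5 * 0.5 = 0.125
```

Swapping the sum and product for other semiring operations (e.g., min and + in the tropical semiring) turns the same recursion into a shortest-path computation, which is one of the recurring themes of the course.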

The remaining 50% of the course grade will be determined by a final project of the student's choosing. The teaching staff has compiled a list of recent papers, listed below, whose replication would make for a great project, but students should feel free to come up with other ideas as well. To allow for early feedback, a proposal is due midway through the course.

In short, the grade for the course will be determined by the following formula:
* 50% Homeworks
* 50% Course Project
* No exam!


Lectures: Thu 12-14 (ML F 39). The lectures will be given in person. This recurring Zoom meeting will be used throughout the semester for people who want to tune in remotely. The password can be found on Moodle. Recordings can be found on the password-protected Polybox. The password is the same as the one for the Zoom meeting.

Discussion Sections: Thu 14-15 (ML F 38). Discussion sections this semester will mostly be in the form of answering questions you might have about the homeworks and the course content. They will either be in person or via Zoom (same link as lecture), depending on the preference of the teaching team.

Course Live Chat: To provide an easier way to communicate with the teaching team and your colleagues, we have set up a chat server here. It is hosted on ETH servers and will be the main communication hub for the course! We encourage you to sign up and participate in the discussions there.

Course notes will be updated throughout the semester.
A selection of related work is provided for individual lectures.


13.02   Class website is online!
15.02   A selection of possible project papers has been published.
08.03   The Course Project Guidelines have been published.


Week Date   Topic Notes   Lecture Material Additional Reading Assignment Released Deadlines Exercises Discussion Section
1 24.02.22 Weighted Finite-State Acceptors Chapter 1 Introductory Slides Kuich (2003)
Mohri (2009; §6.2)
Allauzen, Mohri, Roark (2003)
2 03.03.22 The Algebraic Path Problem Chapter 2 Conway (1971)
Lehmann (1977)
Mohri (2009; §3)
Cormen et al. (2009; §22.3)
Assignment 1
3 10.03.22 The Algebraic Path Problem (continued), Finite-State Transducers Chapter 3 Mohri (1997)
Mohri et al. (2002)
Assignment 2
4 17.03.22 Determinization of Finite-State Machines Allauzen and Mohri (2003)
Assignment 3
5 24.03.22 Weighted Determinization, Weight Pushing Chapter 4 Mohri (2009; §7.2)
Assignment 4
6 31.03.22 Weighted Minimization Revuz (1992)
Watson (1994)
Moore (1958)
Hopcroft (1971)
Choffrut (2003)
Mohri (2009; §7.3)
Eisner (2003)
Baum et al. (1970)
Rabiner (1989)
Lafferty et al. (2001)
Wallach (2004)
Sarawagi and Cohen (2004)
Assignment 5
7 07.04.22 Superior Semirings and Regular Expressions Chapter 5 Wikipedia
Dijkstra (1959)
Huang (2008)
Johnson (1977)
Goodman (1999)
Huang and Chiang (2005)
Eisner (2002)
Li and Eisner (2009)
Eisner (2016)
8 14.04.22 Context-Free Grammars Chapter 6 Nederhof and Satta (2008)
9 21.04.22 No class (Week after Easter)
10 28.04.22 Grammar Transformations Chapter 7 Eisner and Blatz (2006; §6.3)
Stolcke (1995; §4.7)
Eisner and Blatz (2006; §6.5)
Rosenkrantz and Lewis (1970)
Johnson (1996)
Manning and Carpenter (1997)
Johnson and Roark (2000)
Schuler (2009)
11 05.05.22 Context-free parsing Chapter 8 Kasami (1966)
Younger (1967)
Cocke and Schwartz (1970)
Knuth (1977)
Bar-Hillel et al. (1961)
Cohen et al. (2011)
Goldberg and Elhadad (2011)
Earley (1970)
Stolcke (1995)
Assignment 6 Assignments 1-2 (10% of course grade)
12 12.05.22 Bilexical Grammars and Dependency Parsing Kübler et al. (2009)
Gaifman (1965)
Kuhlmann (2011)
Collins (1996)
Collins (1997)
Collins (1999)
Bikel (2004)
Eisner and Satta (1999)
Eisner (2000)
Assignment 7
13 19.05.22 Weighted Pushdown Automata Abney et al. (1999)
Lang (1974)
Assignment 8 Assignments 3-5 (15% of course grade) [19.05] Course Project Proposal [21.05]
14 26.05.22 Manipulating WPDAs Nivre (2004)
Nivre (2008)
Greibach (1965)
Moore (2000)
Nederhof and Satta (2004)
Assignment 9
15 02.06.22 Weighted Tree Automata Chapter 12 Kozen (1997)
Comon et al. (2008)
Engelfriet (1975)
Thatcher (1967)
Büchse et al. (2010)
Maletti and Quernheim (2011)
Knight and Graehl (2005)
Maletti (2008)
Hanneforth, Maletti, Quernheim (2018)
Kallmeyer (2010)
Kuhlmann, Satta (2014)
Assignment 10
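As a taste of the material in week 2, the algebraic path problem can be solved with Lehmann's (1977) generalization of Floyd–Warshall to closed semirings. The sketch below (an illustration, not course-provided code) instantiates it in the tropical semiring (min, +), where the closure x* of any nonnegative weight is the semiring one, 0, and the algorithm reduces to all-pairs shortest path.

```python
INF = float("inf")

def lehmann(W, plus=min, times=lambda x, y: x + y, star=lambda x: 0):
    """Lehmann's algebraic path algorithm over a closed semiring.

    W[i][j] is the weight of edge i -> j (INF = no edge in the tropical
    semiring). Returns D with D[i][j] = the plus-sum, over all nonempty
    paths from i to j, of the times-product of edge weights.

    The defaults instantiate the tropical semiring: plus = min,
    times = +, and x* = 0 (the semiring one) for x >= 0.
    """
    n = len(W)
    D = [row[:] for row in W]
    for k in range(n):
        skk = star(D[k][k])  # closure of the self-loop weight at pivot k
        for i in range(n):
            for j in range(n):
                # relax i -> j through paths that pivot at k
                D[i][j] = plus(D[i][j], times(times(D[i][k], skk), D[k][j]))
    return D

# A 3-node graph: 0 -(1)-> 1 -(2)-> 2, plus a direct 0 -(5)-> 2 edge.
W = [[INF, 1, 5],
     [INF, INF, 2],
     [INF, INF, INF]]
D = lehmann(W)
print(D[0][2])  # 3: the path 0 -> 1 -> 2 beats the direct edge of weight 5
```

Passing different plus/times/star operations makes the same triple loop compute, e.g., path counts or regular expressions; the full closure W* additionally plus-adds the semiring one on the diagonal (the empty path).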

Course Project

You can find more information on the course project and the detailed instructions for it here.

Course Project Ideas

The teaching staff has compiled a list of recent papers, whose replication would make for a great project. However, you should feel free to come up with other ideas as well.

Here is the list we have compiled:

Paper ID Paper
1 Learning Context-free Languages with Nondeterministic Stack RNNs, DuSell, B., Chiang, D. (2020)
2 Learning Hierarchical Structures with Differentiable Nondeterministic Stacks, DuSell, B., Chiang, D. (2020)
3 Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks, Shen et al. (2019)
4 Consistent Unsupervised Estimators for Anchored PCFGs, Clark, A., Fijalkow, N. (2020)
5 Factor Graph Grammars, Chiang, D., Riley, D. (2020)
6 Composing Finite State Transducers on GPUs, Argueta, A., Chiang, D. (2018)
7 Quasi-Synchronous Grammars: Alignment by Soft Projection of Syntactic Dependencies, Smith, D., Eisner, J. (2006)
8 Efficient Normal-Form Parsing for Combinatory Categorial Grammar, Eisner, J. (1996)
9 A Faster Parsing Algorithm for Lexicalized Tree-Adjoining Grammars, Eisner, J., Satta, G. (2000)
10 Dependency Parsing by Belief Propagation, Smith, D., Eisner, J. (2008)
11 A New Parsing Algorithm for Combinatory Categorial Grammar, Kuhlmann, M., Satta, G. (2014)
12 The Equivalence Of Four Extensions Of Context-Free Grammars, Vijay-Shanker, K., Weir, D. (1995)
13 A geometric hierarchy beyond context-free languages, Weir, D. (1992)
14 Rational Recurrences, Peng, H., Schwartz, R., Thomson, S., Smith, N. A. (2018)
15 SoPa: Bridging CNNs, RNNs, and Weighted Finite-State Machines, Schwartz, R., Thomson, S., Smith, N. A. (2018)
16 On the Computational Power of Neural Nets, Siegelmann, H. T., Sontag, E. D. (1995)
17 A General-Purpose Algorithm for Constrained Sequential Inference, Deutsch, D., Upadhyay, S., Roth, D. (2019)
18 Theoretical Limitations of Self-Attention in Neural Sequence Models, Hahn, M. (2020)
19 Overcoming a Theoretical Limitation of Self-Attention, Chiang, D., Cholak, P. (2022)
20 RNNs can generate bounded hierarchical languages with optimal memory, Hewitt, J., Hahn, M., Ganguli, S., Liang, P., Manning, C. D. (2020)


You can ask questions on Moodle or on the course chat server. Please post questions there so that others can see them and join the discussion. If you have questions that are not of general interest, please don't hesitate to contact us directly, i.e., email Ryan with the TAs cc'd.