Advanced Formal Language Theory, Spring 2023
ETH Zürich: Course catalog
Course Description
This course serves as an introduction to weighted formal language theory. The lectures cover the theory and algorithms used to manipulate and compute with weighted automata and grammars in detail. The emphasis is on rigor and depth rather than broad coverage. To motivate the theory, the course will also cover various applications of formal language theory to modern-day machine learning. Students should expect a healthy dose of proof-writing and, thus, mathematical maturity is expected. In terms of background, the class will draw on techniques from discrete math, analysis, and linear algebra. While there are no hard prerequisites, having taken a class that covers basic graph algorithms will be helpful as well as familiarity with basic real analysis and linear algebra.
The course is structured around a companion software library, called rayuela
.
Most of the homework exercises, which comprise (a part of) the course grade, will involve implementing bits of the theory discussed in class while providing additional analysis or devising algorithms not discussed in class with the tools introduced during the lecture.
The homework will be released throughout the semester in assignments with 3–4 questions but will be submitted jointly on two occasions.
News
3.1. Class website is online!
9.3. The iPad class notes have been posted. The same link will contain updated notes for the first part of the course throughout the semester.
Syllabus and Schedule
On the Use of Class Time
Lectures
There are two lecture slots for AFLT each week: Wednesdays 16-18 (HG D 5.2) and Thursdays 12-14 (ML F 39). Both lectures will be given in person and live broadcast on Zoom; the password is available on the course Moodle page.
Lectures will be recorded—links to the Zoom recordings will be posted on the course Moodle page.
We also regularly publish Ryan’s iPad class notes.
Discussion Sections
There will be no organized tutorial sessions for AFLT. However, the teaching staff will be available for questions on the course chat channels (see below) and, in case you have a question that requires a more in-depth discussion, you can schedule office hours with the teaching team by reaching out to us via the chat.
Syllabus
Date | Module | Topic | Notes | Lecture Material | Additional Reading |
---|---|---|---|---|---|
22. 2. 2023 | Introduction and Overview | Introductory Slides | |||
22. 2. 2023 | Regular Languages | Introduction and Overview, Weighted Formal Languages and Semiring Theory | Regular Languages | ||
23. 2. 2023 | Regular Languages, Weighted Finite-State Acceptors | ||||
1. 3. 2023 | Closed Semirings, The Algebraic Path Problem | ||||
2. 3. 2023 | The Algebraic Path Problem (cont.) | ||||
8. 3. 2023 | Homomorphisms, Weighted Finite-State Transducers | ||||
9. 3. 2023 | Nullary Removal, Determinization of Weighted Finite-state Automata | ||||
15. 3. 2023 | Determinization of Weighted Finite-state Automata (cont.) | ||||
16. 3. 2023 | Weight Pushing | ||||
22. 3. 2023 | Minimization of Weighted Finite-state Automata | ||||
23. 3. 2023 | Superior Semirings, Dijkstra's Algorithm, Johnson's Algorithm | ||||
29. 3. 2023 | Weighted Regular Expressions | ||||
30. 3. 2023 | Context-Free Languages | Weighted Context-Free Grammars | Context-free Languages | ||
5. 4. 2023 | Newton's Algorithm and Nullary Removal | ||||
6. 4. 2023 | Unary Removal, Binarization, Chomsky Normal Form, CKY | ||||
Easter Break | |||||
19. 4. 2023 | Context-Free Languages | Earley's Algorithm | |||
20. 4. 2023 | Weighted Pushdown Automata | ||||
26. 4. 2023 | Newton's Algorithm for PDAs, Nullary Removal, Unary Removal | ||||
27. 4. 2023 | Dependency Languages | Chomsky Normal Form for PDAs, Lang's Algorithm | |||
3. 5. 2023 | Bilexical Grammars and Dependency Parsing (cont.) | ||||
4. 5. 2023 | Regular Tree Languages | Weighted Tree Automata | Regular Tree Languages | ||
10. 5. 2023 | Determinization of Weighted Tree Automata | ||||
11. 5. 2023 | Determinization of Weighted Tree Automata (cont.), Weight Pushing | ||||
17. 5. 2023 | Minimization of weighted Tree Automata | ||||
18. 5. 2023 | Tree-Adjoining Languages | Weighted Tree-adjoining Grammars, Lang Normal Form | |||
24. 5. 2023 | Other Formalisms for Tree-adjoining Languages | ||||
25. 5. 2023 | Parsing Tree-adjoining Languages | ||||
31. 5. 2023 | The Weir Hierarchy | ||||
1. 6. 2023 | Context-Free Tree Languages | Context-free Tree Languages, Context-free Tree Grammars |
Organisation
Live Chat
In addition to class time, there will also be a RocketChat
-based live chat hosted on ETH’s servers.
Students are free to ask questions of the teaching team and of others in public or private (direct message).
There are specific channels for each of the two assignments as well as for reporting errata in the course notes and slides.
All data from the chat will be deleted from ETH servers at the course’s conclusion.
Important: There are a few important points you should keep in mind about the course live chat:
RocketChat
will be the main communications hub for the course. You are responsible for receiving all messages broadcast in theRocketChat
.- Your username should be
firstname.lastname
. This is required as we will only allow enrolled students to participate in the chat and we will remove users which we cannot validate. - Tag your questions as described in the document on How to use Rycolab Course RocketChat channels. The document also contains other general remarks about the use of
RocketChat
. - Search for answers in the appropriate channels before posting a new question.
- Ask questions on public channels as much as possible.
- Answer to posts in threads.
- The chat supports
LaTeX
for easier discussion of technical material. See How to useLaTeX
inRocketChat
. - We highly recommend you download the desktop app here.
This is the link to the main channel.
To make the moderation of the chat more easily manageable, we have created a number of other channels on RocketChat
.
The full list is:
- AFLT General Channel for the general organisational discussions.
- AFLT Announcements Channel for the announcements by the teaching team.
- AFLT Content Questions for your questions about the content of the course.
Important: Please prepend your question with a “tag” about the content of your question in square brackets.
For example, if your question is about the content of Lecture 2 and specifically about the definition of a semiring, please start your message with
[Lecture #1, Definition of a Semiring]
. - AFLT Errata for reporting typos and errors in the course lecture notes and the slides.
- AFLT Assignment 1 for discussing and asking questions about Assignment 1.
- AFLT Assignment 2 for discussing and asking questions about Assignment 2.
- AFLT Assignment 3 for discussing and asking questions about Assignment 3.
- AFLT Assignment 4 for discussing and asking questions about Assignment 4.
- AFLT Assignment 5 for discussing and asking questions about Assignment 5.
- AFLT Assignment 6 for discussing and asking questions about Assignment 6.
- AFLT Assignment 7 for discussing and asking questions about Assignment 7.
- AFLT Assignment 8 for discussing and asking questions about Assignment 8.
- AFLT Assignment 9 for discussing and asking questions about Assignment 9.
- AFLT Assignment 10 for discussing and asking questions about Assignment 10.
- Find Assignment Partners for finding teammates for the course assignments.
If you feel like you would benefit from any other channel, feel free to suggest it to the teaching team!
Course Notes
We prepared very detailed course notes last year.
We will be improving them throughout the semester as we go!
The individual chapters will be published in the course syllabus below and updated throughout the semester.
Please report all errata to the teaching staff; we really want to polish the notes this semester so any feedback, no matter how small, would be very appreciated—we created an errata channel in RocketChat
.
Other useful literature:
Grading
There will be no final exam for this course.
Instead, we will release 10 course assignments throughout the semester and you can complete an optional course project. You can then obtain your final grade in two ways:
1. You can obtain your final grade from assignments only, in which case we will calculate your final grade based on the 8 highest-scoring assignments you turn in (you can therefore turn in just 8 of the assignments, or optionally fewer if you are not aiming for the highest grade).
In that case, the final grade will be calculated as follows:
- 100 % Assignments*
2. Alternatively, you can obtain your final grade from assignments and a course project. In that case, you only have to turn in 4 assignments and the course project to be able to achieve the highest grade (your 4 highest-scoring assignments will count towards your final grade). See below for more information on the course project.
In this case, your final grade will be calculated as follows:
- 50 % Assignments* and
- 50 % Course Project
You can cooperate on and discuss the assignments with your peers, but you must write your own code and write up the solutions yourself.
*Important: In either of the two options, there is a specific condition we impose on the assignments you turn in: half of your assignment have to come from the first “batch” of the assignments (Assignments 1—5), and half of your assignments have to come from the second “batch” of the assignments (Assignments 6—10). This means that, if you are turning in 4 assignments, you have to turn in 2 assignments from the first batch and 2 assignments from the second batch, and if you are submitting 8 assignments, you have to turn in 4 assignments from the first batch and 4 assignments from the second batch. This is to ensure you have a more holistic understanding of the entire course material.
We require the solutions to be properly typeset.
We recommend using LaTeX
(with Overleaf
), but markdown
files with MathJax
for the mathematical expressions are also fine.
Assignments
Below you can find the assignment instructions. Keep in mind that we published last year’s assignments to give you an idea of the difficulty level and the type of questions we will be asking. The assignments can change slightly as we update them this year, however, the questions themselves will not change—we will only update the assignments to make them more clear and to fix any typos.
Assignment instructions:
- Assignment 1
- Assignment 2
- Assignment 3
- Assignment 4
- Assignment 5
- Assignment 6
- Assignment 7
- Assignment 8
- Assignment 9
- Assignment 10
Important: The deadline for the projects and all assignments is shortly before the end of the Summer examination period—15. 8. 2023. The submissions will be done through the course Moodle page. Keep in mind that due to how late this deadline is, we cannot extend it—we are counting on you to be organised and submit the work in time.
Course Project
The course project is optional, but it is a great way to get a deeper understanding of the course material and to apply it to a real-world problem. We expect you to complete the project in a group of 3-5 students—to make supervising the projects manageable, we will not accept groups with fewer than 3 students.
You can find more information on the course project and the detailed instructions for it in the Course project instructions. To give early feedback, a proposal is due midway through the course (15.4.2023) and a progress report towards the end of the semester (15.5.2023). See the course Moodle page for the submission links.
The teaching staff has compiled a list of recent papers, listed below, whose replication would make for a great project, but students should feel free to come up with other ideas as well.
Contact
You can ask questions on the course chat server. Please post all content-related questions there, so others can see them and join the discussion. If you have questions which are not related to the course material and are not of general interest, please don’t hesitate to contact us directly, i.e., email Ryan with the TAs cc-ed.