# Advanced Formal Language Theory, Spring 2023

ETH Zürich: Course catalog

## Course Description

This course serves as an introduction to weighted formal language theory. The lectures cover the theory and algorithms used to manipulate and compute with weighted automata and grammars in detail. The emphasis is on rigor and depth rather than broad coverage. To motivate the theory, the course will also cover various applications of formal language theory to modern-day machine learning. Students should expect a healthy dose of proof-writing and, thus, mathematical maturity is expected. In terms of background, the class will draw on techniques from discrete math, analysis, and linear algebra. While there are no hard prerequisites, having taken a class that covers basic graph algorithms will be helpful as well as familiarity with basic real analysis and linear algebra.

The course is structured around a companion software library, called `rayuela`

.
Most of the homework exercises, which comprise (a part of) the course grade, will involve implementing bits of the theory discussed in class while providing additional analysis or devising algorithms not discussed in class with the tools introduced during the lecture.
The homework will be released throughout the semester in assignments with 3–4 questions but will be submitted jointly on two occasions.

## News

**3.1.** Class website is online!

**9.3.** The iPad class notes have been posted. The same link will contain updated notes for the first part of the course throughout the semester.

## Syllabus and Schedule

### On the Use of Class Time

#### Lectures

There are two lecture slots for AFLT each week: Wednesdays 16-18 (HG D 5.2) and Thursdays 12-14 (ML F 39). Both lectures will be given in person and live broadcast on Zoom; the password is available on the course Moodle page.

Lectures will be recorded—links to the Zoom recordings will be posted on the course Moodle page.

We also regularly publish Ryan’s iPad class notes.

#### Discussion Sections

There will be no organized tutorial sessions for AFLT. However, the teaching staff will be available for questions on the course chat channels (see below) and, in case you have a question that requires a more in-depth discussion, you can schedule **office hours** with the teaching team by reaching out to us via the chat.

### Syllabus

Date | Module | Topic | Notes | Lecture Material | Additional Reading |
---|---|---|---|---|---|

22. 2. 2023 | Introduction and Overview | Introductory Slides | |||

22. 2. 2023 | Regular Languages |
Introduction and Overview, Weighted Formal Languages and Semiring Theory | Regular Languages | ||

23. 2. 2023 | Regular Languages, Weighted Finite-State Acceptors | ||||

1. 3. 2023 | Closed Semirings, The Algebraic Path Problem | ||||

2. 3. 2023 | The Algebraic Path Problem (cont.) | ||||

8. 3. 2023 | Homomorphisms, Weighted Finite-State Transducers | ||||

9. 3. 2023 | Nullary Removal, Determinization of Weighted Finite-state Automata | ||||

15. 3. 2023 | Determinization of Weighted Finite-state Automata (cont.) | ||||

16. 3. 2023 | Weight Pushing | ||||

22. 3. 2023 | Minimization of Weighted Finite-state Automata | ||||

23. 3. 2023 | Superior Semirings, Dijkstra's Algorithm, Johnson's Algorithm | ||||

29. 3. 2023 | Weighted Regular Expressions | ||||

30. 3. 2023 | Context-Free Languages |
Weighted Context-Free Grammars | Context-free Languages | ||

5. 4. 2023 | Newton's Algorithm and Nullary Removal | ||||

6. 4. 2023 | Unary Removal, Binarization, Chomsky Normal Form, CKY | ||||

Easter Break |
|||||

19. 4. 2023 | Context-Free Languages |
Earley's Algorithm | |||

20. 4. 2023 | Weighted Pushdown Automata | ||||

26. 4. 2023 | Newton's Algorithm for PDAs, Nullary Removal, Unary Removal | ||||

27. 4. 2023 | Dependency Languages |
Chomsky Normal Form for PDAs, Lang's Algorithm | |||

3. 5. 2023 | Bilexical Grammars and Dependency Parsing (cont.) | ||||

4. 5. 2023 | Regular Tree Languages |
Weighted Tree Automata | Regular Tree Languages | ||

10. 5. 2023 | Determinization of Weighted Tree Automata | ||||

11. 5. 2023 | Determinization of Weighted Tree Automata (cont.), Weight Pushing | ||||

17. 5. 2023 | Minimization of weighted Tree Automata | ||||

18. 5. 2023 | Tree-Adjoining Languages |
Weighted Tree-adjoining Grammars, Lang Normal Form | |||

24. 5. 2023 | Other Formalisms for Tree-adjoining Languages | ||||

25. 5. 2023 | Parsing Tree-adjoining Languages | ||||

31. 5. 2023 | The Weir Hierarchy | ||||

1. 6. 2023 | Context-Free Tree Languages |
Context-free Tree Languages, Context-free Tree Grammars |

## Organisation

### Live Chat

In addition to class time, there will also be a `RocketChat`

-based live chat hosted on ETH’s servers.
Students are free to ask questions of the teaching team and of others in public or private (direct message).
There are specific channels for each of the two assignments as well as for reporting errata in the course notes and slides.
All data from the chat will be deleted from ETH servers at the course’s conclusion.

**Important**: There are a few important points you should keep in mind about the course live chat:

`RocketChat`

will be the main communications hub for the course. You are responsible for receiving all messages broadcast in the`RocketChat`

.- Your username should be
`firstname.lastname`

. This is required as we will only allow enrolled students to participate in the chat and we will remove users which we cannot validate. **Tag**your questions as described in the document on How to use Rycolab Course RocketChat channels. The document also contains other general remarks about the use of`RocketChat`

.- Search for answers in the appropriate channels before posting a new question.
- Ask questions on public channels as much as possible.
- Answer to posts in
*threads*. - The chat supports
`LaTeX`

for easier discussion of technical material. See How to use`LaTeX`

in`RocketChat`

. - We highly recommend you download the desktop app here.

**This is the link** to the main channel.
To make the moderation of the chat more easily manageable, we have created a number of other channels on `RocketChat`

.
The full list is:

- AFLT General Channel for the general organisational discussions.
- AFLT Announcements Channel for the announcements by the teaching team.
- AFLT Content Questions for your questions about the content of the course.
**Important**: Please prepend your question with a “tag” about the content of your question in square brackets. For example, if your question is about the content of Lecture 2 and specifically about the definition of a semiring, please start your message with`[Lecture #1, Definition of a Semiring]`

. - AFLT Errata for reporting typos and errors in the course lecture notes and the slides.
- AFLT Assignment 1 for discussing and asking questions about Assignment 1.
- AFLT Assignment 2 for discussing and asking questions about Assignment 2.
- AFLT Assignment 3 for discussing and asking questions about Assignment 3.
- AFLT Assignment 4 for discussing and asking questions about Assignment 4.
- AFLT Assignment 5 for discussing and asking questions about Assignment 5.
- AFLT Assignment 6 for discussing and asking questions about Assignment 6.
- AFLT Assignment 7 for discussing and asking questions about Assignment 7.
- AFLT Assignment 8 for discussing and asking questions about Assignment 8.
- AFLT Assignment 9 for discussing and asking questions about Assignment 9.
- AFLT Assignment 10 for discussing and asking questions about Assignment 10.
- Find Assignment Partners for finding teammates for the course assignments.

If you feel like you would benefit from any other channel, feel free to suggest it to the teaching team!

### Course Notes

We prepared very detailed course notes last year.
We will be improving them throughout the semester as we go!
The individual chapters will be published in the course syllabus below and updated throughout the semester.
Please report all errata to the teaching staff; we really want to polish the notes this semester so any feedback, no matter how small, would be very appreciated—we created an errata channel in `RocketChat`

.

Other useful literature:

## Grading

There will be *no final exam* for this course.

Instead, we will release **10 course assignments** throughout the semester and you can complete an optional course project. You can then obtain your final grade in **two ways**:

**1.** You can obtain your final grade from **assignments only**, in which case we will calculate your final grade based on the **8** highest-scoring assignments you turn in (you can therefore turn in just 8 of the assignments, or optionally fewer if you are not aiming for the highest grade).

In that case, the final grade will be calculated as follows:

**100 %**Assignments*****

**2.** Alternatively, you can obtain your final grade from **assignments and a course project**. In that case, you only have to turn in

**4 assignments**and the

**course project**to be able to achieve the highest grade (your 4 highest-scoring assignments will count towards your final grade). See below for more information on the course project.

In this case, your final grade will be calculated as follows:

**50 %**Assignments*****and**50 %**Course Project

You *can* cooperate on and discuss the assignments with your peers, but you *must* write your own code and write up the solutions yourself.

***Important**: In either of the two options, there is a specific **condition** we impose on the assignments you turn in: **half** of your assignment have to come from the first “batch” of the assignments (Assignments 1—5), and **half** of your assignments have to come from the second “batch” of the assignments (Assignments 6—10).
This means that, if you are turning in 4 assignments, you have to turn in 2 assignments from the first batch and 2 assignments from the second batch, and if you are submitting 8 assignments, you have to turn in 4 assignments from the first batch and 4 assignments from the second batch.
This is to ensure you have a more holistic understanding of the entire course material.

We require the solutions to be properly typeset.
We recommend using `LaTeX`

(with `Overleaf`

), but `markdown`

files with `MathJax`

for the mathematical expressions are also fine.

### Assignments

Below you can find the assignment instructions.
Keep in mind that we published *last year’s* assignments to give you an idea of the difficulty level and the type of questions we will be asking.
The assignments can change slightly as we update them this year, however, *the questions themselves will not change*—we will only update the assignments to make them more clear and to fix any typos.

**Assignment instructions**:

- Assignment 1
- Assignment 2
- Assignment 3
- Assignment 4
- Assignment 5
- Assignment 6
- Assignment 7
- Assignment 8
- Assignment 9
- Assignment 10

**Important**: The deadline for the projects and all assignments is **shortly before the end of the Summer examination period**—**15. 8. 2023**.
The submissions will be done through the course Moodle page.
Keep in mind that due to how late this deadline is, we *cannot extend it*—we are counting on you to be organised and submit the work in time.

### Course Project

The course project is optional, but it is a great way to get a deeper understanding of the course material and to apply it to a real-world problem.
We expect you to complete the project in a group of **3-5** students—to make supervising the projects manageable, we will *not accept* groups with fewer than 3 students.

You can find more information on the course project and the detailed instructions for it in the Course project instructions.
To give early feedback, a *proposal* is due midway through the course (**15.4.2023**) and a *progress report* towards the end of the semester (**15.5.2023**).
See the course Moodle page for the submission links.

The teaching staff has compiled a list of recent papers, listed below, whose replication would make for a great project, but students should feel free to come up with other ideas as well.

## Contact

You can ask questions on the course chat server. Please post all content-related questions there, so others can see them and join the discussion. If you have questions which are not related to the course material and are not of general interest, please don’t hesitate to contact us directly, i.e., email Ryan with the TAs cc-ed.