Natural Language Processing

ETH Zürich, Fall 2025: Course catalog

Course Description

The course constitutes an introduction to modern techniques in the field of natural language processing (NLP). Our primary focus is on the algorithmic aspects of structured NLP models. The course is self-contained and designed to complement other machine learning courses at ETH Zürich, e.g., Deep Learning (263-3210-00L) and Advanced Machine Learning (252-0535-00L). At some points in the course, familiarity with advanced algorithms, e.g., the contents of Algorithms Lab (263-0006-00L), and mathematical statistics, e.g., the contents of Fundamentals of Mathematical Statistics (401-3621-00L), will be useful. However, the necessary background knowledge can certainly be picked up in the context of the course, i.e., neither of the above-listed courses is a hard prerequisite. The course also has a strong focus on algebraic methods, e.g., semiring theory. In addition to machine learning, we also cover the linguistic background necessary for reading the NLP literature.

News

15.09.2025   Class website is online!

Organisation

On the Use of Class Time

There are two lecture slots for NLP. The first slot is on Monday from 12h to 14h; during this time, the main lecture will be given. The second slot, on Tuesday from 13h to 14h, will be used as spill-over time if we do not get through all of the lecture material on Monday (this keeps the class on track); time permitting, the professor will also work through examples and hold an open-ended ask-me-anything-about-NLP session.

Live Chat

In addition to class time, there will also be a RocketChat-based live chat hosted on ETH’s servers. Students are free to ask questions of the teaching staff and of others in public or private (direct message). There are specific channels for each of the 6 assignments as well as for reporting errata in the course notes. All data from the chat will be deleted from ETH servers at the course’s conclusion. The chat supports LaTeX for easier discussion of technical material.

Important: There are a few points you should keep in mind about the course live chat:

  1. RocketChat will be the main communications hub for the course. You are responsible for receiving all messages broadcast in the RocketChat.
  2. Your username should be firstname.lastname. This is required because we will only allow enrolled students to participate in the chat, and we will remove users whom we cannot validate.
  3. Tag your questions as described in the document on How to use Rycolab Course RocketChat channels. The document also contains other general remarks about the use of RocketChat.
  4. Search for answers in the appropriate channels before posting a new question.
  5. Ask questions in public channels whenever possible.
  6. Reply to posts in threads.
  7. The chat supports LaTeX for easier discussion of technical material; see How to use LaTeX in RocketChat.
  8. We highly recommend you download the desktop app here.

This is the link to the main channel. To make moderation of the chat easier, we have created a number of other channels on RocketChat. The full list is:

If you feel like you would benefit from any other channel, feel free to suggest it to the teaching team!

Course Notes

We are currently working on turning our class content into a book! The current draft of the book, i.e., the course notes, can be found here. Please report all errata to the teaching staff; we have created an errata channel in RocketChat.

Other useful literature:

Grading

Marks for the course will be determined by the following formula:

  • 70% Final Exam
  • 30% Assignment or Class Project
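As a quick sanity check, the weighting above can be sketched as follows (the scores here are hypothetical percentages; the official conversion to ETH grades is determined separately):

```python
def final_mark(exam: float, assignment: float) -> float:
    """Combine the two graded components using the 70/30 weighting above.

    Both inputs are percentages in [0, 100]; the result is the weighted
    overall percentage, not an official ETH grade.
    """
    return 0.7 * exam + 0.3 * assignment

# A hypothetical student with 80% on the exam and 90% on the assignments:
print(final_mark(80.0, 90.0))  # 83.0 (up to floating-point rounding)
```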

On the Final Exam

The final exam is comprehensive and should be assumed to cover all the material in the slides and class notes. About 50% of the exam questions will be very similar (or even identical) to the theory portion of the class assignments. It therefore behooves you to at least look at all the assignment questions while preparing for the final exam, even if you do not turn them all in for a grade. Solutions to the assignments will not be provided (the assignments are re-used every year), but the teaching staff can answer questions if you solve the problems ahead of time.

Assignment sheets:

The code relating to some of the assignments will be published in a public GitHub repository. You should fork the repository and pull the incoming changes whenever they are released.

Very important: We require the solutions to be properly typeset. Handwritten solutions will not be accepted. We recommend using LaTeX (with Overleaf), but markdown files with MathJax for the mathematical expressions are also fine. We provide a template for the writeups here; however, feel free to use your own.

Additionally, the solutions have to be presented in a clean and readable way, with all sub-steps presented in a logical order. Note that this does not mean that your submissions have to be long; it simply means that you should explain your reasoning and the steps of your solutions clearly and concisely. To encourage this, we will award 2 additional points per assignment for properly explained and formatted solutions.

The detailed instructions for the submission will be given in each assignment separately.

On the Tutorials

Tutorials will take place Wednesdays from 16h to 19h in HG F7. Their main purpose is to work through exercises with you that will help you grasp the concepts from the lecture and prepare for the exam. They will also introduce new assignments and give you a chance to ask questions about them. Roughly, we expect to devote 2 hours per week to exercises and 1 hour to the assignments (when a new assignment has been released). We therefore strongly encourage you to look at the assignment problems in due time and come to the discussion sessions with your questions. We want the sessions to be useful for you!

Assignment Office Hours

In addition to the tutorials, we will hold assignment-specific online office hours on Zoom roughly 2 weeks after each assignment has been introduced. You will have the opportunity to talk to the TAs responsible for that assignment and ask individual questions you do not want to discuss in a public RocketChat channel. Closer to the time, we will post 10-minute sign-up slots on the corresponding assignment RocketChat channels.

Syllabus

Week 0
  16.9.2025: No lecture

Week 1
  22.9.2025: Introduction to NLP, Course logistics, Introduction of the TA team (last year)
    Slides: Lecture 1
    Readings: Eisenstein Ch. 1
  23.9.2025: Introduction to NLP

Week 2
  30.9.2025: Backpropagation (last year)
    Slides: Lecture 2
    Readings: Goodfellow, Bengio and Courville Ch. 6.5
    Supplementary: Chris Olah's Blog; Justin Domke's Notes; Tim Vieira's Blog; Moritz Hardt's Notes; Bauer (1974); Baur and Strassen (1983); Griewank and Walther (2008); Eisner (2016)
    Material: Backpropagation Proof; Computation Graph for MLP; Computation Graph Example

Week 3
  6.10.2025: Log-Linear Modeling: Meet the Softmax (last year)
    Slides: Lecture 3
    Readings: Eisenstein Ch. 2
    Supplementary: Ferraro and Eisner (2013); Jason Eisner's list of further resources on log-linear modeling
  7.10.2025: Log-Linear Modeling: Meet the Softmax

Week 4
  13.10.2025: Sentiment Analysis with Multi-layer Perceptrons (last year)
    Slides: Lecture 4
    Readings: Eisenstein Ch. 3 and 4; Goodfellow, Bengio and Courville Ch. 6
  14.10.2025: Sentiment Analysis with Multi-layer Perceptrons

Week 5
  20.10.2025: Language Modeling with n-grams and LSTMs (last year)
    Slides: Lecture 5
    Readings: Eisenstein Ch. 6; Goodfellow, Bengio and Courville Ch. 10
    Supplementary: Good Tutorial on n-gram smoothing; Good–Turing Smoothing; Kneser and Ney (1995); Bengio et al. (2003); Mikolov et al. (2010)
  21.10.2025: Language Modeling with n-grams and LSTMs

Week 6
  27.10.2025: Part-of-Speech Tagging with CRFs (last year)
    Slides: Lecture 6
    Readings: Eisenstein Ch. 7 and 8
    Supplementary: Tim Vieira's Blog; McCallum et al. (2000); Lafferty et al. (2001); Sutton and McCallum (2011); Koller and Friedman (2009)
  28.10.2025: Part-of-Speech Tagging with CRFs, Assignment 2 introduction

Week 7
  3.11.2025: Transliteration with WFSTs (last year)
    Slides: Lecture 7
    Readings: Eisenstein Ch. 9
    Supplementary: AFLT Course Notes Chapters 1, 2, and 3; Knight and Graehl (1998); Mohri, Pereira and Riley (2008)
  4.11.2025: Transliteration with WFSTs

Week 8
  10.11.2025: Context-Free Parsing with CKY (last year)
    Slides: Lecture 8
    Readings: Eisenstein Ch. 10
    Supplementary: The Inside-Outside Algorithm; Jason Eisner's Slides; Kasami (1966); Younger (1967); Cocke and Schwartz (1970)
  11.11.2025: Context-Free Parsing with CKY

Week 9
  17.11.2025: Dependency Parsing with the Matrix-Tree Theorem (last year)
    Slides: Lecture 9
    Readings: Eisenstein Ch. 11
    Supplementary: Koo et al. (2007); Smith and Smith (2007); McDonald and Satta (2007); McDonald, Kübler and Nivre (2009)
  18.11.2025: Dependency Parsing with the Matrix-Tree Theorem

Week 10
  24.11.2025: Semantic Parsing with CCGs (last year)
    Slides: Lecture 10
    Readings: Eisenstein Ch. 9.3 and 12
    Supplementary: Weir and Joshi (1988); Kuhlmann and Satta (2014); Mark Steedman's CCG slides
  25.11.2025: Semantic Parsing with CCGs

Week 11
  1.12.2025: Machine Translation with Transformers (last year)
    Slides: Lecture 11
    Readings: Eisenstein Ch. 18
    Supplementary: Vaswani et al. (2017); The Annotated Transformer; The Illustrated Transformer; The Transformer Family
  2.12.2025: Machine Translation with Transformers

Week 12
  8.12.2025: Axes of Modeling (last year)
    Slides: Lecture 12
    Readings: Review Eisenstein Ch. 2; Goodfellow, Bengio and Courville Ch. 5 and 11
  9.12.2025: Axes of Modeling

Week 13
  15.12.2025: Bias and Fairness in NLP (last year)
    Slides: Lecture 13
    Supplementary: Bolukbasi et al. (2016); Gonen and Goldberg (2019); Hall Maudslay et al. (2019); Vargas and Cotterell (2020); A Course in Machine Learning Chapter 8
  16.12.2025: Bias and Fairness in NLP

Tutorial Schedule

Week 1 (17.9.2025): No tutorial
Week 2 (24.9.2025): No tutorial
Week 3 (1.10.2025): No tutorial
Week 4 (8.10.2025): Backpropagation, Assignment 1 introduction. TA: Blanka
Week 5 (15.10.2025): Log-Linear Modeling. TA: Tianyu
Week 6 (22.10.2025): Sentiment Classification with Multi-layer Perceptrons. TA: Eleftheria
Week 7 (29.10.2025): Language Modeling with n-grams and LSTMs. TA: Irene
Week 8 (5.11.2025): Part-of-speech Tagging with CRFs, Assignment 2 introduction. TA: Pawel
Week 9 (12.11.2025): Transliteration with WFSTs, Assignment 3 introduction. TA: Tu
Week 10 (19.11.2025): Context-free Parsing, Assignment 4 introduction. TA: Franz
Week 11 (26.11.2025): Dependency Parsing with the Matrix-Tree Theorem. TA: Blanka
Week 12 (3.12.2025): Semantic Parsing with CCGs. TA: Alexandra
Week 13 (10.12.2025): Machine Translation with Transformers, Assignment 6 introduction. TA: Karolina
Week 14 (18.12.2025): Axes of Modeling. TA: Tu

Practice Exams

Older Practice Exams

Lecturer


Ryan Cotterell

Assistant Professor of Computer Science

ETH Zürich