Natural Language Processing
ETH Zürich, Fall 2023: Course catalog
Course Description
The course constitutes an introduction to modern techniques in the field of natural language processing (NLP). Our primary focus is on the algorithmic aspects of structured NLP models. The course is self-contained and designed to complement other machine learning courses at ETH Zürich, e.g., Deep Learning (263-3210-00L) and Advanced Machine Learning (252-0535-00L). At some points in the course, familiarity with advanced algorithms, e.g., the contents of Algorithms Lab (263-0006-00L), and mathematical statistics, e.g., the contents of Fundamentals of Mathematical Statistics (401-3621-00L), will be useful. However, the necessary background knowledge can certainly be picked up in the context of the course, i.e., neither of the above-listed courses is a hard prerequisite. The course also has a strong focus on algebraic methods, e.g., semiring theory. In addition to machine learning, we also cover the linguistic background necessary for reading the NLP literature.
News
06.09.2023 Class website is online!
27.09.2023 Assignment 1 has been released!
03.10.2023 Assignment 2 has been released!
12.10.2023 Assignment 3 has been released!
25.10.2023 Assignment 4 has been released!
17.11.2023 Assignment 5 has been released!
27.11.2023 Assignment 6 has been released!
16.01.2024 New Practice Exam has been released!
18.01.2024 The Practice Exam Solutions have been released!
Organisation
On the Use of Class Time
There are two lecture slots for NLP. The first slot is on Monday from 12h to 14h. During this time, the main lecture will be given. The second slot is on Tuesday from 13h to 14h. It will be used as spillover time if we do not get through all of the lecture material on Monday (this ensures that the class stays on track). Time permitting, the professor will also work through examples and hold an open-ended ask-me-anything-about-NLP session.
Zoom Link and Recordings
Both lectures will be given in the lecture hall HG F1 and live broadcast on Zoom; the password is available on the course Moodle page.
Lectures will be recorded. You can find the links to the recordings on the course Moodle page.
Important: The ETH semester starts on Tuesday, September 19th, but the first lecture will take place on Monday, September 25th.
Live Chat
In addition to class time, there will also be a RocketChat-based live chat hosted on ETH’s servers. Students are free to ask questions of the teaching staff and of others in public or private (direct message). There are specific channels for each of the 6 assignments as well as for reporting errata in the course notes. All data from the chat will be deleted from ETH servers at the course’s conclusion. The chat supports LaTeX for easier discussion of technical material.
Important: There are a few important points you should keep in mind about the course live chat:
 RocketChat will be the main communications hub for the course. You are responsible for receiving all messages broadcast in the RocketChat.
 Your username should be firstname.lastname. This is required as we will only allow enrolled students to participate in the chat, and we will remove users whom we cannot validate.
 Tag your questions as described in the document on How to use Rycolab Course RocketChat channels. The document also contains other general remarks about the use of RocketChat.
 Search for answers in the appropriate channels before posting a new question.
 Ask questions on public channels as much as possible.
 Reply to posts in threads.
 The chat supports LaTeX for easier discussion of technical material. See How to use LaTeX in RocketChat.
 We highly recommend you download the desktop app here.
This is the link to the main channel. To make the moderation of the chat more easily manageable, we have created a number of other channels on RocketChat. The full list is:
 General Channel for the general organisational discussions.
 Announcements Channel for the announcements by the teaching team.
 Content Questions Channel for your questions about the content of the course.
 Errata Channel for reporting typos and errors in the course lecture notes and the slides.
 Assignment 1 Channel
 Assignment 2 Channel
 Assignment 3 Channel
 Assignment 4 Channel
 Assignment 5 Channel
 Assignment 6 Channel
 Channel for Finding Assignment/Project Partners for finding teammates for the course assignments and the project.
If you feel like you would benefit from any other channel, feel free to suggest it to the teaching team!
Course Notes
We are currently working on turning our class content into a book! The current draft of the book, i.e., the course notes, can be found here. Please report all errata to the teaching staff; we created an errata channel in RocketChat.
Other useful literature:
 Introduction to Natural Language Processing (Eisenstein)
 Deep Learning (Goodfellow, Bengio and Courville)
 LLM Course Notes
 AFLT Course Notes
Grading
Marks for the course will be determined by the following formula:
 70% Final Exam
 30% Assignment or Class Project
On the Final Exam
This year’s exam will take place on 27 January at 11:30. The final exam is comprehensive and should be assumed to cover all the material in the slides and class notes. About 50% of exam questions will be very similar (or even identical) to the theory portion of the class assignments. Thus, it behooves you to at least look at all the assignment questions while preparing for the final exam even if you do not turn them all in for a grade. Solutions for the assignments will not be provided (they will be reused every year), but the teaching staff can answer questions if you solve the problems ahead of time.
On the Class Assignments
There will be 6 assignments which will be released (in their final form) roughly every two weeks. We impose two firm deadlines for handing in your solutions:
 Assignment 1: November 15th
 Assignments 2, 3, 4, 5, and 6: January 15th
Only your highest-scoring 4 assignments will count towards your grade; each will be weighted equally. So, in principle, you may opt to not turn in 2 out of the 6 assignments without any effect on your grade. Note: Even though we plan to grade your submissions within one month, we advise you not to wait for your grades to be returned before you decide to tackle the next assignments. In essence, do not base your submission strategy on our grading estimates! The assignments will be graded according to the predetermined Assignment grading rubric.
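As an illustration of how the 70%/30% split and the best-4-of-6 rule combine, here is a minimal sketch. It is not the official grading procedure: the function names, the normalised 0-to-1 score scale, and the treatment of unsubmitted assignments as a score of 0 are all assumptions made for this example, and the additional formatting points and the conversion to a final mark are left out.

```python
from typing import Sequence


def assignment_component(scores: Sequence[float], keep: int = 4) -> float:
    """Average of the `keep` highest assignment scores.

    Assumption: every score is a fraction in [0, 1], and an assignment
    that was not handed in appears in `scores` as 0.0.
    """
    top = sorted(scores, reverse=True)[:keep]
    # Dividing by `keep` (not len(top)) weights the counted assignments equally.
    return sum(top) / keep


def course_score(exam: float, scores: Sequence[float]) -> float:
    """Combine the exam (70%) and the assignment component (30%)."""
    return 0.7 * exam + 0.3 * assignment_component(scores)


# Hypothetical student: two assignments skipped, exam score of 0.75.
example_scores = [0.9, 0.0, 0.8, 0.7, 0.0, 1.0]
print(round(course_score(0.75, example_scores), 4))
```

In this sketch the two skipped assignments do not affect the result, because only the four highest scores enter the average.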
The class assignments were crafted to dovetail nicely with the lecture contents and, moreover, to complement the lectures through a more hands-on approach to the material. Each assignment has a theory portion, which will generally involve derivations or proofs related to the material, and a coding portion where you will implement a working model for one of the NLP tasks discussed in the lecture. The theory and the coding halves of the assignments will be weighted equally.
Assignment sheets:
The code relating to some of the assignments will be published on the public GitHub repository. You should fork the repository and pull the incoming changes whenever they are released.
Very important: We require the solutions to be properly typeset. Handwritten solutions will not be accepted. We recommend using LaTeX (with Overleaf), but markdown files with MathJax for the mathematical expressions are also fine. We provide a template for the writeups here; however, feel free to use your own.
Additionally, the solutions have to be presented in a clean and readable way, with all substeps of the solutions presented in a logical order. Note that this does not mean that your submissions have to be overly verbose and long. It simply means that you should explain your reasoning and the steps of your solutions in a clear and concise way. To encourage this, we will, for every assignment, award 2 additional points for properly explained and formatted solutions.
The detailed instructions for the submission will be given in each assignment separately, but the submissions will always be through the course Moodle page. The submission links are:
 Assignment 1 Submission
 Assignment 2 Submission
 Assignment 3 Submission
 Assignment 4 Submission
 Assignment 5 Submission
 Assignment 6 Submission
On the Discussion Sections
Discussion sections (tutorials) will take place Wednesdays 16h to 19h in HG F7 and on Zoom (same link as the lectures). Their main purpose will be to solve some exercises with you that will help you grasp the concepts from the lecture and to help you prepare for the exam. They will also help you with the assignment problems. Roughly, we expect to devote 2 hours per week to exercises and 1 hour to the assignments. We therefore strongly encourage you to look at the assignment problems in due time and come to the discussion sessions with your questions. We want the sessions to be useful for you!
On the Class Project
It is highly recommended that you do the class assignments. However, students may choose to do a course project (in groups of up to 4 people) in lieu of the class assignments. This option is only recommended for academically oriented students who are interested in using this course to get into NLP research. If you choose to do a class project, you must submit a project proposal by October 31, 2023, on Moodle. The proposal is ungraded and will be inspected by the teaching assistants to ensure that the project is doable and that you will pass the course should you execute the project as proposed. The writeup and code for the final project are due January 15, 2024; they are to be submitted through Moodle. General guidelines for the class project are given here.
Project work submission will be done on the course Moodle page. The submission links are:
Syllabus
| Week | Date | Topic | Slides | Readings | Supplementary Material | Exercise Sheets |
| 1 | 25.9.2023 | Introduction to Natural Language | Lecture 1 | Eisenstein Ch. 1 | | |
| | 26.9.2023 | Course logistics, Introduction of the TA team | TA Introduction | | | |
| 2 | 2.10.2023 | Backpropagation | Lecture 2 | Goodfellow, Bengio and Courville Ch. 6.5 | Chris Olah's Blog; Justin Domke’s Notes; Tim Vieira’s Blog; Moritz Hardt’s Notes; Bauer (1974); Baur and Strassen (1983); Griewank and Walther (2008); Eisner (2016); Backpropagation Proof; Computation Graph for MLP; Computation Graph Example | Week 2 Exercises; Week 2 Solutions |
| | 3.10.2023 | Backpropagation | Backpropagation Tutorial | | | |
| 3 | 9.10.2023 | Log-Linear Modeling: Meet the Softmax | Lecture 3 | Eisenstein Ch. 2 | Ferraro and Eisner (2013); Jason Eisner’s list of further resources on log-linear modeling | Week 3 Exercises; Week 3 Solutions |
| | 10.10.2023 | Log-Linear Modeling: Meet the Softmax | | | | |
| 4 | 16.10.2023 | Sentiment Analysis with Multilayer Perceptrons | Lecture 4 | Eisenstein Ch. 3 and 4; Goodfellow, Bengio and Courville Ch. 6 | | Week 4 Exercises; Week 4 Solutions |
| | 17.10.2023 | Sentiment Analysis with Multilayer Perceptrons | | | | |
| 5 | 23.10.2023 | Language Modeling with n-grams and LSTMs | Lecture 5 | Eisenstein Ch. 6; Goodfellow, Bengio and Courville Ch. 10 | Good Tutorial on n-gram smoothing; Good–Turing Smoothing; Kneser and Ney (1995); Bengio et al. (2003); Mikolov et al. (2010) | Week 5 Exercises; Week 5 Solutions |
| | 24.10.2023 | Language Modeling with n-grams and LSTMs | | | | |
| 6 | 30.10.2023 | Part-of-Speech Tagging with CRFs | Lecture 6 | Eisenstein Ch. 7 and 8 | Tim Vieira's Blog; McCallum et al. (2000); Lafferty et al. (2001); Sutton and McCallum (2011); Koller and Friedman (2009) | Week 6 Exercises; Week 6 Solutions |
| | 31.10.2023 | Part-of-Speech Tagging with CRFs, Assignment 2 introduction | Assignment 2 Slides | | | |
| 7 | 6.11.2023 | Transliteration with WFSTs | Lecture 7 | Eisenstein Ch. 9 | AFLT Course Notes Chapters 1, 2, and 3; Knight and Graehl (1998); Mohri, Pereira and Riley (2008) | Week 7 Exercises; Week 7 Solutions |
| | 7.11.2023 | Transliteration with WFSTs | | | | |
| 8 | 13.11.2023 | Context-Free Parsing with CKY | Lecture 8 | Eisenstein Ch. 10 | The Inside-Outside Algorithm; Jason Eisner’s Slides; Kasami (1966); Younger (1967); Cocke and Schwartz (1970) | Week 8 Exercises; Week 8 Solutions |
| | 14.11.2023 | Context-Free Parsing with CKY | | | | |
| 9 | 20.11.2023 | Dependency Parsing with the Matrix-Tree Theorem | Lecture 9 | Eisenstein Ch. 11 | Koo et al. (2007); Smith and Smith (2007); McDonald and Satta (2007); McDonald, Kübler and Nivre (2009) | Week 9 Exercises; Week 9 Solutions |
| | 21.11.2023 | Dependency Parsing with the Matrix-Tree Theorem | | | | |
| 10 | 27.11.2023 | Semantic Parsing with CCGs | Lecture 10 | Eisenstein Ch. 9.3 and 12 | Weir and Joshi (1988); Kuhlmann and Satta (2014); Mark Steedman's CCG slides | Week 10 Exercises; Week 10 Solutions |
| | 28.11.2023 | Semantic Parsing with CCGs | | | | |
| 11 | 4.12.2023 | Machine Translation with Transformers | Lecture 11 | Eisenstein Ch. 18 | Vaswani et al. (2017); The Annotated Transformer; The Illustrated Transformer; The Transformer Family | Week 11 Exercises; Week 11 Solutions |
| | 5.12.2023 | Machine Translation with Transformers | | | | |
| 12 | 11.12.2023 | Axes of Modeling | Lecture 12 | Review Eisenstein Ch. 2; Goodfellow, Bengio and Courville Ch. 5 and 11 | | Week 12 Exercises; Week 12 Solutions |
| | 12.12.2023 | Axes of Modeling | | | | |
| 13 | 18.12.2023 | Bias and Fairness in NLP | Lecture 13 | Bolukbasi et al. (2016); Gonen and Goldberg (2019); Hall Maudslay et al. (2019); Vargas and Cotterell (2020); A Course in Machine Learning Chapter 8 | | |
| | 19.12.2023 | Bias and Fairness in NLP | | | | |
Tutorial Schedule
| Week | Date | Topic | Teaching Assistant | Material |
| 1 | 27.9.2023 | No tutorial | | |
| 2 | 4.10.2023 | Backpropagation, Assignment 1 introduction | Niklas Stoehr, Leonardo Nevali | |
| 3 | 11.10.2023 | Assignment 1 office hours | Niklas Stoehr, Leonardo Nevali | |
| 4 | 18.10.2023 | Log-Linear Modeling | David Wissel | |
| 5 | 25.10.2023 | Sentiment Classification with Multilayer Perceptrons | Luca Malagutti | |
| 6 | 1.11.2023 | Language Modeling with n-grams and LSTMs | Vasiliki Xefteri | Week 5 Exercise Slides |
| 7 | 8.11.2023 | Part-of-Speech Tagging with CRFs, Assignment 2 office hours | Leonardo Nevali, David Wissel | Assignment 2 Slides |
| 8 | 15.11.2023 | Formal Language Theory and Transliteration with WFSTs, Assignment 3 introduction | Franz Nowak, Vasiliki Xefteri | Assignment 3 Slides |
| 9 | 22.11.2023 | Context-Free Parsing, Assignment 4 introduction, Assignment 3 office hours | Maximilian Schneiderbauer, Alexandra Butoi, Franz Nowak, Vasiliki Xefteri | Assignment 4 Slides |
| 10 | 29.11.2023 | Dependency Parsing, Assignment 5 introduction, Assignment 4 office hours | Tianyu Liu, Eleftheria Tsipidi, Alexandra Butoi, Maximilian Schneiderbauer | Assignment 5 Slides |
| 11 | 6.12.2023 | Semantic Parsing, Assignment 5 office hours | Giovanni Acampa, Alexandra Butoi, Tianyu Liu, Eleftheria Tsipidi | |
| 12 | 13.12.2023 | Machine Translation with Transformers, Assignment 6 introduction | Luca Malagutti, Giovanni Acampa | Assignment 6 Slides (last year) |
| 13 | 20.12.2023 | Axes of Modeling, Assignment 6 office hours | Luca Malagutti, Giovanni Acampa | |