Natural Language Processing
ETH Zürich, Fall 2022: Course catalog
The course constitutes an introduction to modern techniques in the field of natural language processing (NLP). Our primary focus is on the algorithmic aspects of structured NLP models. The course is self-contained and designed to complement other machine learning courses at ETH Zürich, e.g., Deep Learning (263-3210-00L) and Advanced Machine Learning (252-0535-00L). At some points in the course, familiarity with advanced algorithms, e.g., the contents of Algorithms Lab (263-0006-00L), and mathematical statistics, e.g., the contents of Fundamentals of Mathematical Statistics (401-3621-00L), will be useful. However, the necessary background knowledge can certainly be picked up in the context of the course, i.e., neither of the above-listed courses is a hard prerequisite. The course also has a strong focus on algebraic methods, e.g., semiring theory. In addition to machine learning, we also cover the linguistic background necessary for reading the NLP literature.
14. 9. 2022 Class website is online!
19. 9. 2022 Assignment 1 has been released! See the public GitHub repository for the accompanying code.
25. 9. 2022 Assignment 2 has been released!
25. 9. 2022 Recordings Polybox has been created. The password can be found on Moodle.
12. 10. 2022 Assignment 3 has been released!
31. 10. 2022 Assignment 4 has been released!
31. 10. 2022 Assignment grading rubric has been released!
26. 11. 2022 Assignment 5 has been released!
05. 12. 2022 Assignment 6 has been released!
05. 01. 2023 Practice exams have been released!
On the Use of Class Time
There are two lecture slots for NLP. The first slot is on Monday from 12h to 14h; the main lecture will be given during this time. The second slot is on Tuesday from 13h to 14h. It serves as spill-over time if we do not get through all of the lecture material on Monday, which ensures that the class stays on track. Time permitting, the professor will also work through examples and hold an open-ended ask-me-anything-about-NLP session.
Both lectures will be given in the lecture hall HG F1 and live broadcast on Zoom; the password is available on the course Moodle page or in the live chat.
Lectures will be recorded. Recordings will be uploaded to the course Polybox folder; its password is also available on the course Moodle page.
Important: The ETH semester starts on Tuesday, September 20th, but the first lecture will take place on Monday, September 26th.
In addition to class time, there will also be a RocketChat-based live chat hosted on ETH’s servers. Students are free to ask questions of the teaching staff and of others in public or private (direct message). There are specific channels for each of the 6 assignments as well as for reporting errata in the course notes. All data from the chat will be deleted from ETH servers at the course’s conclusion. The chat supports LaTeX for easier discussion of technical material.
Important: Keep the following three points in mind about the course live chat:
- RocketChat will be the main communications hub for the course. You are responsible for receiving all messages broadcast in the RocketChat.
- Your username should be firstname.lastname. This is required because only enrolled students may participate in the chat, and we will remove users whom we cannot validate.
- We highly recommend you download the desktop app here.
This is the link to the main channel. To make the chat easier to moderate, we have created a number of other channels on RocketChat. The full list is:
- NLP General Channel
- NLP Errata
- Exercises Channel
- Assignment 1
- Assignment 2
- Assignment 3
- Assignment 4
- Assignment 5
- Assignment 6
- Course Project
- Find Project Partners
If you feel like you would benefit from any other channel, feel free to suggest it to the teaching team!
We are currently working on turning our class content into a book! The current draft of the book, i.e., the course notes, can be found here. Please report all errata to the teaching staff; we created an errata channel in RocketChat for this purpose.
Other useful literature:
- Introduction to Natural Language Processing (Eisenstein)
- Deep Learning (Goodfellow, Bengio and Courville)
- AFLT Course Notes
Marks for the course will be determined by the following formula:
- 70% Final Exam
- 30% Assignment or Class Project
On the Final Exam
The final exam is comprehensive and should be assumed to cover all the material in the slides and class notes. About 50% of exam questions will be very similar (or even identical) to the theory portion of the class assignments. Thus, it behooves you to at least look at all the assignment questions while preparing for the final exam, even if you do not turn them all in for a grade. Solutions for the assignments will not be provided (they will be re-used every year), but the teaching staff can answer questions if you solve the problems ahead of time.
On the Class Assignments
There will be 6 assignments which will be released roughly every two weeks. We impose three firm deadlines for handing in your solutions:
- Assignment 1: November 15th
- Assignment 2: December 15th
- Assignments 3, 4, 5, and 6: January 15th
Only your highest-scoring 4 assignments will count towards your grade; each will be weighted equally. So, in principle, you may opt not to turn in 2 out of the 6 assignments without any effect on your grade. Even though we plan to grade your submissions within one month, we advise you not to wait for your grades to be returned before you decide to tackle the next assignments. In essence, do not base your submission strategy on our grading estimates! The assignments will be graded according to the pre-determined Assignment grading rubric.
The class assignments were crafted to dovetail with the lecture contents and to complement the lectures through a more hands-on approach to the material. Each assignment has a theory portion, which will generally involve derivations or proofs related to the material, and a coding portion, in which you will implement a working model for one of the NLP tasks discussed in the lecture. The theory and the coding halves of the assignments will be weighted equally.
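The weighting rules stated above (70% final exam, 30% assignments, best 4 of 6 assignments counted equally, theory and coding halves weighted equally) can be illustrated with a small calculation. This is only a sketch under stated assumptions: the function names are hypothetical, all scores are assumed to lie on a common 0 to 100 scale, an assignment that is not handed in is assumed to count as 0, and the conversion to the official ETH grade scale is not covered.

```python
from typing import Sequence


def assignment_score(theory: float, coding: float) -> float:
    """Combine the theory and coding halves of one assignment with equal weight."""
    return 0.5 * theory + 0.5 * coding


def assignment_component(scores: Sequence[float], keep: int = 4) -> float:
    """Average the `keep` highest assignment scores, each weighted equally.

    Assumption: an assignment that was not handed in is either listed as 0
    or left out of `scores`; in both cases it contributes 0 to the average.
    """
    best = sorted(scores, reverse=True)[:keep]
    return sum(best) / keep


def course_mark(exam: float, assignments: Sequence[float]) -> float:
    """Weighted course mark: 70% final exam, 30% assignment component."""
    return 0.7 * exam + 0.3 * assignment_component(assignments)


if __name__ == "__main__":
    # Hypothetical student: four assignments handed in, two skipped.
    handed_in = [
        assignment_score(theory=70, coding=90),
        assignment_score(theory=85, coding=95),
        assignment_score(theory=60, coding=80),
        assignment_score(theory=100, coding=100),
    ]
    print(course_mark(exam=60, assignments=handed_in))
```

Because only the 4 best scores are averaged, adding two zeros for skipped assignments to the list leaves the result unchanged, which matches the statement that 2 of the 6 assignments may be skipped without any effect on the grade.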
The code relating to the assignments will be published on the public GitHub repository. You should fork the repository and pull the incoming changes whenever they are released.
The detailed instructions for the submission will be given in each assignment separately, but the submissions will always be through the course Moodle page. The submission links are:
We require the solutions to be properly typeset. We recommend using LaTeX (with Overleaf), but markdown files with MathJax for the mathematical expressions are also fine. We provide a template for the writeups here; however, feel free to use your own.
On the Discussion Sections
Discussion sections (tutorials) will take place Wednesdays 16:15-19:00 in HG F7 and on Zoom (same link as the lectures). Their main purpose is to help you with the assignment problems. We plan to devote 2 discussion sessions (two weeks) to each of the assignments. In them, TAs will introduce the problems, solve related exercises, and answer your questions about them. We therefore strongly encourage you to look at the problems in due time and come to the discussion sessions with your questions. We want the sessions to be useful for you!
On the Class Project
It is highly recommended that you do the class assignments. However, students (in groups of up to 4 people) may choose to do a course project in lieu of the class assignments. This option is only recommended for academically oriented students who are interested in using this course to get into NLP research. If you choose to do a class project, you must submit a project proposal by October 31, 2022, on Moodle. The proposal will be inspected by the teaching assistants to ensure that the project is doable and that you will pass the course should you execute the project as proposed. The write-up and code for the final project are due January 15, 2023, and are to be submitted through Moodle. General guidelines for the class project are given here.
Project work submission will be done on the course Moodle page. The submission links are:
|Week||Date||Topic||Slides||Readings||Supplementary Material||Exercise Sheets|
|1||26.09.22||Introduction to Natural Language||Lecture 1||Eisenstein Ch. 1|
|27.09.22||Introduction to Natural Language|
|2||03.10.22||Backpropagation||Lecture 2||Goodfellow, Bengio and Courville Ch. 6.5||Chris Olah's Blog Justin Domke’s Notes Tim Vieira’s Blog Moritz Hardt’s Notes Bauer (1974) Baur and Strassen (1983) Griewank and Walter (2008) Eisner (2016) Backpropagation Proof Computation Graph for MLP Computation Graph Example||Exercises|
|3||10.10.22||Log-Linear Modeling---Meet the Softmax||Lecture 3||Eisenstein Ch. 2||Ferraro and Eisner (2013) Jason Eisner’s list of further resources on log-linear modeling||Exercises|
|4||17.10.22||Sentiment Analysis with Multi-layer Perceptrons||Lecture 4||Eisenstein Ch. 3 and Ch. 4; Goodfellow, Bengio and Courville Ch. 6||Exercises|
|18.10.22||Sentiment Analysis with Multi-layer Perceptrons|
|5||24.10.22||Language Modeling with n-grams and LSTMs||Lecture 5||Eisenstein Ch. 6; Goodfellow, Bengio and Courville Ch. 10||Good Tutorial on n-gram smoothing Good–Turing Smoothing Kneser and Ney (1995) Bengio et al. (2003) Mikolov et al. (2010)||Exercises|
|25.10.22||Language Modeling with n-grams and LSTMs|
|6||31.10.22||Part-of-Speech Tagging with CRFs||Lecture 6||Eisenstein Ch. 7 and 8||Tim Vieira's Blog McCallum et al. (2000) Lafferty et al. (2001) Sutton and McCallum (2011) Koller and Friedman (2009)||Exercises|
|01.11.22 ONLINE ONLY||Part-of-Speech Tagging with CRFs|
|7||07.11.22||Transliteration with WFSTs||Lecture 7||Eisenstein Ch. 9||AFLT Course Notes Chapters 1, 2, and 3 Knight and Graehl (1998) Mohri, Pereira and Riley (2008)||Exercises|
|08.11.22||Transliteration with WFSTs|
|8||14.11.22||Context-Free Parsing with CKY||Lecture 8||Eisenstein Ch. 10||The Inside-Outside Algorithm Jason Eisner’s Slides Kasami (1966) Younger (1967) Cocke and Schwartz (1970)||Exercises|
|9||21.11.22||Dependency Parsing with the Matrix-Tree Theorem||Lecture 9||Eisenstein Ch. 11||Koo et al. (2007) Smith and Smith (2007) McDonald and Satta (2007) McDonald, Kübler and Nivre (2009)||Exercises|
|22.11.22||Dependency Parsing with the Matrix-Tree Theorem|
|10||28.11.22||Semantic Parsing with CCGs||Lecture 10||Eisenstein Ch. 9.3 and 12||Weir and Joshi (1988) Kuhlmann and Satta (2014) Mark Steedman's CCG slides||Exercises|
|29.11.22||Semantic Parsing with CCGs|
|11||05.12.22||Machine Translation with Transformers||Lecture 11||Eisenstein Ch. 18||Neural Machine Translation Vaswani et al. (2017) Rush (2018)||Exercises|
|06.12.22||Machine Translation with Transformers|
|12||12.12.22||Axes of Modeling||Lecture 12||Review: Eisenstein Ch. 2; Goodfellow, Bengio and Courville Ch. 5 and 11||Exercises|
|13.12.22||Axes of Modeling|
|13||19.12.22||Bias and Fairness in NLP||Lecture 13||Bolukbasi et al. (2016) Gonen and Goldberg (2019) Hall Maudslay et al. (2019) Vargas and Cotterell (2020) A Course in Machine Learning Chapter 8|
|20.12.22||Bias and Fairness in NLP|
|Week||Date||Topic||Teaching Assistant||Material|
|1||28.09.22||Course Logistics and Introduction of the TA Team||All TAs||Introduction Slides|
|2||05.10.22||Assignment 1||Niklas Stoehr|
|3||12.10.22||Assignment 1||Niklas Stoehr|
|6||02.11.22||Assignment 2 and Assignment 3||David Wissel, Alexandra Butoi, and Anej Svete||Assignment 2 Slides|
|7||09.11.22||Assignment 2 and Assignment 3||David Wissel, Alexandra Butoi, and Anej Svete||Transliteration Slides|
|8||16.11.22||Assignment 4||Franz Nowak||Assignment 4 Slides Part 1|
|9||23.11.22||Assignment 4||Franz Nowak||Assignment 4 Slides Part 2|
|10||30.11.22||Assignment 5||Benjamin Dayan||Assignment 5 Slides|
|11||07.12.22||Assignment 6||Luca Malagutti||Assignment 6 Slides|
|12||14.12.22||Assignment 5||Benjamin Dayan|
|13||21.12.22||Assignment 6||Luca Malagutti|