Large Language Models, Spring 2024
ETH Zürich: Course catalog
Course Description
Large language models have become one of the most commonly deployed NLP inventions. In the past half-decade, their integration into core natural language processing tools has dramatically increased the performance of such tools, and they have entered the public discourse surrounding artificial intelligence. In this course, we offer a self-contained introduction to language modeling and its applications. We start with the probabilistic foundations of language models, i.e., covering what constitutes a language model from a formal, theoretical perspective. We then discuss how to construct and curate training corpora, and introduce many of the neural-network architectures often used to instantiate language models at scale. The course covers aspects of systems programming, discussion of privacy and harms, as well as applications of language models in NLP and beyond.
News
27. 12. 2023 Class website is online!
2. 3. 2024 Assignment 1 Submission Template released.
Syllabus and Schedule
On the Use of Class Time
Lectures
There are two lecture slots for LLM each week: the first one on Tuesdays 14-16 in HG E 3 and the second one on Fridays 10-11 in CAB G 61.
Both lectures will be given in person and live broadcast on Zoom; the password is available on the course Moodle page.
Lectures will be recorded—links to the Zoom recordings will be posted on the course Moodle page.
Discussion Sections
Discussion sections (tutorials) will take place Thursdays 16-18 in NO C 60 and on Zoom (same link as the lectures).
Syllabus
Disclaimer: The syllabus is based on the topics from Spring 2023 and is subject to change.
Tutorial Schedule
Week | Date | Topic | Teaching Assistant | Material |
---|---|---|---|---|
1 | 22. 2. 2024 | Course Logistics (1 hour) | Anej Svete | Introduction Slides |
2 | 29. 2. 2024 | Fundamentals of Natural Language Processing and Language Modeling, Measure Theory, Generation | Giovanni Acampa | Exercises, Exercises with solutions |
3 | 7. 3. 2024 | Classical Language Models: $n$-grams and Context-free Grammars | Vasiliki Xefteri | Exercises, Exercises with solutions |
4 | 14. 3. 2024 | RNN Language Models | Valentin Bieri | Exercises, Exercises with solutions |
5 | 21. 3. 2024 | Transformer Language Models | Josep Borrell Tatché | Exercises, Exercises with solutions, Jupyter Notebook |
6 | 28. 3. 2024 | Tokenization and Generation | Manuel de Prada Corral | Exercises, Exercises with solutions, Slides |
7 | 11. 4. 2024 | Assignment 1 Q&A | TAs | |
8 | 18. 4. 2024 | Common pre-trained language models, Parameter-efficient fine-tuning | Evžen Wybitul | Google Colab Notebook, Transformer Architecture Drawing |
9 | 25. 4. 2024 | Retrieval-augmented generation | Pep Borrell | Google Colab Notebook, Slides |
10 | 2. 5. 2024 | Prompting, Chain-of-Thought Reasoning | Filippo Ficarra | Exercises, Exercises with solutions |
11 | 9. 5. 2024 | No Tutorial | ||
12 | 16. 5. 2024 | Decoding and Watermarking | Iason Chalas | Exercises, Exercises with solutions |
13 | 23. 5. 2024 | Assignment 2 Q&A | TAs | |
14 | 30. 5. 2024 | Assignment 3 Q&A | TAs | |
Organisation
Live Chat
In addition to class time, there will also be a RocketChat-based live chat hosted on ETH’s servers.
Students are free to ask questions of the teaching staff and of others in public or private (direct message).
There are specific channels for each of the assignments as well as for reporting errata in the course notes and slides.
All data from the chat will be deleted from ETH servers at the course’s conclusion.
Important: There are a few points you should keep in mind about the course live chat:
- RocketChat will be the main communications hub for the course. You are responsible for receiving all messages broadcast in the RocketChat.
- Your username should be firstname.lastname. This is required as we will only allow enrolled students to participate in the chat and we will remove users whom we cannot validate.
- Tag your questions as described in the document on How to use Rycolab Course RocketChat channels. The document also contains other general remarks about the use of RocketChat.
- Search for answers in the appropriate channels before posting a new question.
- Ask questions on public channels as much as possible.
- Reply to posts in threads.
- The chat supports LaTeX for easier discussion of technical material. See How to use LaTeX in RocketChat.
- We highly recommend you download the desktop app here.
This is the link to the main channel.
To make the chat easier to moderate, we have created a number of other channels on RocketChat.
The full list is:
- General Channel for the general organisational discussions.
- Announcements Channel for the announcements by the teaching team.
- Content Questions for your questions about the content of the course.
- Errata for reporting typos and errors in the course lecture notes and the slides.
- Assignment 1 for asking questions and discussing the first assignment.
- Assignment 2a for asking questions and discussing Assignment 2a.
- Assignment 2b for asking questions and discussing Assignment 2b.
- Find Assignment Partners for finding teammates for the course assignments.
If you feel like you would benefit from any other channel, feel free to suggest it to the teaching team!
Course Notes
We prepared an extensive set of course notes for the course last semester.
We will be improving them as we go this semester as well.
Please report all errata to the teaching staff; we created an errata channel in RocketChat.
Links to the course notes:
- LLM Course Notes Part 1
- iPad Notes Part 1 (Anej)
- LLM Course Notes Part 2 (last year)
- LLM Course Notes Part 2 (up to date Overleaf link)
Other useful literature:
- iPad class notes (last year)
- Introduction to Natural Language Processing (Eisenstein)
- Deep Learning (Goodfellow, Bengio and Courville)
- AFLT Course Notes
Grading
Marks for the course will be determined by the following formula:
- 50% Final Exam
- 50% Assignments
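As a rough illustration of the formula above, the sketch below combines the two components with equal weights. It assumes both marks are already expressed on the same scale and that the assignments have been aggregated into a single mark; how the individual assignments are weighted against each other and how the final mark is rounded are not specified here, so the function name and inputs are hypothetical.

```python
def course_mark(exam_mark: float, assignments_mark: float) -> float:
    """Combine the two components with the 50/50 weights stated above.

    Assumption: both inputs are on the same scale, and `assignments_mark`
    is already a single aggregate over all assignments.
    """
    return 0.5 * exam_mark + 0.5 * assignments_mark


if __name__ == "__main__":
    # Hypothetical marks, for illustration only.
    print(course_mark(5.0, 5.5))  # 5.25
```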
On the Final Exam
The final exam is comprehensive and should be assumed to cover all the material in the slides and class notes.
On the Class Assignments
There will be two larger assignments in the course, the second of which will be split into two parts.
We require the solutions to be properly typeset.
We recommend using LaTeX (with Overleaf), but markdown files with something like MathJax for the mathematical expressions are also fine.
The first assignment will be of a more theoretical nature and will be released shortly after the start of the semester. Assignments 2a and 2b will be of a more practical nature and will be released in the second half of the semester.
Assignment instructions sheets:
- Assignment 1 Instructions
- Assignment 1 Submission Template.
While not strictly necessary, we highly advise you use this template when preparing your submission. It also includes a large number of LaTeX macros which can make your writing faster and easier to read.
Important: Even if you don’t use this template, you should copy the Declaration of originality from the front page into your own submission!
- Assignment 2a Instructions
- Assignment 2b Instructions
Assignment Deadlines
Assignment 1 is due on Tuesday, April 30th at 23:59. Assignment 2a is due on Sunday, June 30th at 23:59. Assignment 2b is due on Sunday, June 30th at 23:59.