Neural Networks and Computational Complexity

ETH Zürich: Fall 2024

Course Description

This Bachelor’s seminar explores modern large language models (LLMs), which have transformed natural language processing. As these models continue to evolve and shape an ever-wider range of domains, we will examine their potential, limitations, and underlying mechanisms through a theoretical lens. Throughout the seminar, we will address the following key questions: What are the real capabilities of large language models? What are their inherent limitations? How do these models function at a fundamental level? Under what circumstances are they likely to fail? Can we develop a comprehensive “science of LLMs” to address these questions? We will leverage formal language theory to provide a rigorous framework for understanding the representational capacity of neural language models.
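To give a flavour of the kind of constructions the seminar studies (see, e.g., the week on RNNs and FSAs), the sketch below shows how a deterministic finite-state automaton can be encoded exactly in a recurrent network with threshold activations, in the spirit of the classic Minsky-style construction. This is an illustrative example, not course material: the particular DFA (strings over {a, b} with an even number of a’s), the weight matrices, and the helper run are all assumptions made for this sketch.

    import numpy as np

    # Illustrative DFA (an assumption for this sketch):
    # state 0 = "even number of a's" (start, accepting), state 1 = "odd".
    # Symbol 0 = 'a' (toggles the state), symbol 1 = 'b' (keeps it).
    delta = {(0, 0): 1, (0, 1): 0, (1, 0): 0, (1, 1): 1}
    n_states, n_symbols = 2, 2
    pairs = list(delta)  # one hidden unit per (state, symbol) pair

    def step(v):
        # Heaviside threshold activation
        return (v > 0).astype(float)

    # First layer: unit p fires iff its (state, symbol) pair is active (AND gate).
    U_h = np.zeros((len(pairs), n_states))
    U_x = np.zeros((len(pairs), n_symbols))
    # Second layer: OR over all pairs whose transition leads to each next state.
    V = np.zeros((n_states, len(pairs)))
    for p, (q, s) in enumerate(pairs):
        U_h[p, q] = 1.0
        U_x[p, s] = 1.0
        V[delta[q, s], p] = 1.0

    def run(word):
        """Run the threshold RNN on a string over {a, b}; return acceptance."""
        h = np.eye(n_states)[0]  # one-hot encoding of the start state
        for ch in word:
            x = np.eye(n_symbols)["ab".index(ch)]
            z = step(U_h @ h + U_x @ x - 1.5)  # AND: state and symbol both match
            h = step(V @ z - 0.5)              # OR: any matching pair fires
        return bool(h[0])  # accept iff we end in the "even" state

    assert run("") and run("aa") and run("abba")      # even number of a's
    assert not run("a") and not run("ab")             # odd number of a's

Each time step performs exactly one transition of the automaton via a two-layer AND/OR update; the seminar’s readings examine when and how such exact simulations carry over to modern, finite-precision architectures.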

Time: Friday 14-16h

Location: CHN D 44

Additional Material

slides

IMPORTANT!

When you send an e-mail, please ALWAYS put “Bachelor’s Seminar” in the subject line!

Course Schedule (Work in Progress)

Week | Date     | Topic                                      | Presenter(s)           | Reading
1    | 20.09.24 | Intro                                      |                        |
2    | 27.09.24 | Language Models & FLT                      | Lecture only           |
3    | 04.10.24 | RNNs and FSAs                              | Mary, Jakob, Pierre    | Svete et al. (2024a), Svete et al. (2024b)
4    | 11.10.24 | Counter Machines and the LSTM              | Tom, Simon, Julius     | Weiss et al. (2018)
5    | 18.10.24 | RNNs and Turing Machines                   | Sasha, Ben, Torban     | Nowak et al. (2023), Siegelmann and Sontag (1992)
6    | 25.10.24 | The Transformer                            | Sarah, Alexander, Leon | Vaswani et al. (2017), Bahdanau et al. (2014)
7    | 01.11.24 | The Transformer is Turing Complete         |                        | Pérez et al. (2021)
8    | 08.11.24 | EMNLP paper                                |                        | Pasti et al. (2024)
9    | 15.11.24 | No Lecture (EMNLP)                         |                        |
10   | 22.11.24 | The Transformer is Turing (In)Complete     | Mischa, Renne, Jesse   | Hahn (2020)
11   | 29.11.24 | The Transformer with Chain of Thought      | Simon, Christian       | Merrill and Sabharwal (2024)
12   | 06.12.24 | Circuit Complexity of the Transformer (I)  | Aurelian, Erdem        | Strobl et al. (2024) (survey)
13   | 13.12.24 | Circuit Complexity of the Transformer (II) | Raphael                | Hao et al. (2022)
14   | 20.12.24 | What Can Attention Learn?                  | Meri                   | Yau et al. (2024)

Lecturer

Ryan Cotterell

Assistant Professor of Computer Science

ETH Zürich

Teaching Assistant