Neural Networks and Computational Complexity

ETH Zürich: Fall 2024

Course Description

This Bachelor’s seminar examines modern large language models (LLMs), which have reshaped natural language processing, through a theoretical lens. As these models continue to evolve and affect a growing range of domains, we will study their potential, limitations, and underlying mechanisms. Throughout the seminar, we will address the following key questions: What can large language models actually do? What are their inherent limitations? How do these models function at a fundamental level? Under what circumstances are they likely to fail? Can we develop a comprehensive “science of LLMs” to answer these questions? We will use formal language theory as a rigorous framework for understanding the representational capacity of neural language models.

Time: Friday 14-16h

Location: CHN D 44

Additional Material

slides

Course Schedule (Work in Progress)

Week | Date     | Topic                                  | Presenter           | Reading
1    | 20.09.24 | Intro                                  |                     |
2    | 27.09.24 | Language Models & FLT                  | Only Lecture        |
3    | 04.10.24 | RNNs and FSAs                          | Mary, Jakob, Pierre | Svete et al. (2024), Svete et al. (2024)
4    | 11.10.24 | Counter Machines and the LSTM          | Tom, Simon, Julius  | Weiss et al. (2017)
5    | 18.10.24 | RNNs and Turing Machines               |                     | Nowak et al. (2023), Siegelmann and Sontag (1992)
6    | 25.10.24 | The Transformer                        |                     | Vaswani et al. (2017), Bahdanau et al. (2014)
7    | 01.11.24 | The Transformer is Turing Complete     |                     | Perez et al. (2017)
8    | 08.11.24 | The Transformer is Turing (In)Complete |                     | Hahn (2020)

Lecturer

Ryan Cotterell

Assistant Professor of Computer Science

ETH Zürich

Teaching Assistant