Neural Networks and Computational Complexity

ETH Zürich: Fall 2024

Course Description

This Bachelor’s seminar examines modern large language models (LLMs), which have reshaped natural language processing, through a theoretical lens. We will study their capabilities, limitations, and underlying mechanisms by addressing the following key questions: What are the real capabilities of large language models? What are their inherent limitations? How do these models function at a fundamental level? Under what circumstances are they likely to fail? Can we develop a comprehensive “science of LLMs” to answer these questions? We will use formal language theory as a rigorous framework for understanding the representational capacity of neural language models.

Time: Friday 14-16h

Location: CHN D 44

Additional Material

slides

IMPORTANT!

When you send an e-mail, please ALWAYS put “Bachelor’s Seminar” in the subject line!

Course Schedule (Work in Progress)

Week	Date	Topic	Presenter	Reading
1	20.09.24	Intro
2	27.09.24	Language Models & FLT	Only Lecture
3	4.10.24	RNNs and FSAs	Mary, Jakob, Pierre	Svete et al. (2024), Svete et al. (2024)
4	11.10.24	Counter Machines and the LSTM	Tom, Simon, Julius	Weiss et al. (2017)
5	18.10.24	RNNs and Turing Machines	Sasha, Ben, Torban	Nowak et al. (2023), Siegelmann and Sontag (1992)
6	25.10.24	The Transformer	Sarah, Alexander, Leon	Vaswani et al. (2017), Bahdanau et al. (2014)
7	1.11.24	The Transformer is Turing Complete		Perez et al. (2017)
8	8.11.24	EMNLP paper		Pasti et al. (2024)
9	15.11.24	No Lecture (EMNLP)
8	22.11.24	The Transformer is Turing (In)Complete	Mischa, Renne, Jesse	Hahn (2020)
10	29.11.24	The Transformer with Chain of Thought	Simon, Christian	Merrill and Sabharwal (2024)
11	6.12.24	Circuit Complexity of The Transformer (I)	Aurelian, Erdem	Strobl et al. (2024) (Survey)
12	13.12.24	Circuit Complexity of The Transformer (II)	Raphael	Hao et al. (2022)
12	20.12.24	What can Attention Learn?	Meri	Yau et al. (2024)

Lecturer

Ryan Cotterell

Assistant Professor of Computer Science

ETH Zürich

Teaching Assistant

Clemente Pasti

PhD Student

ETH Zürich