Jiaoda Li
Latest
A Transformer with Stack Attention
What Do Language Models Learn in Context? The Structured Task Hypothesis.
Probing via Prompting
Differentiable Subset Pruning of Transformer Heads