Generating Text from Language Models
ACL 2023
Tutorial Description
An increasingly large percentage of natural language processing (NLP) tasks center around the generation of text from probabilistic language models. Despite this trend, techniques for improving or specifying preferences in these generated texts rely mostly on intuition-based heuristics, and there is no unified presentation of their motivations, practical implementations, successes, and pitfalls. Practitioners must therefore choose somewhat blindly between generation algorithms, such as top-p sampling or beam search, which can lead to wildly different results. At the same time, language generation research continues to criticize and improve the standard toolboxes, further adding entropy to the state of the field. In this tutorial, we will provide a centralized and cohesive discussion of the critical considerations when choosing how to generate from a language model. We will cover a wide range of empirically observed problems (such as degradation, hallucination, and repetition) and the algorithmic solutions proposed for them in recent research (such as top-p sampling and its successors). We will then discuss a subset of these algorithms under a unified light: most stochastic generation strategies can be framed as locally adapting the probabilities of a model to avoid failure cases. Finally, we will cover methods in controlled generation, which go beyond ensuring coherence to make the generated text exhibit specific desired properties. We aim for NLP practitioners and researchers to leave our tutorial with a unified framework that they can use to evaluate and contribute to the latest research in language generation.
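As a concrete illustration of the "locally adapting probabilities" view, the minimal sketch below implements nucleus (top-p) sampling (Holtzman et al., 2020) over a toy next-token distribution: the candidate set is truncated to the smallest high-probability prefix whose mass reaches p, then renormalized before sampling. The vocabulary and probability values are invented for illustration and are not taken from the tutorial materials.

```python
import numpy as np

def top_p_sample(probs, p=0.9, rng=None):
    """Nucleus (top-p) sampling: keep the smallest set of tokens whose
    cumulative probability reaches p, renormalize, and sample from it."""
    if rng is None:
        rng = np.random.default_rng()
    order = np.argsort(probs)[::-1]              # token indices, most probable first
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, p) + 1  # smallest prefix with mass >= p
    keep = order[:cutoff]
    truncated = probs[keep] / probs[keep].sum()  # locally renormalized distribution
    return rng.choice(keep, p=truncated)         # returns the sampled token index

# Toy next-token distribution over a six-word vocabulary (hypothetical values).
vocab = ["the", "cat", "sat", "on", "mat", "zyzzyva"]
probs = np.array([0.40, 0.25, 0.15, 0.10, 0.07, 0.03])

print(vocab[top_p_sample(probs, p=0.9)])  # low-probability tail ("zyzzyva") is never sampled
```

The same truncate-and-renormalize skeleton underlies many of the sampling methods discussed in Module 4; they differ mainly in how the kept set is chosen.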
Slides
The slide deck will be continually updated.
Syllabus
Module | Topic | Materials
---|---|---
Module 1 | Probability Distributions Over Strings | 
Module 2 | Successes and Failures of Estimating Language Models | 
Module 3 | Basics of Generating Text | Colab Notebook
Module 4 | Sampling Methods | Colab Notebook
Module 5 | Controlled Generation | Colab Notebook
Module 6 | Evaluating Language Generators | Colab Notebook
Module 7 | Looking Forward... | 
Suggested Literature
Probability Distributions Over Strings
- Formal Aspects of Language Modeling (Cotterell et al., 2022)
- A Measure-Theoretic Characterization of Tight Language Models (Du et al., 2022)
- Calibration, Entropy Rates, and Memory in Language Models (Braverman et al., 2019)
- A Mathematical Theory of Communication (Shannon, 1948)
- Elements of Information Theory (Cover and Thomas, 2006)
Successes and Failures of Estimating Language Models
- Frequency Effects on Syntactic Rule Learning in Transformers (Wei et al., 2021)
Sampling Methods
- Hierarchical Neural Story Generation (Fan et al., 2018)
- The Curious Case of Neural Text Degeneration (Holtzman et al., 2020)
- Mirostat: A Neural Text Decoding Algorithm that Directly Controls Perplexity (Basu et al., 2021)
- Locally Typical Sampling (Meister et al., 2023)
- Truncation Sampling as Language Model Desmoothing (Hewitt et al., 2022)
- Contrastive Decoding: Open-ended Text Generation as Optimization (Li et al., 2022)
- Trading Off Diversity and Quality in Natural Language Generation (Zhang et al., 2020)
Controlled Generation
- DExperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts (Liu et al., 2021)
- Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP (Schick et al., 2021)
- FUDGE: Controlled Text Generation With Future Discriminators (Yang and Klein, 2021)
- Gradient-based Constrained Sampling from Language Models (Kumar et al., 2022)
- Structured Voronoi Sampling (Amini et al., 2023)
Evaluating Language Generators
- Bleu: a Method for Automatic Evaluation of Machine Translation (Papineni et al., 2002)
- BERTScore: Evaluating Text Generation with BERT (Zhang et al., 2020)
- MAUVE: Measuring the Gap Between Neural Text and Human Text using Divergence Frontiers (Pillutla et al., 2021)
- On the Usefulness of Embeddings, Clusters and Strings for Text Generation Evaluation (Pimentel et al., 2023)