Language Models over Canonical Byte-Pair Encodings
Tim Vieira, Tianyu Liu, Clemente Pasti, Yahya Emara, Brian DuSell, Benjamin LeBrun, Mario Giulianelli, Juan Luis Gastaldi, John Terilla, Timothy J. O'Donnell, Ryan Cotterell
January 2025
Publication
Proceedings of the 42nd International Conference on Machine Learning