The Foundations of Tokenization: Statistical and Computational Concerns