Publications

(2025). A Close Analysis of the Subset Construction. Conference on Foundations of Software Technology and Theoretical Computer Science.

URL

(2025). Investigating Critical Period Effects in Language Acquisition through Neural Language Models. Transactions of the Association for Computational Linguistics.

URL

(2025). Taxonomy-Aware Evaluation of Vision–Language Models. Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR).

URL

(2025). Variational Best-of-$N$ Alignment. Proceedings of the 13th International Conference on Learning Representations.

URL

(2025). Unique Hard Attention: A Tale of Two Sides. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).

URL

(2025). Training Neural Networks as Recognizers of Formal Languages. Proceedings of the 13th International Conference on Learning Representations.

URL

(2025). The Harmonic Structure of Information Contours. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).

URL

(2025). The Foundations of Tokenization: Statistical and Computational Concerns. Proceedings of the 13th International Conference on Learning Representations.

URL

(2025). Syntactic Control of Language Models by Posterior Inference. Findings of the Association for Computational Linguistics: ACL 2025.

URL

(2025). Syntactic and Semantic Control of Large Language Models via Sequential Monte Carlo. Proceedings of the 13th International Conference on Learning Representations.

URL

(2025). Probability Distributions Computed by Hard-Attention Transformers.

(2025). Pointwise Mutual Information as a Performance Gauge for Retrieval-Augmented Generation. Proceedings of the 2025 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers).

URL

(2025). On the challenges and opportunities in generative AI. arXiv.

URL

(2025). Language Models over Canonical Byte-Pair Encodings. Proceedings of the 42nd International Conference on Machine Learning.

URL

(2025). Information Locality as an Inductive Bias for Neural Language Models. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).

URL

(2025). Incremental Alternative Sampling as a Lens into the Temporal and Representational Resolution of Linguistic Prediction. PsyArXiv.

URL

(2025). How Persuasive is Your Context?

URL

(2025). Gumbel Counterfactual Generation from Language Models. Proceedings of the 13th International Conference on Learning Representations.

URL

(2025). Fast Controlled Generation from Language Models with Adaptive Weighted Rejection Sampling.

URL

(2025). Controllable Context Sensitivity and the Knob Behind It. Proceedings of the 13th International Conference on Learning Representations.

URL

(2025). Can Language Models Learn Typologically Implausible Languages?

URL

(2025). Bigger is not always better: The importance of human-scale language modeling for psycholinguistics.

(2025). Better Estimation of the KL Divergence Between Language Models.

URL

(2025). Are Language Models Efficient Reasoners? A Perspective from Logic Programming. The Thirty-ninth Annual Conference on Neural Information Processing Systems.

URL

(2025). A Spatio-Temporal Point Process for Fine-Grained Modeling of Reading Behavior. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).

URL

(2025). A Practical Method for Generating String Counterfactuals. Proceedings of the 2025 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers).

URL

(2025). A Distributional Perspective on Word Learning in Neural Language Models. Proceedings of the 2025 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers).

URL

(2024). On Affine Homotopy between Language Encoders. Advances in Neural Information Processing Systems 37 (2024).

URL

(2024). Surprise! Uniform Information Density Isn't the Whole Story: Predicting Surprisal Contours in Long-form Discourse. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing.

URL

(2024). Reverse-Engineering the Reader. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing.

URL

(2024). On The Role of Context in Reading Time Prediction. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing.

URL

(2024). On the Proper Treatment of the Word in Computational Psycholinguistics. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing.

URL

(2024). Generalized Measures of Responsive and Anticipatory Language Processing. Findings of the Association for Computational Linguistics: EMNLP 2024.

URL

(2024). Efficiently Computing Susceptibility to Context in Language Models. Findings of the Association for Computational Linguistics: EMNLP 2024.

URL

(2024). Can Transformer Language Models Learn $n$-gram Language Models? Findings of the Association for Computational Linguistics: EMNLP 2024.

URL

(2024). An L* Algorithm for Deterministic Weighted Automata. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing.

(2024). Activation Scaling for Steering and Interpreting Language Models. Findings of the Association for Computational Linguistics: EMNLP 2024.

URL

(2024). A Probability–Quality Trade-off in Aligned Language Models and its Relation to Sampling Adaptors. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing.

URL

(2024). On Efficiently Representing Regular Languages as RNNs. Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).

URL

(2024). Representation Surgery: Theory and Practice of Affine Steering. Proceedings of the 41st International Conference on Machine Learning.

URL

(2024). Principled Gradient-Based MCMC for Conditional Sampling of Text. Proceedings of the 41st International Conference on Machine Learning.

URL

(2024). Do Language Models Exhibit the Same Cognitive Biases in Problem Solving as Human Learners? Proceedings of the 41st International Conference on Machine Learning.

URL

(2024). Transformers Can Represent n-gram Language Models. Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers).

URL

(2024). Correlation Does Not Imply Compensation: Complexity and Irregularity in the Lexicon. Proceedings of the Society for Computation in Linguistics 2024.

URL

(2024). Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns. The Twelfth International Conference on Learning Representations.

URL

(2024). PILA: A Historical-Linguistic Dataset of Proto-Italic and Latin. Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024).

URL

(2024). An information-theoretic analysis of targeted regressions during reading. Cognition.

URL

(2024). When is a Language Process a Language Model? Findings of the Association for Computational Linguistics: ACL 2024.

URL

(2024). What Languages are Easy to Language-Model? A Perspective from Learning Probabilistic Regular Languages. Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).

URL

(2024). What Do Language Models Learn in Context? The Structured Task Hypothesis. Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).

URL

(2024). Towards Explainability in Legal Outcome Prediction Models. Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers).

URL

(2024). The Role of $n$-gram Smoothing in the Age of Neural Networks. Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

URL

(2024). The Foundations of Tokenization: Statistical and Computational Concerns.

(2024). The Ethics of Automating Legal Actors. Transactions of the Association for Computational Linguistics.

PDF

(2024). On the Representational Capacity of Neural Language Models with Chain-of-Thought Reasoning. Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).

URL

(2024). NARRATE: Versatile Language Architecture for Optimal Control in Robotics. arXiv.

URL

(2024). Lower Bounds on the Expressivity of Recurrent Neural Language Models. Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers).

URL

(2024). Large-scale evidence for logarithmic effects of word predictability on reading time. Proceedings of the National Academy of Sciences of the United States of America.

(2024). Grammatical Gender's Influence on Distributional Semantics: A Causal Perspective. Transactions of the Association for Computational Linguistics.

URL

(2024). Direct Preference Optimization with an Offset. Findings of the Association for Computational Linguistics: ACL 2024.

URL

(2024). Context versus Prior Knowledge in Language Models. Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).

URL

(2024). A Transformer with Stack Attention. Findings of the Association for Computational Linguistics: NAACL 2024.

URL

(2023). The Ethics of Automating Legal Actors. Association for Computational Linguistics.

PDF

(2023). Structured Voronoi Sampling. Advances in Neural Information Processing Systems 36 (2024).

URL

(2023). Revisiting the Optimality of Word Lengths. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing.

URL

(2023). Recurrent Neural Language Models as Probabilistic Finite-state Automata. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing.

URL

(2023). Quantifying the redundancy between prosody and text. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing.

PDF

(2023). On the Representational Capacity of Recurrent Neural Language Models. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing.

URL

(2023). On the Optimality of Word Lengths. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing.

(2023). Linear-Time Modeling of Linguistic Structure: An Order-Theoretic Perspective. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing.

PDF

(2023). Language Model Quality Correlates with Psychometric Predictive Power in Multiple Languages. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing.

(2023). Findings of the BabyLM Challenge: Sample-Efficient Pretraining on Developmentally Plausible Corpora. Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning.

(2023). Efficient Algorithms for Recognizing Weighted Tree-Adjoining Languages. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing.

URL

(2023). An Exploration of Left-Corner Transformations. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing.

URL

(2023). Tokenization and the Noiseless Channel. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).

URL

(2023). Testing the Predictions of Surprisal Theory in 11 Languages. Transactions of the Association for Computational Linguistics.

URL

(2023). On the Efficacy of Sampling Adapters. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).

URL

(2023). On the Effect of Anticipation on Reading Times. Transactions of the Association for Computational Linguistics.

URL

(2023). Naturalistic Causal Probing for Morpho-Syntax. Transactions of the Association for Computational Linguistics.

URL

(2023). Log-Linear Guardedness and Its Implications. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).

URL

(2023). Locally Typical Sampling. Transactions of the Association for Computational Linguistics.

URL

(2023). Hexatagging: Projective Dependency Parsing as Tagging. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers).

URL

(2023). Generalizing Backpropagation for Gradient-Based Interpretability. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).

URL

(2023). Efficient Semiring-Weighted Earley Parsing. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).

URL

(2023). Discourse-Centric Evaluation of Document-level Machine Translation with a New Densely Annotated Parallel Corpus of Novels. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).

URL

(2023). Convergence and Diversity in the Control Hierarchy. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).

URL

(2023). An Ordinal Latent Variable Model of Conflict Intensity. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).

PDF

(2023). A Measure-theoretic Characterization of Tight Language Models. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).

URL

(2023). A Formal Perspective on Byte-Pair Encoding. Findings of the Association for Computational Linguistics: ACL 2023.

URL

(2023). A Fast Algorithm for Computing Prefix Probabilities. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers).

URL

(2023). Sentiment as an Ordinal Latent Variable. Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics.

URL

(2023). On the Usefulness of Embeddings, Clusters and Strings for Text Generation Evaluation. Proceedings of the 11th International Conference on Learning Representations.

URL

(2023). On the Intersection of Context-Free and Regular Languages. Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics.

URL

(2023). The Ordered Matrix Dirichlet for State-Space Models. Proceedings of the 26th International Conference on Artificial Intelligence and Statistics.

URL

(2023). Quantifying Gender Bias Towards Politicians in Cross-Lingual Language Models. PLOS One.

URL

(2023). On the Role of Negative Precedent in Legal Outcome Prediction. Transactions of the Association for Computational Linguistics.

URL

(2023). LEACE: Perfect linear concept erasure in closed form. Advances in Neural Information Processing Systems 36 (2024).

URL

(2023). Controlled Text Generation with Natural Language Instructions. Proceedings of the 40th International Conference on Machine Learning.

URL

(2023). A Latent-Variable Model for Intrinsic Probing. Proceedings of the 37th AAAI Conference on Artificial Intelligence.

URL

(2023). A Cross-Linguistic Pressure for Uniform Information Density in Word Order. Transactions of the Association for Computational Linguistics.

URL

(2022). The Architectural Bottleneck Principle. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing.

URL

(2022). On Parsing as Tagging. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing.

URL

(2022). Mutual Information and Hallucinations in Abstractive Summarization. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing.

URL

(2022). Kernelized Concept Erasure. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing.

URL

(2022). Autoregressive Structure Prediction with Language Models. Findings of the Association for Computational Linguistics: EMNLP 2022.

URL

(2022). Algorithms for Weighted Pushdown Automata. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing.

URL

(2022). Algorithms for Weighted Finite-State Automata with Failure Arcs. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing.

URL

(2022). Equivariant Transduction through Invariant Alignment. Proceedings of the 29th International Conference on Computational Linguistics.

URL

(2022). Benchmarking Compositionality with Formal Languages. Proceedings of the 29th International Conference on Computational Linguistics.

URL

(2022). The SIGTYP 2022 Shared Task on the Prediction of Cognate Reflexes. Proceedings of the 4th Workshop on Research in Computational Linguistic Typology and Multilingual NLP.

(2022). The SIGMORPHON 2022 Shared Task on Morpheme Segmentation. Proceedings of the 19th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology.

(2022). SIGMORPHON–UniMorph 2022 Shared Task 0: Generalization and Typologically Diverse Morphological Inflection. Proceedings of the 19th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology.

(2022). Same Neurons, Different Languages: Probing Morphosyntax in Multilingual Pre-trained Models. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

URL

(2022). Probing via Prompting. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

URL

(2022). On the Machine Learning of Ethical Judgments from Natural Language. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

URL

(2022). Linear Adversarial Concept Erasure. Proceedings of the 39th International Conference on Machine Learning.

URL

(2022). Exact Paired-Permutation Testing for Structured Test Statistics. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

URL

(2022). BlonDe: An Automatic Evaluation Metric for Document-level Machine Translation. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

URL

(2022). A Structured Span Selector. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

URL

(2022). Probing for the Usage of Grammatical Number. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).

URL

(2022). Probing as Quantifying the Inductive Bias of Pre-trained Representations. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).

URL

(2022). On the probability-quality paradox in language generation. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).

URL

(2022). Estimating the Entropy of Linguistic Distributions. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).

URL

(2022). Analyzing Wrap-Up Effects through an Information-Theoretic Lens. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).

URL

(2022). Visual Comparison of Language Model Adaptation. IEEE Visualization.

URL

(2022). State-of-the-art generalisation research in NLP: a taxonomy and review. arXiv.

URL

(2022). On the Intersection of Context-Free and Regular Languages. arXiv.

PDF URL

(2022). On Decoding Strategies for Neural Text Generators. Transactions of the Association for Computational Linguistics.

URL

(2022). Cluster-based Evaluation of Automatically Generated Text. arXiv.

PDF URL

(2022). An Ordinal Latent Variable Model of Conflict Intensity. arXiv.

PDF URL

(2021). Text or Topology? Classifying Ally-Enemy Pairs in Militarised Conflict. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing.

(2021). Searching for More Efficient Dynamic Programs. Findings of the Association for Computational Linguistics: EMNLP 2021.

URL

(2021). Revisiting the Uniform Information Density Hypothesis. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing.

URL

(2021). Phone-level Uniform Information Density across and within Languages. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing.

(2021). On Homophony and Rényi Entropy. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing.

URL

(2021). Keyword2Text: A Plug-and-Play Method for Controlled Text Generation. Findings of the Association for Computational Linguistics: EMNLP 2021.

(2021). Equivariant Transduction through Invariant Alignment. Findings of the Association for Computational Linguistics: EMNLP 2021.

(2021). Efficient Sampling of Dependency Structure. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing.

URL

(2021). Conditional Poisson Stochastic Beams. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing.

(2021). Conditional Poisson Stochastic Beam Search. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing.

URL

(2021). Classifying Dyads for Militarized Conflict Analysis. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing.

URL

(2021). Adjusting the Conflict-Cooperation Scale for Armed Conflict Assessment. Findings of the Association for Computational Linguistics: EMNLP 2021.

(2021). A surprisal–duration trade-off across and within the world’s languages. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing.

URL

(2021). A Plug-and-Play Method for Controlled Text Generation. Findings of the Association for Computational Linguistics: EMNLP 2021.

URL

(2021). A Bayesian Framework for Information-Theoretic Probing. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing.

URL

(2021). SIGMORPHON 2021 Shared Task on Morphological Reinflection: Generalization Across Languages. Proceedings of the 18th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology.

(2021). On Finding the $K$-best Non-projective Dependency Trees. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing (Volume 1: Long Papers).

URL

(2021). Modeling the Unigram Distribution. Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021.

URL

(2021). Language Model Evaluation Beyond Perplexity. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing (Volume 1: Long Papers).

URL

(2021). Is Sparse Attention more Interpretable? Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing (Volume 2: Short Papers).

URL

(2021). Higher-order Derivatives of Weighted Finite-state Machines. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing (Volume 2: Short Papers).

URL

(2021). Examining the Inductive Bias of Neural Language Models with Artificial Languages. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing (Volume 1: Long Papers).

URL

(2021). Determinantal Beam Search. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing (Volume 1: Long Papers).

URL

(2021). A cognitive regularizer for language modeling. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing (Volume 1: Long Papers).

URL

(2021). What About the Precedent: An Information-Theoretic Analysis of Common Law. Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

URL

(2021). SIGTYP 2021 Shared Task: Robust Spoken Language Identification. Proceedings of the Third Workshop on Computational Typology and Multilingual NLP.

(2021). How (Non-)Optimal is the Lexicon? Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

URL

(2021). Finding Concept-specific Biases in Form–Meaning Associations. Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

URL

(2021). Do Syntactic Probes Probe Syntax? Experiments with Jabberwocky Probing. Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

URL

(2021). A Non-Linear Structural Probe. Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

URL

(2021). Searching for Search Errors in Neural Morphological Inflection. Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics.

URL

(2021). Disambiguatory signals are stronger in word initial positions. Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics.

URL

(2021). Applying the Transformer to Character-level Transduction. Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics.

URL

(2021). Parameter Space Factorization for Zero-Shot Learning across Tasks and Languages. Transactions of the Association for Computational Linguistics.

URL

(2021). On the Relationships Between the Grammatical Genders of Inanimate Nouns and Their Co-Occurring Adjectives and Verbs. Transactions of the Association for Computational Linguistics.

URL

(2021). Multimodal Pretraining Unmasked: A Meta-Analysis and a Unified Framework of Vision-and-Language BERTs. Transactions of the Association for Computational Linguistics.

URL

(2021). Efficient Computation of Expectations under Spanning Tree Distributions. Transactions of the Association for Computational Linguistics.

URL

(2021). Differentiable Subset Pruning of Transformer Heads. Transactions of the Association for Computational Linguistics.

URL

(2021). A Word on Machine Ethics: A Response to Jiang et al. (2021).

PDF

(2020). Morphologically Aware Word-Level Translation. Proceedings of the 28th International Conference on Computational Linguistics.

URL

(2020). Speakers Fill Semantic Gaps with Context. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing.

URL

(2020). SIGTYP 2020 Shared Task: Prediction of Typological Features. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing.

URL

(2020). Please Mind the Root: Decoding Arborescences for Dependency Parsing. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing.

URL

(2020). Pareto Probing: Trading Off Accuracy for Simplicity. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing.

URL

(2020). Measuring the Similarity of Grammatical Gender Systems by Comparing Partitions. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing.

(2020). Investigating Cross-Linguistic Adjective Ordering Tendencies with a Latent-Variable Model. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing.

URL

(2020). Intrinsic Probing through Dimension Selection. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing.

URL

(2020). If Beam Search is the Answer, What was the Question? Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing.

URL

(2020). Finding Concept-specific Biases in Form–Meaning Associations. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing.

(2020). Exploring the Linear Subspace Hypothesis in Gender Bias Mitigation. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing.

PDF

(2020). The Paradigm Discovery Problem. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.

URL

(2020). SIGMORPHON 2020 Shared Task 0: Typologically Diverse Morphological Inflection. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.

URL

(2020). Predicting Declension Class from Form and Meaning. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.

URL

(2020). Metaphor Detection Using Context and Concreteness. Proceedings of the Second Workshop on Figurative Language Processing.

(2020). It's Easier to Translate out of English than into it: Measuring Neural Translation Difficulty by Cross-Mutual Information. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.

URL

(2020). Information-Theoretic Probing for Linguistic Structure. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.

URL

(2020). Generalized Entropy Regularization or: There's Nothing Special about Label Smoothing. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.

URL

(2020). A Tale of a Probe and a Parser. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.

URL

(2020). A Corpus for Large-Scale Phonetic Typology. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.

URL

(2020). UniMorph 3.0: Universal Morphology. Proceedings of the Twelfth International Conference on Language Resources and Evaluation.

(2020). SIGMORPHON 2020 Task 0 System Description: ETH Zürich Team. Proceedings of the 17th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology.

(2020). Phonotactic Complexity and its Trade-offs. Transactions of the Association for Computational Linguistics.

URL

(2020). Parameter Space Factorization for Zero-Shot Learning across Tasks and Languages. Transactions of the Association for Computational Linguistics.

(2020). On the Relationships Between the Grammatical Genders of Inanimate Nouns and Their Co-Occurring Adjectives and Verbs. Transactions of the Association for Computational Linguistics.

URL

(2020). Efficient Computation of Expectations under Spanning Tree Distributions. Transactions of the Association for Computational Linguistics.

URL

(2020). Best-First Beam Search. Transactions of the Association for Computational Linguistics.

URL

(2019). Towards Zero-Shot Language Modeling. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing.

URL

(2019). Quantifying the Semantic Core of Gender Systems. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing.

URL

(2019). It’s All in the Name: Mitigating Gender Bias with Name-Based Counterfactual Data Substitution. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing.

URL

(2019). Examining Gender Bias in Languages with Grammatical Gender. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing.

URL

(2019). Don’t Forget the Long Tail! A Comprehensive Analysis of Morphological Generalization in Bilingual Lexicon Induction. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing.

URL

(2019). The SIGMORPHON 2019 Shared Task: Morphological Analysis in Context and Cross-Lingual Transfer for Inflection. Proceedings of the 16th Workshop on Computational Research in Phonetics, Phonology, and Morphology.

URL

(2019). What Kind of Language Is Hard to Language-Model? Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics.

URL

(2019). Unsupervised Discovery of Gendered Language through Latent-Variable Modeling. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics.

URL

(2019). Uncovering Typological Implications with Belief Nets. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics.

URL

(2019). On the distribution of deep clausal embeddings: A large cross-linguistic study. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics.

(2019). Measuring Morphological Irregularity. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics.

URL

(2019). Meaning to Form: Measuring Systematicity as Information. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics.

URL

(2019). Exact Hard Monotonic Attention for Character-Level Transduction. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics.

URL

(2019). Counterfactual Data Augmentation for Mitigating Gender Bias in Languages with Rich Morphology. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics.

URL

(2019). On the Idiosyncrasies of the Mandarin Chinese Classifier System. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

URL

(2019). Gender Bias in Contextualized Word Embeddings. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

URL

(2019). Contextualization of Morphological Inflection. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

URL

(2019). Combining Sentiment Lexica with a Multi-View Variational Autoencoder. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

URL

(2019). A Simple Joint Model for Improved Contextual Neural Lemmatization. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

URL

(2019). A Probabilistic Generative Model of Linguistic Typology. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

URL

(2019). On the Complexity and Typology of Inflectional Morphological Systems. Transactions of the Association for Computational Linguistics.

URL

(2019). Recurrent Neural Networks in Linguistic Theory: Revisiting Pinker and Prince (1988) and the Past Tense Debate. Transactions of the Association for Computational Linguistics.

PDF

(2018). The CoNLL–SIGMORPHON 2018 Shared Task: Universal Morphological Reinflection. Proceedings of the CoNLL–SIGMORPHON 2018 Shared Task: Universal Morphological Reinflection.

URL

(2018). Marrying Universal Dependencies and Universal Morphology. Proceedings of the Second Workshop on Universal Dependencies.

URL

(2018). Hard Non-Monotonic Attention for Character-Level Transduction. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing.

URL

(2018). Generalizing Procrustes Analysis for Better Bilingual Dictionary Induction. Proceedings of the 22nd Conference on Computational Natural Language Learning.

URL

(2018). A Discriminative Latent-Variable Model for Bilingual Lexicon Induction. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing.

URL

(2018). A Structured Variational Autoencoder for Contextual Morphological Inflection. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics.

URL

(2018). Unsupervised Disambiguation of Syncretism in Inflected Lexicons. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

URL

(2018). Are All Languages Equally Hard to Language-Model? Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

URL

(2018). A Deep Generative Model of Vowel Formant Typology. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

URL

(2018). UniMorph 2.0: Universal Morphology. Proceedings of the Eleventh International Conference on Language Resources and Evaluation.

URL

(2018). Recurrent Neural Networks in Linguistic Theory: Revisiting Pinker and Prince (1988) and the Past Tense Debate. Transactions of the Association for Computational Linguistics.

URL

(2018). On the Diachronic Stability of Irregularity in Inflectional Morphology. arXiv preprint arXiv:1804.08262.

URL

(2018). Joint Semantic Synthesis and Morphological Analysis of the Derived Word. Transactions of the Association for Computational Linguistics.

URL

(2018). Explaining and Generalizing Back-Translation through Wake-Sleep. arXiv preprint arXiv:1806.04402.

URL

(2017). Low-Resource Named Entity Recognition with Cross-lingual, Character-Level Neural Conditional Random Fields. Proceedings of the Eighth International Joint Conference on Natural Language Processing.

(2017). Paradigm Completion for Derivational Morphology. Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing.

(2017). Cross-lingual, Character-Level Neural Morphological Tagging. Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing.

URL

(2017). Probabilistic Typology: Deep Generative Models of Vowel Inventories. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics.

URL

(2017). One-Shot Neural Cross-Lingual Transfer for Paradigm Completion. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics.

URL

(2017). Frame-Based Continuous Lexical Semantics through Exponential Family Tensor Factorization and Semantic Proto-Roles. Proceedings of the 6th Joint Conference on Lexical and Computational Semantics.

(2017). CoNLL–SIGMORPHON 2017 Shared Task: Universal Morphological Reinflection in 52 Languages. Proceedings of the CoNLL SIGMORPHON 2017 Shared Task: Universal Morphological Reinflection.

(2017). Neural Multi-Source Morphological Reinflection. Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics.

URL

(2017). Neural Graphical Models over Strings for Principal Parts Morphological Paradigm Completion. Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics.

(2017). Morphological Analysis of the Dravidian Language Family. Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics.

(2017). Explaining and Generalizing Skip-Gram through Exponential Family Principal Component Analysis. Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics.

(2017). Context-Aware Prediction of Derivational Word-forms. Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics.

(2017). A Rich Morphological Tagger for English: Exploring the Cross-Linguistic Tradeoff Between Morphology and Syntax. Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics.

(2016). Speed-Accuracy Tradeoffs in Tagging with Variable-Order CRFs and Structured Sparsity. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing.

(2016). Neural Morphological Analysis: Encoding-Decoding Canonical Segments. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing.

(2016). Morphological Segmentation Inside-Out. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing.

URL

(2016). The SIGMORPHON 2016 Shared Task—Morphological Reinflection. Proceedings of the 14th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology.

(2016). Morphological Smoothing and Extrapolation of Word Embeddings. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics.

(2016). Weighting Finite-State Transductions With Neural Context. Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

(2016). A Joint Model of Orthography and Morphological Segmentation. Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

(2016). Contrastive Morphological Typology and Logical Hierarchies. Proceedings of the 52nd Annual Meeting of the Chicago Linguistic Society.

(2016). Analysis of Morphology in Topic Modeling. arXiv preprint arXiv:1608.03995.

(2015). Joint Lemmatization and Morphological Tagging with Lemming. Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing.

(2015). Dual Decomposition Inference for Graphical Models over Strings. Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing.

(2015). Penalized Expectation Propagation for Graphical Models over Strings. Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

(2015). Morphological Word Embeddings. Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

URL

(2015). Labeled Morphological Segmentation with Semi-Markov Models. Proceedings of the Nineteenth Conference on Computational Natural Language Learning.

(2015). Modeling Word Forms Using Latent Underlying Morphs and Phonology. Transactions of the Association for Computational Linguistics.

(2014). Stochastic Contextual Edit Distance and Probabilistic FSTs. Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics.

(2014). Translation of the CALLHOME Egyptian Arabic Corpus For Conversational Speech Translation. Proceedings of the 14th International Conference on Spoken Language Translation.

(2014). An Algerian Arabic-French Code-Switched Corpus. Proceedings of the First Workshop on Free/Open-Source Arabic Corpora and Corpora Processing Tools.

(2014). A Multi-Dialect, Multi-Genre Corpus of Informal Written Arabic. Proceedings of the Ninth International Conference on Language Resources and Evaluation.