On the Intersection of Context-Free and Regular Languages


The Bar-Hillel construction is a classic result in formal language theory. It shows, by construction, that the intersection between a context-free language and a regular language is itself context-free. However, neither its original formulation (Bar-Hillel et al., 1961) nor its weighted extension (Nederhof and Satta, 2003) can handle automata with ϵ-arcs. In this short note, we generalize the Bar-Hillel construction to correctly compute the intersection even when the automaton contains ϵ-arcs. We further prove that our generalized construction leads to a grammar that encodes the structure of both the input automaton and grammar while retaining the asymptotic size of the original construction.

Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics