Department of Linguistics
University of Pennsylvania
A geometric framework and novel algorithm for Variational Learning
The framing of competing grammars within the individual speaker, argued for in Kroch (1989) and Santorini (1992), has been applied to language acquisition through the class of Variational Learning models, introduced for ranked-constraint spaces in the Stochastic OT model of Boersma (1997) and for parameter spaces in Yang (2002). Over the past two decades, these models have been criticized, and newer models have been built with significantly greater computational power (e.g., Fodor and Sakas 2004; Jarosz 2015; Nazarov and Jarosz 2017; Prickett et al. 2019), perhaps to degrees that are cognitively implausible.
I make two central claims. First, learning is still possible under the constraint of “bandit feedback”, in which the learner receives only a single bit of information from each input utterance, even in rich, realistic domains with probabilistic targets. Second, interpreting parameter spaces and ordered-constraint spaces geometrically allows for a deeper understanding of Variational Learners and for the definition of algorithms that apply across representational frameworks (here, parameters and ordered constraints). Both claims are supported by a novel learning algorithm that generalizes Yang (2002) and has a natural interpretation in terms of the center of mass of an entropy-maximizing probability distribution.
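To make the bandit-feedback setting concrete, the following is a minimal sketch of a baseline learner in the style of Yang (2002), not the novel algorithm proposed here. It assumes binary parameters, a linear reward-penalty update, and a hypothetical `parses` function supplied by the caller that returns the single bit of feedback (whether the sampled grammar analyzes the current input). The function names, learning rate, and toy target are illustrative assumptions.

```python
import random


def sample_grammar(weights, rng):
    """Sample one value (0 or 1) per binary parameter.

    weights[i] is the learner's current probability that parameter i is 1.
    """
    return [1 if rng.random() < w else 0 for w in weights]


def update(weights, grammar, success, rate):
    """Linear reward-penalty update driven by one bit of feedback.

    On success every weight moves toward the value that was sampled;
    on failure every weight moves toward the opposite value.
    """
    new_weights = []
    for w, g in zip(weights, grammar):
        target = g if success else 1 - g
        new_weights.append(w + rate * (target - w))
    return new_weights


def learn(parses, n_params, steps=5000, rate=0.02, seed=0):
    """Run the learner; `parses` (hypothetical) maps a grammar to True/False."""
    rng = random.Random(seed)
    weights = [0.5] * n_params  # start with no preference for either value
    for _ in range(steps):
        grammar = sample_grammar(weights, rng)
        success = parses(grammar)  # the only information the learner receives
        weights = update(weights, grammar, success, rate)
    return weights


if __name__ == "__main__":
    # Toy environment (an assumption): every input is analyzable only by [1, 0].
    target = [1, 0]
    print(learn(lambda g: g == target, n_params=2))
```

Because all parameters are rewarded or penalized together on the basis of one bit, the learner never observes which parameter was responsible for a failure; this is the restriction that the first claim above concerns.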