Studying the learning of lexical specificity in phonology requires a model of what is to be learned:
How is lexical arbitrariness represented in the phonological lexicon and grammar?
Are patterns with lexical conditioning even represented in the grammar, or in a separate (e.g. analogical) system?
Given that the talks in this symposium posit single-system probabilistic grammar models that handle the full continuum of regularity, I’ll focus on question 1 and on related challenges that arise for learning (see Pinker and colleagues on dual-system models of morphophonology).
A popular approach to lexical conditioning, but one that doesn’t seem to be represented in today’s talks, is to assume that all differences in the behavior of morphemes are represented as differences in the structure of their URs.
For example, the difference between penultimately stressed banana and initially stressed Canada in English can be encoded by prespecifying stress in one of the two URs (see most recently Moore-Cantwell 2015: diss.). Similarly, a final voiced obstruent that fails to undergo a general rule of final devoicing can be prespecified as voiced in its UR (e.g. Inkelas, Orgun and Zoll 1997 on Turkish).
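The prespecification idea can be made concrete with a toy sketch. Everything here is hypothetical illustration, not any published implementation: URs are ordinary segment strings, and a "!" diacritic stands in for a final obstruent prespecified as voiced, which the general final-devoicing rule then leaves alone.

```python
# Toy sketch: exceptionality encoded purely in URs.
# A general final-devoicing rule applies to any final voiced obstruent,
# but a UR whose final segment is prespecified as voiced (marked here
# with a "!" diacritic) resists it. All representations are hypothetical.

DEVOICE = {"b": "p", "d": "t", "g": "k"}

def surface(ur: str) -> str:
    """Apply final devoicing unless the final obstruent is prespecified."""
    if ur.endswith("!"):                  # prespecified [+voice] (toy diacritic)
        return ur[:-1]                    # strip the diacritic; no devoicing
    last = ur[-1]
    if last in DEVOICE:                   # regular final voiced obstruent
        return ur[:-1] + DEVOICE[last]
    return ur

print(surface("kanad"))    # regular UR: devoices
print(surface("etyd!"))    # prespecified UR: surfaces faithfully
```

The point of the sketch is that a single, exceptionless rule suffices once the arbitrariness is pushed into the representations.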
Even with this simple model of lexical influence on phonology, taking lexical specificity into account is not trivial:
Dresher and Kaye’s (1990) pioneering study of metrical parameter setting abstracts away from exceptions (they cause problems for their quantity-sensitivity (QS) trigger).
See Tesar’s work (2006: Cognitive Science et seq.) for proposals on how to find abstract URs that produce morpheme-specific behavior.
Most phonologists would agree that Answer 1 is not sufficient: at least some lexical arbitrariness needs to be handled by lexically listing allomorphs, for example English a/an, which has an idiosyncratic alternation (see recently Smith 2016: diss.).
A related method of achieving arbitrariness in alternation is to allow derived words to be stored whole (see e.g. Zuraw 2000: diss. on Tagalog).
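The listed-allomorph analysis can also be sketched in toy form. Here the grammar does no rewriting at all: the lexicon lists the allomorphs (e.g. a/an), and a markedness evaluation simply selects the best-fitting one. The constraint set and violation counts below are hypothetical simplifications chosen for illustration.

```python
# Toy sketch of listed-allomorph selection: the lexicon supplies the
# allomorph set, and toy markedness constraints (hypothetical) pick among
# them. No phonological rule derives one allomorph from the other.

VOWELS = set("aeiou")

def violations(allomorph: str, stem: str) -> int:
    v = 0
    if allomorph[-1] in VOWELS and stem[0] in VOWELS:
        v += 1   # *HIATUS: avoid a vowel-vowel sequence
    if allomorph[-1] not in VOWELS and stem[0] not in VOWELS:
        v += 1   # toy cluster penalty: avoid consonant-consonant sequence
    return v

def choose(stem: str, allomorphs=("a", "an")) -> str:
    """Select the listed allomorph with the fewest violations."""
    return min(allomorphs, key=lambda m: violations(m, stem))

print(choose("apple"))   # vowel-initial stem
print(choose("pear"))    # consonant-initial stem
```

Note that for real English a/an the choice is idiosyncratic in ways a general grammar does not predict (the point of citing Smith 2016), which is exactly why the allomorphs must be listed rather than derived.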
This immediately enlarges the learning space, in a way that’s usually not confronted: how does a learner know when to adopt the listed-allomorph analysis as opposed to a grammatical or single-UR encoding? Kager (2008) points out that there is considerable overlap between what underspecification and multiple URs can do, and the same is true w.r.t. lexically specific constraints.
The only approach I know of to multiple-UR learning assumes that URs are themselves rankable constraints (Smith 2016).
Many phonologists would agree that Answers 1 and 2 together are still not sufficient: at least some lexical arbitrariness is encoded in terms of lexically specific rules or constraints, or co-grammars (see e.g. Becker, Ketrez and Nevins 2011 on Turkish final voicing).
One approach to learning lexically specific constraints might be to follow Tesar and colleagues on UR learning in using inconsistency detection (see Pater 2010 for a sketch). However, this sort of approach has not yet been shown to be compatible with within-morpheme variation, another source of inconsistency. Better answers appear in Becker’s, Moore-Cantwell’s, and Shih’s talks today (see also Nazarov 2016: diss.).
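To make the inconsistency-detection idea concrete, here is a minimal Recursive-Constraint-Demotion-style consistency check over winner–loser comparisons, in the spirit of Tesar’s work (the encoding and constraint names are illustrative, not a reproduction of any specific proposal). A morpheme that must surface both faithfully and unfaithfully in the same context yields contradictory comparisons, which the check detects; a learner could take that contradiction as the trigger for positing a lexically specific constraint.

```python
# Minimal RCD-style consistency check (illustrative encoding).
# Each comparison is a pair (winner_preferrers, loser_preferrers),
# each a set of constraint names.

def consistent(ercs, constraints):
    remaining = list(ercs)
    unranked = set(constraints)
    while remaining:
        # Constraints preferring no remaining loser can form the next stratum.
        stratum = {c for c in unranked
                   if not any(c in losers for _, losers in remaining)}
        if not stratum:
            return False  # every constraint prefers some loser: no ranking works
        # Comparisons already explained by the new stratum are discarded.
        remaining = [(w, l) for w, l in remaining if not (w & stratum)]
        unranked -= stratum
    return True

CONS = {"*VoicedCoda", "Ident(voice)"}
# One morpheme devoices finally: *VoicedCoda must dominate Ident(voice).
regular = [({"*VoicedCoda"}, {"Ident(voice)"})]
# An exceptional morpheme keeps final voicing: the reverse ranking.
contradictory = regular + [({"Ident(voice)"}, {"*VoicedCoda"})]

print(consistent(regular, CONS))         # a ranking exists
print(consistent(contradictory, CONS))   # no single ranking works
```

On the contradictory data the learner’s next move, under the lexically-specific-constraint answer, would be to clone one constraint and index it to the exceptional morphemes.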
Learning with multiple overlapping approaches to lexical specificity
Gradient URs. In addition to Moore-Cantwell (2015, and her talk today), see Coetzee and Pater (2011: Handbook) on the empirical phenomenon of “lexically conditioned variation” and for an overview of (potential) approaches, and Smolensky and Goldrick (2016) for further arguments for gradient URs, along with a connectionist proposal. There are tricky and interesting issues w.r.t. the role of frequency; see Coetzee and Kawahara (2012: NLLT).
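One way to see how gradient lexical strength yields lexically conditioned variation is a toy MaxEnt (log-linear) computation. The weights and constraint names below are hypothetical: a general markedness and faithfulness weight are shared across the lexicon, while an item-specific “UR constraint” weight varies gradiently, producing a continuum of faithful-output rates rather than a categorical exceptional/regular split.

```python
# Toy MaxEnt sketch (hypothetical weights): an item-specific, gradient
# UR-constraint weight modulates the rate of a variable process.

import math

def maxent_probs(penalties):
    """MaxEnt/log-linear: p(candidate) is proportional to exp(-penalty)."""
    scores = {c: math.exp(-h) for c, h in penalties.items()}
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}

W_MARK = 2.0    # general markedness weight (hypothetical)
W_FAITH = 1.0   # general faithfulness weight (hypothetical)

def rate_of_faithful(w_ur):
    """w_ur: item-specific weight of a gradient UR constraint demanding the
    marked (faithful) output; larger values = stronger lexical protection."""
    penalties = {"faithful": W_MARK, "unfaithful": W_FAITH + w_ur}
    return maxent_probs(penalties)["faithful"]

for w in (0.0, 1.5, 3.0):
    print(w, round(rate_of_faithful(w), 3))
```

Because the item-specific weight is continuous, frequency effects of the kind Coetzee and Kawahara discuss can in principle be modeled by letting that weight covary with lexical frequency, though that move raises exactly the tricky issues flagged above.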
Learning and historical change. Do our models produce patterns of regularization that correspond to observed changes? How do they behave in Agent-Based Modeling approaches to change?