Bolivian/Peruvian Quechua (Quechuan)
Quechua is one of the cases discussed in the paper, and the one that most clearly highlights the problem of the nature of the learning data. As described in the paper, the simulations on word corpora fail in various ways to arrive at the posited (uncontroversial) inventory of affricates. The reason is that the phonotactics and morphology of Quechua conspire to inflate the frequencies of certain clusters, diluting the frequencies of affricates. Reasonable-looking results emerge only when the learner is trained on more abstract data: roots or a morpheme list. Quechua is also a case where we tried to decompose aspirated and ejective plosives into more primitive parts. The learner does not find these segments when trained on words.
In addition to various kinds of "one-word-per-line" datasets, we trained the learner on a corpus of connected, child-directed speech. This corpus is from a different dialect, Peruvian Quechua, but it is sufficiently close to Bolivian Quechua to draw some conclusions. The learner does not over-unify implausible clusters in this simulation, but it also does not find the aspirated affricate: its inseparability value trails those of clusters that are frequent in common affixes, so the learner is unlikely to find it in such data.
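To make the frequency argument concrete, the following is a minimal sketch of how a cluster's score can depend on how often its parts occur elsewhere. It is an illustration under stated assumptions, not the learner's actual code: the scoring formula (the product of the two conditional frequencies of a pair given each of its members), the function names `parse` and `pair_scores`, and the input format (one item per line, segments separated by spaces) are all assumptions made for this example. Consult the paper for the definition of inseparability actually used.

```python
from collections import Counter


def parse(lines):
    """Split each line into a list of segments (assumed space-separated)."""
    return [line.split() for line in lines if line.strip()]


def pair_scores(words):
    """Score every adjacent segment pair in the data.

    Assumed illustrative measure: (count(ab) / count(a)) * (count(ab) / count(b)).
    A pair scores high only if both members occur mostly inside that pair.
    No consonant/vowel filtering is done here; all adjacent pairs are scored.
    """
    seg_counts = Counter()
    pair_counts = Counter()
    for segs in words:
        seg_counts.update(segs)
        pair_counts.update(zip(segs, segs[1:]))
    return {
        (a, b): (n / seg_counts[a]) * (n / seg_counts[b])
        for (a, b), n in pair_counts.items()
    }


# Toy data, invented for illustration only (not Quechua forms).
roots = parse(["t ʃ a", "t ʃ i", "s a"])
root_scores = pair_scores(roots)

# Adding items in which "t" occurs outside "t ʃ" lowers the score of (t, ʃ),
# which is the kind of dilution described in the text above.
words = roots + parse(["n t a", "n t i"])
word_scores = pair_scores(words)
```

In this toy example, the score of `("t", "ʃ")` is lower in `word_scores` than in `root_scores`, because the added items raise the count of `t` without raising the count of the pair.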
Simulation data at a glance
Simulation name | Initial state learning data | Initial state features |
---|---|---|
Words Narrow | LearningData.txt | Features.txt |
Roots | LearningData.txt | Features.txt |
Morphemes Broad | LearningData.txt | Features.txt |
Morphemes Narrow | LearningData.txt | Features.txt |
Words_With_Mb | LearningData.txt | Features.txt |
Words Glot_Feats_As_Segs | LearningData.txt | Features.txt |
Words Broad | LearningData.txt | Features.txt |
Childes_Cds | LearningData.txt | Features.txt |
Simulation details for Quechua words narrow
Input:
This is the same word list as "words broad", but with uvular retraction transcribed on sonorants.
LearningData.txt | Features.txt
Summary of iterations:
Iteration | Learning Data produced | Features produced | Inseparability | New Segments added | Segments removed |
---|---|---|---|---|---|
1 | LearningData.txt | Features.txt | [download] [view] | tʃ, jk, ŋk, ɴq, r̞q | None |
2 | LearningData.txt | Features.txt | [download] [view] | tʃ', sq, nt, ɲtʃ, ŋk', r̞qʰ, ʎ̞q, j̠̠q | ʃ', ʎ̞ |
3 | LearningData.txt | Features.txt | [download] [view] | tʃʰ, xt, mp, jt, ɴqʰ, r̞q', j̠̠q' | ʃʰ, r̞, j̠̠ |
4 | LearningData.txt | Features.txt | [download] [view] | sp, rp, ʎp' | None |
5 | LearningData.txt | Features.txt | [download] [view] | rl | None |
6 | No new learning data | No new features | [download] [view] | None | None |
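The iteration table above follows a repeat-until-stable pattern: each pass adds new segments, may remove segments that no longer occur, and the run ends when a pass adds nothing. The sketch below shows that pattern only. The scoring formula, the `THRESHOLD` value, and the names `score_pairs`, `merge`, and `run` are hypothetical choices made for this illustration and are not taken from the learner's implementation.

```python
from collections import Counter

THRESHOLD = 1.0  # hypothetical cutoff, chosen only for this illustration


def score_pairs(words):
    """Assumed measure: (count(ab) / count(a)) * (count(ab) / count(b))."""
    segs, pairs = Counter(), Counter()
    for w in words:
        segs.update(w)
        pairs.update(zip(w, w[1:]))
    return {p: (n / segs[p[0]]) * (n / segs[p[1]]) for p, n in pairs.items()}


def merge(word, new_pairs):
    """Rewrite a word, fusing each selected pair into one segment.

    Scans left to right, so overlapping candidates are resolved greedily.
    """
    out, i = [], 0
    while i < len(word):
        if i + 1 < len(word) and (word[i], word[i + 1]) in new_pairs:
            out.append(word[i] + word[i + 1])
            i += 2
        else:
            out.append(word[i])
            i += 1
    return out


def run(words, max_iter=10):
    """Iterate until no pair reaches the threshold; log added/removed segments."""
    history = []
    for _ in range(max_iter):
        new_pairs = {p for p, v in score_pairs(words).items() if v >= THRESHOLD}
        if not new_pairs:
            break  # corresponds to a row with no new segments
        before = {s for w in words for s in w}
        words = [merge(w, new_pairs) for w in words]
        after = {s for w in words for s in w}
        history.append({"added": after - before, "removed": before - after})
    return words, history


# Toy data, invented for illustration only (not Quechua forms).
toy = [["t", "ʃ", "a"], ["t", "ʃ", "i"], ["s", "a"], ["s", "i"]]
final_words, history = run(toy)
```

Each entry in `history` plays the role of one table row: `added` lists the segments created on that pass, and `removed` lists segments that no longer occur anywhere in the rewritten data.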