Complex Segment Learner

Standard German (Indo-European)

German is not discussed in the paper, but in our taxonomy of case studies it would fall in the "controversial/unclear" category. As reviewed in Richard Wiese's (2000) book, the analysis of the German consonant inventory has been debated ever since Trubetzkoy introduced his criteria for complex segments. The two affricates [ts] and [pf] are the least controversial; [tʃ] and [dʒ] are more controversial. Wiese himself argues that [ps] and [ks] should be considered affricates, too.
Our learner finds no complex segments in German, although [ts] comes very close to the threshold in the simulation that uses morphemes as learning data. Since neither the arguments in the literature nor the results of our simulations are conclusive, we might have to wait for behavioral/experimental evidence to better understand the phonemic inventory of this well-studied language.

Simulation data at a glance

Simulation name | Initial state Learning Data | Initial state features
Celex Lemmas | LearningData.txt | Features.txt
Celex Morphemes | LearningData.txt | Features.txt
Schott | LearningData.txt | Features.txt
Celex Words | LearningData.txt | Features.txt

Simulation details for German Celex lemmas

Input:

The lemma list comes from the German portion of Celex. The list is minimally filtered to exclude non-native sounds such as [θ].
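The filtering step described above can be illustrated with a minimal sketch. It rests on several assumptions not stated in the source: that each transcription is a string of space-separated segments, that filtering drops any entry containing an excluded sound, and that the exclusion set contains only [θ] (the one sound named here). The names `NON_NATIVE`, `has_only_native_sounds`, and `filter_lemmas`, as well as the sample entries, are hypothetical.

```python
# Assumption: only [θ] is named in the text; the full exclusion set is not given.
NON_NATIVE = {"θ"}


def has_only_native_sounds(transcription: str) -> bool:
    """Return True if no space-separated segment is in NON_NATIVE."""
    return not any(seg in NON_NATIVE for seg in transcription.split())


def filter_lemmas(lemmas: list[str]) -> list[str]:
    """Keep only the entries whose segments are all outside NON_NATIVE."""
    return [lemma for lemma in lemmas if has_only_native_sounds(lemma)]


# Hypothetical sample entries, not taken from Celex.
sample = ["t a k", "θ ʀ ɪ l ɐ", "h a ʊ s"]
kept = filter_lemmas(sample)
print(kept)
```

Checking whole segments rather than substrings keeps the filter from accidentally matching a symbol that is part of a longer multi-character segment.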

LearningData.txt | Features.txt

Summary of iterations:

Iteration | Learning Data produced | Features produced | Inseparability | New Segments added | Segments removed
1 | No new learning data | No new features | [download] [view] | None | None

Summary of inventory changes

Stage | Consonant set
Input | p b m f v t d s z n l ʃ ʒ ç χ ʀ j k g ŋ h
Output | p b m f v t d s z n l ʃ ʒ ç χ ʀ j k g ŋ h

Simulation Plots

/media/german/celex/lemmas/simulation/insep_plots.png


Simulation details for German Celex morphemes

Input:

This morpheme list was created by tokenizing the transcriptions with morpheme boundaries, available in the Celex lemma list for German. We simply split the transcribed lemmata on # and +, without any further filtering. See here for more.
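The splitting step described above might look like the following sketch. The boundary symbols # and + come from the description; everything else is an assumption for illustration: the function name `split_morphemes`, the sample transcriptions, and the choice to drop empty strings that arise when a boundary symbol sits at the edge of a transcription.

```python
import re


def split_morphemes(transcription: str) -> list[str]:
    """Split a transcribed lemma on the boundary symbols # and +."""
    pieces = re.split(r"[#+]", transcription)
    # Assumption: empty pieces (from leading/trailing boundaries) are discarded.
    return [piece for piece in pieces if piece]


# Hypothetical transcriptions with boundary markers, not taken from Celex.
lemmas = ["kɪnd+ɐ#gaʀtən", "haʊs", "ʃøːn+haɪt"]

# No further filtering: every resulting piece is kept, duplicates included.
morphemes = [m for lemma in lemmas for m in split_morphemes(lemma)]
print(morphemes)
```

Because nothing is deduplicated, a morpheme that occurs in many lemmas appears many times in the resulting list, which is consistent with splitting "without any further filtering".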

LearningData.txt | Features.txt

Summary of iterations:

Iteration | Learning Data produced | Features produced | Inseparability | New Segments added | Segments removed
1 | No new learning data | No new features | [download] [view] | None | None

Summary of inventory changes

Stage | Consonant set
Input | p b m f v t d s z n l ʃ ʒ ç χ ʀ j k g ŋ h
Output | p b m f v t d s z n l ʃ ʒ ç χ ʀ j k g ŋ h

Simulation Plots

/media/german/celex/morphemes/simulation/insep_plots.png


Simulation details for German Schott

Input:

This rather large word list is one of Kai Schott's IPA word lists created on the basis of OpenOffice spellchecker dictionaries. The list was minimally altered to broaden the transcription; scripts are here.

LearningData.txt | Features.txt

Summary of iterations:

Iteration | Learning Data produced | Features produced | Inseparability | New Segments added | Segments removed
1 | No new learning data | No new features | [download] [view] | None | None

Summary of inventory changes

Stage | Consonant set
Input | p b m f v t d s z n l ʃ ʒ ç χ ʀ j k g ŋ h
Output | p b m f v t d s z n l ʃ ʒ ç χ ʀ j k g ŋ h

Simulation Plots

/media/german/schott/simulation/insep_plots.png


Simulation details for German Celex words

Input:

LearningData.txt | Features.txt

Summary of iterations:

Iteration | Learning Data produced | Features produced | Inseparability | New Segments added | Segments removed
1 | No new learning data | No new features | [download] [view] | None | None

Summary of inventory changes

Stage | Consonant set
Input | p b m f v t d s z n l ʃ ʒ ç χ ʀ j k g ŋ h
Output | p b m f v t d s z n l ʃ ʒ ç χ ʀ j k g ŋ h

Simulation Plots

/media/german/celex/words/simulation/insep_plots.png