American English (Indo-European)

English is discussed in its own section in the paper, which reviews the arguments for its two purported affricates and describes the data sources we used. Briefly, the arguments are not particularly strong, and the results of our simulations are correspondingly fragile. With broad transcription, the learner finds only [dʒ]; finding [tʃ] requires a narrow transcription of the stop portion. While that transcription can be motivated for English, it is not necessary in other cases.

Simulation data at a glance


Simulation name	Initial state learning data	Initial state features
Celex Broad	LearningData.txt	Features.txt
CMU Narrow	LearningData.txt	Features.txt
Celex Narrow	LearningData.txt	Features.txt
CMU Broad	LearningData.txt	Features.txt

Simulation details for English Celex broad

Input:

Data source: Celex lemmas. The Celex transcriptions assume British English pronunciations.

LearningData.txt | Features.txt

Summary of iterations:

Iteration	Learning Data produced	Features produced	Inseparability	New Segments added	Segments removed
1	LearningData.txt	Features.txt	[download] [view]	dʒ	None
2	No new learning data	No new features	[download] [view]	None	None

Summary of inventory changes

Stage	Consonant set
Input	p b m f v θ ð t d s z n l ʃ ʒ ɹ j k g ŋ h w
Output	p b m f v θ ð t d s z n l ʃ ʒ ɹ j k g ŋ h w dʒ

Simulation Plots

/media/english/celex/broad/simulation/insep_plots.png

Simulation details for English CMU narrow

Input:

Same data as for CMU broad, with the same manipulation as for Celex narrow: [t ʃ] and [d ʒ] were replaced by [c ʃ] and [ɟ ʒ]. Because CMU does not mark morpheme boundaries in any form, the search-and-replace was applied across the board.
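An across-the-board replacement of this kind can be sketched in Python as follows. This is only an illustration, not the script used to prepare the data: it assumes each form is a string of space-separated segments, and the function name and the toy forms are hypothetical.

```python
import re

# Assumption: one form per string, segments separated by single spaces.
# The lookarounds keep the match aligned to whole segments.
RULES = [
    (re.compile(r"(?<!\S)t ʃ(?!\S)"), "c ʃ"),
    (re.compile(r"(?<!\S)d ʒ(?!\S)"), "ɟ ʒ"),
]

def narrow_across_the_board(form):
    """Replace every [t ʃ] with [c ʃ] and every [d ʒ] with [ɟ ʒ]."""
    for pattern, replacement in RULES:
        form = pattern.sub(replacement, form)
    return form

# Toy forms, not taken from the CMU data.
for toy in ["t ʃ ɪ p", "d ʒ ʌ d ʒ", "p ɒ t ʃ ɒ t"]:
    print(toy, "->", narrow_across_the_board(toy))
```

With no boundary information available, a sequence like the one in the last toy form is rewritten too, whether or not the stop and the fricative belong to the same morpheme.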

LearningData.txt | Features.txt

Summary of iterations:

Iteration	Learning Data produced	Features produced	Inseparability	New Segments added	Segments removed
1	LearningData.txt	Features.txt	[download] [view]	cʃ, ɟʒ	None
2	No new learning data	No new features	[download] [view]	None	None

Summary of inventory changes

Stage	Consonant set
Input	p b m f v θ ð c ɟ t d s z n l ʃ ʒ ɹ j k g ŋ h w
Output	p b m f v θ ð c ɟ t d s z n l ʃ ʒ ɹ j k g ŋ h w cʃ ɟʒ

Simulation Plots

/media/english/cmu/narrow/simulation/insep_plots.png

Simulation details for English Celex narrow

Input:

This is the same dataset as Celex broad, but with [t ʃ] and [d ʒ] replaced by [c ʃ] and [ɟ ʒ], respectively. Celex shows morpheme boundaries as syllable boundaries in cases like "pot shot", so [t ʃ] sequences that span such a boundary were left with alveolar first halves. See details here.
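A boundary-sensitive version of the replacement might look like the Python sketch below. It is an illustration under stated assumptions rather than the procedure actually used: the boundary token ".", the choice to drop boundaries from the output, the function name, and the toy forms are all hypothetical.

```python
# Assumption: forms are lists of segments, with "." marking a syllable boundary.
BOUNDARY = "."

# A stop is narrowed only when the fricative follows it directly,
# with no boundary token in between.
REPLACEMENTS = {("t", "ʃ"): "c", ("d", "ʒ"): "ɟ"}

def narrow_within_syllables(segments):
    """Narrow [t ʃ] and [d ʒ] inside a syllable; leave cross-boundary stops alone."""
    out = []
    for i, seg in enumerate(segments):
        if seg == BOUNDARY:
            continue  # assumption: boundaries are not kept in the learning data
        nxt = segments[i + 1] if i + 1 < len(segments) else None
        out.append(REPLACEMENTS.get((seg, nxt), seg))
    return out

# Toy forms, not taken from Celex.
print(" ".join(narrow_within_syllables("t ʃ ɪ p".split())))
print(" ".join(narrow_within_syllables("p ɒ t . ʃ ɒ t".split())))
```

In the second toy form the [t] keeps its alveolar transcription because a boundary separates it from [ʃ].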

LearningData.txt | Features.txt

Summary of iterations:

Iteration	Learning Data produced	Features produced	Inseparability	New Segments added	Segments removed
1	LearningData.txt	Features.txt	[download] [view]	cʃ, ɟʒ	ɟ
2	No new learning data	No new features	[download] [view]	None	None

Summary of inventory changes

Stage	Consonant set
Input	p b m f v θ ð c ɟ t d s z n l ʃ ʒ ɹ j k g ŋ h w
Output	p b m f v θ ð c t d s z n l ʃ ʒ ɹ j k g ŋ h w cʃ ɟʒ
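The Output row can be read as the Input row minus the segments removed, plus the new segments added. A minimal Python sketch of that bookkeeping, using the values from the tables above (the function name is hypothetical, and the ordering of the result is only a convention):

```python
def apply_changes(inventory, added, removed):
    """Drop removed segments, then append added ones not already present."""
    kept = [seg for seg in inventory if seg not in removed]
    return kept + [seg for seg in added if seg not in kept]

# Input row and iteration 1 of the Celex narrow simulation.
input_set = "p b m f v θ ð c ɟ t d s z n l ʃ ʒ ɹ j k g ŋ h w".split()
output_set = apply_changes(input_set, added=["cʃ", "ɟʒ"], removed=["ɟ"])
print(" ".join(output_set))
```

Iteration 2 adds and removes nothing, so the inventory after iteration 1 is the final one.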

Simulation Plots

/media/english/celex/narrow/simulation/insep_plots.png

Simulation details for English CMU broad

Input:

Data source: Carnegie Mellon University pronunciation dictionary, as prepared by Bruce Hayes and Jamie White.

LearningData.txt | Features.txt

Summary of iterations:

Iteration	Learning Data produced	Features produced	Inseparability	New Segments added	Segments removed
1	LearningData.txt	Features.txt	[download] [view]	dʒ	None
2	No new learning data	No new features	[download] [view]	None	None

Summary of inventory changes

Stage	Consonant set
Input	p b m f v θ ð t d s z n l ʃ ʒ ɹ j k g ŋ h w
Output	p b m f v θ ð t d s z n l ʃ ʒ ɹ j k g ŋ h w dʒ

Simulation Plots

/media/english/cmu/broad/simulation/insep_plots.png
