Complex Segment Learner

Classical / Pseudo-Italian Latin (Indo-European)

Latin has its own section in the paper, which reviews the arguments for [kw] and [gw] as complex segments. We find these arguments to be dubious. The "Affricated" version of the Latin dataset was created to test the hypothesis that complex segments have the distributions of singleton consonants because they come from singletons affected by systematic sound change. This simulation applies palatalization sound changes to Latin to create "pseudo-Italian".

Simulation data at a glance

Simulation name         Initial state Learning Data  Initial state features
Whitaker Affricated     LearningData.txt             Features.txt
Whitaker Classical      LearningData.txt             Features.txt
Paradigms Classical     LearningData.txt             Features.txt
Lewis-Short Affricated  LearningData.txt             Features.txt
Lewis-Short Classical   LearningData.txt             Features.txt
Paradigms Affricated    LearningData.txt             Features.txt

Simulation details for Latin whitaker affricated

Input:

This 84,000+ word list comes from Whitaker's dictionary of Latin, widely available online. It was converted to IPA from orthography by substituting geminates with singleton symbols. Vowel length is not marked in this dictionary so it is not represented in the data file.

This particular version of the dataset tests a hypothesis about the changing frequency distributions that come with sound change. In the history of Latin, palatalization affected [t, d, k, g] before front vowels and glides. This resulted in a variety of changes; for example, in modern Italian, [ts, dz, tʃ, dʒ] originated from Latin stops in those environments (simplifying somewhat). We took our basic Latin dictionary and applied the changes, rewriting [t i] with [t s i], [k i] with [t ʃ i], and so on (just before [i]). The result is that the learner unified all four sequences into affricates.
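The rewriting step described above can be sketched roughly as follows. This is a minimal illustration, not the script actually used to build the dataset: it assumes each word is stored as a space-separated string of segments, and the function name `affricate` and the table `PALATALIZED` are hypothetical. Only the [t i] and [k i] rewrites are stated in the text; the [d i] and [g i] rewrites are inferred from the list of affricates, and geminates are left untouched because the text does not say how they were handled.

```python
# Hypothetical rewrite table. The text states t -> t s and k -> t ʃ;
# d -> d z and g -> d ʒ are inferred from the affricates it lists.
PALATALIZED = {"t": "t s", "d": "d z", "k": "t ʃ", "g": "d ʒ"}


def affricate(word: str) -> str:
    """Rewrite a stop as stop + fricative when it immediately precedes [i].

    `word` is assumed to be a space-separated string of segments.
    """
    segments = word.split()
    out = []
    for idx, seg in enumerate(segments):
        # Look one segment ahead; the last segment has no follower.
        nxt = segments[idx + 1] if idx + 1 < len(segments) else None
        if seg in PALATALIZED and nxt == "i":
            out.append(PALATALIZED[seg])
        else:
            out.append(seg)
    return " ".join(out)


print(affricate("n a t i o"))
```

Stops before any other segment, and word-final stops, pass through unchanged, which matches the "just before [i]" restriction.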

LearningData.txt | Features.txt

Summary of iterations:

Iteration  Learning Data produced  Features produced  Inseparability     New Segments added  Segments removed
1          LearningData.txt        Features.txt       [download] [view]  tʃ, dz, dʒ          z, ʃ, ʒ
2          LearningData.txt        Features.txt       [download] [view]  ts                  None
3          No new learning data    No new features    [download] [view]  None                None

Summary of inventory changes

Stage   Consonant set
Input   p P b B m M f F w k K g G t T d D s z S ʃ ʒ n N r R l L j h
Output  p P b B m M f F w k K g G t T d D s S n N r R l L j h tʃ dz dʒ ts
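
As a quick cross-check, the added and removed segments can be recovered by comparing the Input and Output consonant sets. The sketch below simply copies the two inventories for this simulation from the table; the variable names are illustrative only.

```python
# Inventories copied from the Input and Output rows for this simulation.
input_inventory = "p P b B m M f F w k K g G t T d D s z S ʃ ʒ n N r R l L j h".split()
output_inventory = "p P b B m M f F w k K g G t T d D s S n N r R l L j h tʃ dz dʒ ts".split()

# Preserve the table's ordering rather than using unordered sets.
added = [seg for seg in output_inventory if seg not in input_inventory]
removed = [seg for seg in input_inventory if seg not in output_inventory]

print("added:", added)
print("removed:", removed)
```

The two lists should agree with the per-iteration "New Segments added" and "Segments removed" columns taken together.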

Simulation Plots

/media/latin/whitaker/affricated/simulation/insep_plots.png


Simulation details for Latin whitaker classical

Input:

This 84,000+ word list comes from Whitaker's dictionary of Latin, widely available online. It was converted to IPA from orthography by substituting geminates with singleton symbols. Vowel length is not marked in this dictionary so it is not represented in the data file.

LearningData.txt | Features.txt

Summary of iterations:

Iteration  Learning Data produced  Features produced  Inseparability     New Segments added  Segments removed
1          No new learning data    No new features    [download] [view]  None                None

Summary of inventory changes

Stage   Consonant set
Input   p P b B m M f F w k K g G t T d D s S n N r R l L j h
Output  p P b B m M f F w k K g G t T d D s S n N r R l L j h

Simulation Plots

/media/latin/whitaker/classical/simulation/insep_plots.png


Simulation details for Latin paradigms classical

Input:

This dataset is a "flattened" paradigm list that was originally distributed by Adam Albright and Bruce Hayes; see Bruce Hayes's page.

LearningData.txt | Features.txt

Summary of iterations:

Iteration  Learning Data produced  Features produced  Inseparability     New Segments added  Segments removed
1          No new learning data    No new features    [download] [view]  None                None

Summary of inventory changes

Stage   Consonant set
Input   p P b B m M f F w k K g G t T d D s S n N r R l L j h
Output  p P b B m M f F w k K g G t T d D s S n N r R l L j h

Simulation Plots

/media/latin/paradigms/classical/simulation/insep_plots.png


Simulation details for Latin lewis-short affricated

Input:

LearningData.txt | Features.txt

Summary of iterations:

Iteration  Learning Data produced  Features produced  Inseparability     New Segments added  Segments removed
1          LearningData.txt        Features.txt       [download] [view]  tʃ, dz, dʒ          None
2          No new learning data    No new features    [download] [view]  None                None

Summary of inventory changes

Stage   Consonant set
Input   p p: b b: m m: f f: w k k: g g: t t: d d: s z ʃ ʒ s: n n: r r: l l: j h
Output  p p: b b: m m: f f: w k k: g g: t t: d d: s z ʃ ʒ s: n n: r r: l l: j h tʃ dz dʒ

Simulation Plots

/media/latin/lewis-short/affricated/simulation/insep_plots.png


Simulation details for Latin lewis-short classical

Input:

LearningData.txt | Features.txt

Summary of iterations:

Iteration  Learning Data produced  Features produced  Inseparability     New Segments added  Segments removed
1          No new learning data    No new features    [download] [view]  None                None

Summary of inventory changes

Stage   Consonant set
Input   p p: b b: m m: f f: w k k: g g: t t: d d: s z s: n n: r r: l l: j h
Output  p p: b b: m m: f f: w k k: g g: t t: d d: s z s: n n: r r: l l: j h

Simulation Plots

/media/latin/lewis-short/classical/simulation/insep_plots.png


Simulation details for Latin paradigms affricated

Input:

LearningData.txt | Features.txt

Summary of iterations:

Iteration  Learning Data produced  Features produced  Inseparability     New Segments added  Segments removed
1          LearningData.txt        Features.txt       [download] [view]  ts, tʃ, dz, dʒ      None
2          No new learning data    No new features    [download] [view]  None                None

Summary of inventory changes

Stage   Consonant set
Input   p P b B m M f F w k K g G t T d D s z S ʃ ʒ n N r R l L j h
Output  p P b B m M f F w k K g G t T d D s z S ʃ ʒ n N r R l L j h ts tʃ dz dʒ

Simulation Plots

/media/latin/paradigms/affricated/simulation/insep_plots.png