Complex Segment Learner

C V Counting Utility

The typology section of the paper discusses a parallel between the distributions of complex segments and clusters of simplex segments: the more parts a sequence has, the rarer it is. This usually holds both within and across languages. You can use this utility to check whether the generalization also holds for your data.

To use this utility, give it a feature file that includes the feature [syll(abic)], specified with binary + and - values. The utility treats segments that are [+] for this feature as vowels (V) and all other segments as consonants (C). It then counts how often each consonant-vowel sequence occurs in the words in your learning data file. For example:
Input data file:
p a t i k a
k i t k a

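The count for "V C V" will be 2 (a t i and i k a in the first word; overlapping matches are counted separately), and the count for "V C C V" will be 1 (i t k a in the second word).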
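
The counting in the example above can be sketched in a few lines of Python. This is only an illustration of the idea, not the utility's actual code: the function names `to_cv` and `count_sequence` are hypothetical, and the `syllabic` dictionary stands in for the feature file, whose real format is not shown here. Words are assumed to be space-separated segments, as in the input data file above.

```python
def to_cv(word, syllabic):
    """Map a space-separated word to a list of 'C'/'V' symbols.

    `syllabic` is a hypothetical stand-in for the feature file: it maps
    each segment to its [syll] value. Anything not '+' counts as C.
    """
    return ["V" if syllabic.get(seg) == "+" else "C" for seg in word.split()]


def count_sequence(words, pattern, syllabic):
    """Count every occurrence (overlaps included) of a C/V pattern."""
    target = pattern.split()
    n = len(target)
    total = 0
    for word in words:
        skeleton = to_cv(word, syllabic)
        # Slide a window of the pattern's length across the word.
        for i in range(len(skeleton) - n + 1):
            if skeleton[i:i + n] == target:
                total += 1
    return total


# Assumed [syll] values for the segments in the example data.
syllabic = {"p": "-", "t": "-", "k": "-", "a": "+", "i": "+"}
words = ["p a t i k a", "k i t k a"]

print(count_sequence(words, "V C V", syllabic))    # 2
print(count_sequence(words, "V C C V", syllabic))  # 1
```

The window advances one segment at a time, so the two "V C V" matches in "p a t i k a" that share the vowel i are both counted, which matches the counts given above.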

Language name


Feature file


Learning data file


Sequences to count