Tensor models for linguistics pitch curve data of native speakers of Afrikaans
We use tensor analysis techniques for high-dimensional data to gain insight into pitch curves, which play an important role in linguistics research. In particular, we propose that demeaned phonetics pitch curve data can be modeled as having a Kronecker product inverse covariance structure with sparse factors corresponding to words and time. Using data from a study of native Afrikaans speakers, we show that by targeting conditional independence through a graphical model, we reveal relationships associated with natural properties of words as studied by linguists. We find that words with long vowels cluster based on whether the vowel is pronounced at the front or back of the mouth, and words with short vowels have strong edges associated with the initial consonant.
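To make the modeling assumption concrete, the following minimal sketch builds a Kronecker product inverse covariance from two sparse factors, one for words and one for time. It is an illustration only, not the estimation procedure used in the paper: the dimensions, the chain-graph (tridiagonal) sparsity pattern, the off-diagonal values, and the helper name `chain_precision` are all assumptions chosen for the example.

```python
import numpy as np

def chain_precision(n, off_diag):
    """Hypothetical helper: sparse precision matrix whose graph is a chain.

    Unit diagonal with a constant off-diagonal; positive definite
    whenever |off_diag| < 0.5.
    """
    P = np.eye(n)
    idx = np.arange(n - 1)
    P[idx, idx + 1] = off_diag
    P[idx + 1, idx] = off_diag
    return P

# Assumed toy sizes; the study's actual dimensions are not given here.
n_words, n_times, n_samples = 4, 5, 200

omega_words = chain_precision(n_words, off_diag=-0.3)  # word factor
omega_time = chain_precision(n_times, off_diag=-0.4)   # time factor

# Precision of the row-major vectorized (word x time) array.
omega = np.kron(omega_words, omega_time)

# The covariance inherits the Kronecker structure:
# inv(A kron B) = inv(A) kron inv(B).
sigma = np.kron(np.linalg.inv(omega_words), np.linalg.inv(omega_time))
kron_inverse_matches = np.allclose(np.linalg.inv(omega), sigma)

# Draw synthetic "pitch curves" and demean them across samples.
rng = np.random.default_rng(0)
X = rng.multivariate_normal(np.zeros(n_words * n_times), sigma, size=n_samples)
X_demeaned = X - X.mean(axis=0)

# Each sample can be viewed as a word-by-time matrix.
curves = X_demeaned.reshape(n_samples, n_words, n_times)
```

A zero entry in `omega_words` means the corresponding pair of words is conditionally independent given the others, which is the sense in which the word factor defines a graph. The Kronecker form also keeps the parameter count small: the two factors are 4 x 4 and 5 x 5 here, rather than one unstructured 20 x 20 matrix.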