Content Tags

There are no tags.

A machine learning approach to predicting psychosis using semantic density and latent content analysis

Authors
Neguine Rezaii, Elaine Walker, Phillip Wolff

Subtle features in people’s everyday language may harbor the signs of future mental illness. Machine learning offers an approach for the rapid and accurate extraction of these signs. Here we investigate two potential linguistic indicators of psychosis in 40 participants of the North American Prodrome Longitudinal Study. We demonstrate how the linguistic marker of semantic density can be obtained using the mathematical method of vector unpacking, a technique that decomposes the meaning of a sentence into its core ideas. We also demonstrate how the latent semantic content of an individual’s speech can be extracted by contrasting it with the contents of conversations generated on social media, here 30,000 contributors to Reddit. The results revealed that conversion to psychosis is signaled by low semantic density and talk about voices and sounds. When combined, these two variables were able to predict the conversion with 93% accuracy in the training and 90% accuracy in the holdout datasets. The results point to a larger project in which automated analyses of language are used to forecast a broad range of mental disorders well in advance of their emergence.

Stay in the loop.

Subscribe to our newsletter for a weekly update on the latest podcast, news, events, and jobs postings.