A machine-learning method discovered a hidden clue in people’s language predictive of the later emergence of psychosis – the frequent use of words associated with sound. The findings, by scientists at Emory University and Harvard University, were published in the journal npj Schizophrenia.
The researchers also developed a new machine-learning method to more precisely quantify the semantic richness of people’s conversational language, a known indicator for psychosis.
Their results show that automated analysis of the two language variables — more frequent use of words associated with sound and speaking with low semantic density, or vagueness — can predict whether an at-risk person will later develop psychosis with 93 per cent accuracy.
Even trained clinicians had not noticed how people at risk for psychosis use more words associated with sound than the average, although abnormal auditory perception is a pre-clinical symptom.
Neguine Rezaii, the study’s lead author, said: “Trying to hear these subtleties in conversations with people is like trying to see microscopic germs with your eyes. The automated technique we’ve developed is a really sensitive tool to detect these hidden patterns. It’s like a microscope for warning signs of psychosis.”
Rezaii began work on the paper while she was a resident at Emory School of Medicine’s Department of Psychiatry and Behavioral Sciences. She is now a fellow in Harvard Medical School’s Department of Neurology.
The onset of schizophrenia and other psychotic disorders typically occurs in the early 20s, with warning signs – known as prodromal syndrome – beginning around age 17. About 25 to 30 per cent of young people who meet criteria for a prodromal syndrome will develop schizophrenia or another psychotic disorder.
Using structured interviews and cognitive tests, trained clinicians can predict psychosis with about 80 per cent accuracy in those with a prodromal syndrome. Machine-learning research is among the many ongoing efforts to streamline diagnostic methods, identify new variables, and improve the accuracy of predictions. Currently, there is no cure for psychosis.
For the current paper, the researchers first used machine learning to establish “norms” for conversational language. They fed a computer software program the online conversations of 30,000 users of Reddit, a social media platform where people have informal discussions about a range of topics. The software program, known as Word2Vec, uses an algorithm to convert individual words to vectors, assigning each one a location in a semantic space based on its meaning. Words with similar meanings are positioned closer together than words with very different meanings.
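To make the idea of a semantic space concrete, here is a minimal sketch of how closeness between word vectors is commonly measured, using cosine similarity. The three-dimensional vectors below are invented for illustration and are not taken from the study; real Word2Vec vectors are learned from text and typically have hundreds of dimensions.

```python
import math

# Toy vectors, made up for this example only (an assumption, not study data).
toy_vectors = {
    "whisper": [0.90, 0.80, 0.10],
    "murmur": [0.85, 0.75, 0.20],
    "table": [0.10, 0.20, 0.90],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Words with related meanings should score closer to 1 than unrelated words.
sim_close = cosine_similarity(toy_vectors["whisper"], toy_vectors["murmur"])
sim_far = cosine_similarity(toy_vectors["whisper"], toy_vectors["table"])

print(f"whisper vs murmur: {sim_close:.3f}")
print(f"whisper vs table:  {sim_far:.3f}")
```

In this toy setup, "whisper" and "murmur" point in nearly the same direction, so their similarity is much higher than that of "whisper" and "table".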
The Wolff lab also developed a computer program to perform what the researchers dubbed “vector unpacking,” or analysis of the semantic density of word usage. Previous work has measured semantic coherence between sentences. Vector unpacking allowed the researchers to quantify how much information was packed into each sentence.
After generating a baseline of “normal” data, the researchers applied the same techniques to diagnostic interviews with 40 participants. Trained clinicians had conducted the interviews as part of the multi-site North American Prodrome Longitudinal Study (NAPLS), funded by the National Institutes of Health. NAPLS is focused on young people at clinical high risk for psychosis.
The automated analyses of the participant samples were then compared to the normal baseline sample and the longitudinal data on whether the participants converted to psychosis.
The results showed that higher than normal usage of words related to sound, combined with a higher rate of using words with similar meaning, meant that psychosis was likely on the horizon.
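As a rough illustration of the first variable, the sketch below computes the share of words in a passage that relate to sound. It is a simplified stand-in rather than the researchers' method: the word list, the function name and the sample sentence are all hypothetical, and simple word matching is used here in place of the vector-based analysis described above.

```python
import re

# Hypothetical list of sound-related words, chosen only for illustration.
SOUND_WORDS = {"sound", "voice", "hear", "loud", "noise", "whisper", "listen"}


def sound_word_rate(text):
    """Return the fraction of words in `text` that appear in SOUND_WORDS."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in SOUND_WORDS)
    return hits / len(tokens)


# Invented sample sentence, not taken from any interview.
sample = "I hear a voice and then a loud noise"
rate = sound_word_rate(sample)
print(f"Sound-word rate: {rate:.2f}")
```

A rate like this would only be meaningful when compared against a baseline computed the same way on ordinary conversation, which is the role the Reddit data played in the study.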
Rezaii is now gathering larger data sets and testing the application of the methods on a variety of neuropsychiatric diseases, including dementia.