This AI Predicts Binge-Drinking from Smartphone Data

Stevens researcher co-develops non-invasive system to predict alcohol abuse, aid interventions; may also be useful for studying quarantines

Consuming alcohol can slow reactions and perceptions, including of one's own condition. An estimated 10,000 Americans (30 per day) die in drunk driving accidents annually, while increases in violent behavior and sexual assault have also been linked to heavy alcohol consumption.

To address this challenge, an artificial intelligence-powered system for predicting heavy-drinking events, developed by a Stevens professor and her collaborators, is designed to predict heavy drinking, and support intervention, before it gets out of hand.

"This began as an effort to try to address a really serious problem on college campuses, which is uncontrolled alcohol drinking that can produce traffic fatalities, violent behavior, emergency hospital visits or even worse," notes researcher Sang-Won Bae, who began developing the detection algorithm using data collected by a mobile sensing framework known as AWARE while a postdoctoral researcher at Carnegie Mellon University before joining Stevens' School of Systems & Enterprises in 2019.

"It's significant that the research group working on this, which integrates several partners in addition to Stevens, includes doctors in the department of emergency medicine at the University of Pittsburgh Medical Center."

Background data on location, movements, smartphone interactions

The AWARE app works by studying streams of sensor data continuously collected by modern smartphones, including the user's location, motion, phone usage and social interaction.

This so-called passive sensing provides important clues about one's behavior in the coming hours, Bae found.

"Smartphones have already been used, in studies, to measure how alcohol affects motor coordination; using the accelerometer to measure intoxication, through gait analysis; using mobile crowd sensing to classify drinking nights; and using social media to identify alcohol and non-alcohol related posts," notes Bae.

Professor Sang-Won Bae, School of Systems & Enterprises

"These studies, however, focus on detection of those who have already started to drink or are intoxicated, which is not useful for supporting interventions to prevent drinking behavior before it happens."

Bae's group began by extracting 76 features from phone sensors and data streams, including day of week, time of day, location, gait and other body movements, calling and texting behavior, application usage, battery status — and the number of WiFi hotspots available nearby (which can indicate socializing in situations such as parties or bars).

The detection system collected metadata only: it did not view, read or analyze any texts or emails, nor did it check for spelling accuracy.

"That's important," notes Bae. "We want this system to be as privacy-noninvasive as possible."
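As a rough illustration of what metadata-only feature extraction can look like, the Python sketch below summarizes one day's events into a handful of daytime features. It is not the AWARE framework's API or the researchers' code: the `Event` record, its field names, the `daytime_features` function and the 8 a.m. to 6 p.m. daytime window are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Event:
    """One metadata record (hypothetical schema); no message content is stored."""
    timestamp: datetime
    kind: str                # "unlock", "call", or "text"
    contact_id: str = ""     # opaque identifier, never a name or number
    duration_s: float = 0.0  # call length in seconds (0 for other kinds)


def daytime_features(events, day_start_hour=8, day_end_hour=18):
    """Summarize one day's daytime events into a small feature dict.

    The daytime window is an assumption for illustration only.
    """
    daytime = [e for e in events
               if day_start_hour <= e.timestamp.hour < day_end_hour]
    calls = [e for e in daytime if e.kind == "call"]
    contacts = {e.contact_id for e in daytime
                if e.kind in ("call", "text") and e.contact_id}
    mean_call = sum(c.duration_s for c in calls) / len(calls) if calls else 0.0
    return {
        # Monday is 0 and Sunday is 6, following datetime.weekday()
        "day_of_week": events[0].timestamp.weekday() if events else None,
        "unlock_count": sum(1 for e in daytime if e.kind == "unlock"),
        "distinct_contacts": len(contacts),
        "mean_call_duration_s": mean_call,
    }
```

Only counts, durations and opaque identifiers appear in the output, in keeping with the privacy goal Bae describes.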

It was already known that the day of the week is one of the most powerful predictors of heavy drinking episodes (HDEs), defined as four to five drinks by the National Institute on Alcohol Abuse and Alcoholism. As the weekend approaches and arrives, HDEs skyrocket, then tail off on Sunday and Monday before beginning to rise again midweek.

But Bae's research, which has been ongoing since 2017, has produced surprising insights into which additional data points are useful in predicting drinking behavior.

Subjects who don't unlock or use their phones much during the daytime hours, for example, are more likely to engage in drinking heavily later that evening. Remaining at one location all day foreshadows more drinking events than moving around to several areas. People who contact fewer people by phone or text during the day appear more likely to drink later; longer phone calls during the daytime seem correlated with less intense drinking later.

"These factors may be due to social support networks, and one's sense of isolation," Bae suggests, "although this needs to be studied in greater detail."

Regularly traveling farther away from one's work or home, as expected, seems to correlate with and predict a higher likelihood to drink more. Our walking gaits and motions during the daytime don't predict future drinking, however.

"Even though our motion data certainly does indicate heavy drinking as it is happening, and applications have been developed to monitor and signal this," she notes, "this is not currently a focus of my group's research."

Instead, the team is focusing on predicting and intervening before drinking has begun in order to try to prevent HDEs.

Highly accurate predictions

With insights from the previous research in hand, the researchers have now optimized their machine learning model even further to draw on 25 of the most useful sensor features collected via the AWARE app.

"We install the AWARE app and collect data," explains Bae. "Then we pre-process sensor features and extract the key features to develop and train the machine learning model to predict HDEs."
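The workflow Bae describes (extract key features, then train a model) can be sketched in a few lines of Python. The article does not say which selection method or model the team uses, so everything here is an assumption chosen for brevity: features are ranked by absolute correlation with the label, and the classifier is a plain logistic regression trained by gradient descent. The data is a made-up toy set, not study data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return 0.0 if vx == 0 or vy == 0 else cov / math.sqrt(vx * vy)


def select_top_k(rows, labels, k):
    """Indices of the k columns most correlated (in absolute value) with labels."""
    n_features = len(rows[0])
    scores = [abs(pearson([r[j] for r in rows], labels))
              for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, bias, x):
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


def train_logistic(rows, labels, lr=0.1, epochs=500):
    """Fit logistic regression with per-sample gradient descent."""
    weights, bias = [0.0] * len(rows[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            error = predict_proba(weights, bias, x) - y
            weights = [w - lr * error * v for w, v in zip(weights, x)]
            bias -= lr * error
    return weights, bias


# Toy data (invented): columns are is_weekend, daytime unlock rate, noise.
rows = [[1, 0.1, 0.3], [1, 0.2, 0.7], [1, 0.3, 0.3], [1, 0.8, 0.7],
        [0, 0.2, 0.3], [0, 0.7, 0.7], [0, 0.8, 0.3], [0, 0.9, 0.7]]
labels = [1, 1, 1, 0, 0, 0, 0, 0]  # 1 = heavy drinking episode reported

keep = select_top_k(rows, labels, k=2)
reduced = [[r[j] for j in keep] for r in rows]
weights, bias = train_logistic(reduced, labels)
```

In the study the reduction is to 25 features; `k=2` is used here only because the toy set has three columns.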

Indeed, the group's latest HDE prediction algorithm analyzes windows of sensor data on the fly, making predictions about upcoming behaviors on the same day.

In pilot tests, the system was installed on the smartphones of more than 140 subjects aged 21 to 28, who self-reported their drinking events for up to 14 weeks. The HDE algorithm proved highly accurate at predicting a forthcoming episode of heavy drinking later the same day or night.

Next steps will include refining the machine learning models to further improve prediction of heavy drinking events ("that data point is really our focus here: the dangerous behavior," says Bae), as well as possible collaboration on the design of future algorithm-based interventions that could head off episodes of heavy drinking.

"We need to be careful about falsely predicting heavy drinking," she cautions. "Random text messages to socially isolated users, late in the afternoons, very late in the week — one possible strategy that would seem to be promising for study — could annoy, threaten or anger a user, rather than cheering them up, and this could encourage the very behaviors we are trying to curtail or prevent."

Any eventual prediction and intervention system would not need to harvest or store large amounts of user data, Bae says, noting that her team's work has demonstrated that a phone only needs to store a six-hour window of data to successfully and accurately execute its prediction models.
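One simple way to keep only a six-hour window on a device is a rolling buffer that discards older readings as new ones arrive. The Python sketch below illustrates that idea; the `RollingWindow` class and its methods are hypothetical names for this example and are not taken from the team's system.

```python
from collections import deque
from datetime import datetime, timedelta


class RollingWindow:
    """Keep only the readings from the most recent `hours` hours."""

    def __init__(self, hours=6):
        self.span = timedelta(hours=hours)
        self._items = deque()

    def add(self, timestamp, reading):
        """Store a reading and evict anything older than the window.

        Assumes timestamps arrive in non-decreasing order.
        """
        self._items.append((timestamp, reading))
        cutoff = timestamp - self.span
        while self._items and self._items[0][0] < cutoff:
            self._items.popleft()

    def readings(self):
        """Return the readings currently inside the window, oldest first."""
        return [reading for _, reading in self._items]


window = RollingWindow(hours=6)
window.add(datetime(2020, 3, 6, 8, 0), "a")
window.add(datetime(2020, 3, 6, 12, 0), "b")
window.add(datetime(2020, 3, 6, 15, 0), "c")  # "a" is now 7 hours old and is dropped
```

Because stale entries are removed on every insert, storage stays bounded by the amount of data generated in six hours rather than growing with the length of the study.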

Bae’s algorithm-based prediction system could also yield applications and interventions for other diseases, behaviors and medical conditions, including in quarantine conditions.

"Since we are also passively capturing activity level, gait and behavior, one might imagine applications such as to predict recreational drug use, heavy opioid use, Parkinson's disease, an epileptic seizure or a diabetic shock event," she notes.

"Also, with the current heightened interest in quarantines, analyzing the passive sensor data could enable us to study or prevent substance overuse and certain medical conditions under those conditions."