The Hidden Patterns in Us: AI to Predict Human Events
Stevens computer scientist Yue Ning designs AI to unlock patterns in data that can forecast civil unrest, hate speech, epidemics — maybe even a looming heart attack
Yue Ning knows there are patterns in our behavior. She also knows people can’t possibly spot them all: The data’s too big, the calculations too vast.
Instead, the Stevens computer science professor creates and trains systems to do what people cannot: spot, with remarkable speed and accuracy, the hidden patterns already there, invisible to us because the haystack of data is simply too big to ever search through entirely.
“It’s about discovery,” she explains, “about finding matches in data and groups of data clusters.”
For her efforts, Ning received a five-year National Science Foundation (NSF) CAREER award — a prestigious grant program that supports promising early-career investigators and educators — in 2021.
Now the research taking place at Stevens, made possible by Ning’s NSF funding and grants like it, explores far-ranging, AI-powered methods that promise to foretell a variety of human events such as hospital visits, flu outbreaks, even riots and mass protests.
It all comes down, says Ning, to patterns and predictions.
Forecasting the flu using AI
For one project that attracted national attention — including in the leading technology publication TechTarget — Ning and her students developed an AI that predicted influenza outbreak locations, up to 15 weeks in advance, 11% more accurately than existing tools.
While previous epidemic forecasting tools had analyzed infection rates as they changed over time, Ning’s team at Stevens created a graph neural network (GNN) to encode flu infections as interconnected regional clusters.
The group then trained the AI on both U.S. state data, sourced from the Centers for Disease Control and Prevention, and Japanese regional flu data to see whether it worked as well as, or better than, state-of-the-art tools.
"Imagine each state is a node,” she explains about the AI. “You build a graph of nodes, you know the nearby locations. There are also historical patterns of outbreak that are known, and those have a weight. There is a connection between historical infection rates and geographic distance. That’s known. You build those in.”
The AI finds and learns patterns of infection movement, not only in each community or state but also in each location’s neighbors, since people in neighboring communities interact with one another frequently.
Adding that locational data was key: It allowed the newly developed algorithm to detect hidden patterns in the ways flu infections move from one region to another — then learn from that knowledge as it predicted the next hot spots for the virus.
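The article doesn’t publish the model’s code, but the core idea can be sketched in a few lines. The toy example below, written in plain NumPy, treats regions as graph nodes, weights edges by geographic proximity and lets one round of message passing blend each region’s infection history with its neighbors’ before a linear readout guesses the next week. Every region count, distance and weight here is invented for illustration; the published GNN is far more sophisticated.

```python
import numpy as np

# Minimal sketch (not Ning's actual model): regions as graph nodes,
# edges weighted by geographic proximity, node features = recent
# weekly infection rates. All numbers below are illustrative.
rng = np.random.default_rng(0)

n_regions, n_weeks = 5, 8                 # hypothetical: 5 regions, 8 weeks of history
X = rng.random((n_regions, n_weeks))      # rows = regions, cols = weekly infection rates

# Adjacency: nearer regions get larger weights (toy symmetric distances).
dist = rng.uniform(50, 500, size=(n_regions, n_regions))
dist = (dist + dist.T) / 2
np.fill_diagonal(dist, np.inf)            # no self-edges; the self term is added below
A = 1.0 / dist                            # proximity weight = inverse distance
A = A / A.sum(axis=1, keepdims=True)      # row-normalize neighbor influence

# One message-passing step: each region's representation becomes a mix
# of its own history and a weighted average of its neighbors' histories.
self_weight = 0.5
H = self_weight * X + (1 - self_weight) * A @ X

# Linear readout (random weights here; in practice these would be
# learned from historical CDC-style surveillance data).
W = rng.standard_normal(n_weeks) * 0.1
next_week_pred = H @ W

print("Predicted next-week infection signal per region:", next_week_pred.round(3))
```

Row-normalizing the adjacency matrix keeps each region’s prediction a weighted average of its neighbors rather than a raw sum, so heavily connected regions aren’t artificially inflated.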
“By enabling better resource allocation and public health planning,” says Ning, “this tool will have a big impact on how we cope with influenza outbreaks.”
Ning also designed the AI to report back on itself, yielding insights into its predictions that help the researchers as they continue to refine it.
“Transparency in AI models is of the utmost importance,” she explains. “Where most other AI forecasts use ‘black box’ algorithms, we’re able to explain why our system has made specific predictions — and how it thinks outbreaks in different locations are impacting one another.”
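One common way attention-based graph models offer this kind of transparency is by exposing their attention weights: scores that say how much each neighboring region contributed to a forecast. The snippet below is a hypothetical illustration of that general idea, not Ning’s actual mechanism; the region names and embeddings are made up.

```python
import numpy as np

# Illustrative only: printing per-neighbor attention scores is one way
# a graph model can "explain" a forecast. Names and values are toy.
rng = np.random.default_rng(1)
regions = ["NJ", "NY", "PA", "CT"]
H = rng.random((4, 6))                   # toy per-region history embeddings

target = 0                               # explain the forecast for "NJ"
scores = H @ H[target]                   # similarity of each region to the target
scores[target] = -np.inf                 # exclude self from neighbor attention
attn = np.exp(scores) / np.exp(scores).sum()   # softmax over neighbors

for r, a in zip(regions, attn):
    if r != regions[target]:
        print(f"{r} contributed {a:.1%} to the {regions[target]} forecast")
```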
Ning’s team has conducted preliminary analyses of factors such as international travel, quarantine policy and news reports to see if the team’s AI could be applied to COVID-19 outbreak predictions.
“It’s too early to tell if it can predict [COVID-19] outbreaks as well as it does influenza,” she notes. “Having location-coded data on vaccination rates would be very helpful. But sourcing that information isn’t easy.”
Spotting medical red flags in time
Ning’s team also develops models to predict medical diagnoses based on known and learned patterns in medical records.
Working with partners such as IBM and Virginia Tech, she has co-developed an AI that discovers hidden factors behind worsening health conditions in a variety of data sources, including clinical notes, domain knowledge and inpatient visit histories, then uses that framework to predict when a patient is likely to return to the hospital and for what medical condition.
“Let’s say someone with hypertension is likely to be diagnosed with heart failure in the future,” she says. “How do we capture that relationship? How do we capture the ontology of what clinical practitioners know to be true? Maybe this disease belongs to the same category as that one. We can build that into the graphs.”
And, once again, the GNN flavor of AI proves highly useful for the task.
“We can build a graph neural network between diseases and patients. We can focus on specific diseases, such as heart failure, and predict the likelihood of readmission to the hospital by studying big data of many medical histories and events.”
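As a rough illustration of how such a graph might encode clinical knowledge, the sketch below links patients to past diagnoses and diseases to related diseases, then scores readmission risk by walking those edges. It is a toy stand-in with invented diseases, patients and weights, not the co-developed model itself.

```python
# Toy sketch of the idea, not the published model: a bipartite graph
# linking patients to diagnoses, plus disease-to-disease edges encoding
# clinical ontology (e.g., hypertension raises heart-failure risk).
# All diseases, patients and weights below are invented.

ontology = {  # disease -> {related disease: ontology/learned weight}
    "hypertension": {"heart failure": 0.6, "stroke": 0.3},
    "diabetes": {"heart failure": 0.4, "kidney disease": 0.5},
    "asthma": {"copd": 0.4},
}

patients = {  # patient -> past diagnoses drawn from visit history
    "patient_a": ["hypertension", "diabetes"],
    "patient_b": ["asthma"],
}

def readmission_risk(patient: str, target: str) -> float:
    """Score the risk that `patient` returns with the `target` diagnosis."""
    risk = 0.0
    for dx in patients[patient]:
        risk += ontology.get(dx, {}).get(target, 0.0)
    return min(risk, 1.0)  # clamp the toy score to [0, 1]

for p in patients:
    print(p, "heart-failure readmission score:", readmission_risk(p, "heart failure"))
```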
The tool, she says, could prove useful for medical teams and personal physicians, for example in predicting patients’ future health risks and recommending dietary or other lifestyle changes.
Toward a ‘socially responsible AI’
Ning is also passionate about what she calls “socially responsible AI”: AI that can move online communities into increasingly healthy, non-confrontational, truthful and bias-free discussions.
Like most computer scientists, Ning would like the learning models she works with to be fairer and more equitable in order to produce better results.
“All data have bias,” she says. “As scientists, we want to de-bias data as much as we can. As people, it’s also the right thing to do.”
For one recent project, she created an AI that can spot anti-Asian hate speech in online discussions and other written communications by inspecting hashtags and their relationships to the content and users associated with them.
Because machine learning can be trained to recognize sentiment, the AI can predict which comments are likely hateful.
“That can be difficult; sometimes a hashtag is used for an opposite meaning, as irony,” says Ning. “But overall, we find we can utilize relationships to make predictions about whether a tweet is hate speech, counter-hate speech or neutral.”
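A heavily simplified sketch of the hashtag-relationship idea might look like the following: tweets are scored by the tags they share with seed sets of known hateful and counter-hate hashtags, then assigned to one of the three classes Ning describes. The seed lists here are placeholders; the real system learns these relationships from data rather than hard-coding them.

```python
# Illustrative sketch, not the team's classifier: label a tweet by
# comparing its hashtags against seed sets of known hateful and
# counter-hate tags. Seed lists are invented placeholders.

HATE_TAGS = {"#toyhatetag1", "#toyhatetag2"}
COUNTER_TAGS = {"#toycountertag1"}

def classify(tweet_tags: set[str]) -> str:
    """Return 'hate speech', 'counter-hate speech' or 'neutral'."""
    hate = len(tweet_tags & HATE_TAGS)
    counter = len(tweet_tags & COUNTER_TAGS)
    if hate > counter:
        return "hate speech"
    if counter > hate:
        return "counter-hate speech"
    return "neutral"

print(classify({"#toyhatetag1", "#news"}))   # -> hate speech
print(classify({"#toycountertag1"}))         # -> counter-hate speech
print(classify({"#weather"}))                # -> neutral
```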
Her team is also developing tools to flag so-called “fake news.”
Working with several Ph.D. students, Ning created and trained a machine learning model on open-source data sets such as FakeNewsNet, Celebrity and Twitter, relying on encoding models from natural language processing.
"You can train your model to learn what is real and what is fake,” says Ning. “My team has been focused on detecting fake news by looking at the content. For example, the writing style, the grammar, the word usage.”
Her team incorporates knowledge graphs into algorithmic processes to better indicate connections between related pairs of entities.
“Sometimes, when a news article is fake,” says Ning, “it’s because fake relationships between entities are being presented as though true. We think using knowledge graphs to organize relationships among entities can improve fake-news detection a great deal.”
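In miniature, the knowledge-graph check can be imagined like this: trusted (subject, relation, object) triples are stored, claimed relationships are extracted from an article, and any claim the graph cannot support is flagged. The facts and claims below are invented placeholders, and a real pipeline would extract the triples with NLP rather than by hand.

```python
# Minimal sketch of the knowledge-graph idea: store trusted
# (subject, relation, object) triples, then flag an article whose
# claimed relationships aren't supported. Entities are invented.

trusted_facts = {
    ("alice_corp", "headquartered_in", "city_x"),
    ("alice_corp", "founded_by", "person_y"),
}

def check_claims(claims):
    """Return the claimed triples the knowledge graph does not support."""
    return [c for c in claims if c not in trusted_facts]

article_claims = [
    ("alice_corp", "headquartered_in", "city_x"),   # supported
    ("alice_corp", "founded_by", "person_z"),       # unsupported -> suspicious
]

print("Suspicious claims:", check_claims(article_claims))
```

A production system would also have to handle paraphrase, partial matches and facts missing from the graph, which is where the learned graph representations come in.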
Fake relationships are also a hallmark of conspiracy theories that propagate falsehoods, she adds, so algorithms designed to detect fake news might also identify false conspiracy stories as they blossom online.
A tool like this could help social media moderators cull false information from online platforms in real time, before it can be widely dispersed and cause harm.
Predicting strikes, protests, political events
Ning’s teams have also created AI systems that can analyze and predict political and social phenomena such as mass protests, strikes or labor stoppages.
"You can try to predict the number of political or societal events in a given location, but the causal factors are the key —and different locations in the world have very different reasons for unrest,” she points out.
“In Thailand, for example, a rice scandal provoked large-scale riots. In other countries, educational protests sometimes break out due to the growing influence of democracy. The January 6 Capitol riot in Washington had other causes.”
The price of gas, an increase in corporate layoffs, a vivid example of police brutality or passage of a controversial new law could also trigger demonstrations or more violent incidents.
To spot precursors of unrest, the team feeds the system large quantities of news coverage to process, then adds broader, longer-term factors such as economic conditions or changes in political leadership to the data mix.
The eventual goal: a score or index that changes with events and signals the rising possibility of protests or other societal upheaval, information that could inform security planning.
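A bare-bones version of such an index, assuming made-up keywords and weights rather than anything from Ning’s system, might blend a fast signal (precursor terms counted in recent headlines) with a slow one (an economic-stress value), as sketched below.

```python
# Illustrative unrest index, not Ning's system: count precursor
# keywords in recent headlines (fast signal) and blend in a
# slow-moving economic indicator. Keywords and weights are placeholders.

PRECURSORS = {"strike": 1.0, "protest": 1.5, "layoffs": 0.8, "riot": 2.0}

def unrest_index(headlines: list[str], econ_stress: float) -> float:
    """Score recent headlines plus an economic-stress value in [0, 1]."""
    fast = sum(
        weight
        for h in headlines
        for word, weight in PRECURSORS.items()
        if word in h.lower()
    )
    return 0.7 * fast + 0.3 * (econ_stress * 10)

week1 = ["Local weather improves", "New park opens"]
week2 = ["Union announces strike vote", "Protest planned over layoffs"]

print("Quiet week index:", unrest_index(week1, econ_stress=0.2))
print("Tense week index:", unrest_index(week2, econ_stress=0.6))
```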
“As in all our work, we want to find those clues and signals that have higher predictive value,” she concludes. “That’s simply what we’re trying to get the system to do.”