An impassioned op-ed about, say, a social justice issue is intended to move you. It’s deliberately crafted to elicit an emotional response. But it’s also supposed to be true. An op-ed in a fact-checked newspaper or magazine may be both emotional and informational.
That’s where it differs from fake news, which is high on emotion and low on truth. That’s one of the characteristic “tells” of fake news that K.P. (Suba) Subbalakshmi, founding director of Stevens Institute for Artificial Intelligence and professor of electrical and computer engineering, has found in her work using artificial intelligence (AI) and machine learning to separate truth from fiction online.
Subbalakshmi has designed an AI engine for detecting fake news. To understand what she needed to teach it to look for, she combed the technology literature for qualitative studies of fake news. A few characteristics emerged.
One is higher emotional content. “Fake news tends to be extremely provocative and emotionally out there on the negative side,” she said. “And when a piece of news or a post sounds like that, it tends to propagate faster and wider. Because of that, we found a way to extract these emotions automatically, without human intervention.”
Another is vocabulary complexity. “How rich is the vocabulary in that tweet? Is it sophisticated? Or is it pitched at a very low level?” Less sophisticated language is associated with fake news. Fake news also tends to have a high imagery score, which measures a text’s ability to create a vivid picture in your mind. “All these things tend to be higher on a fake news piece,” she said.
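One simple way to put a number on vocabulary richness is the type-token ratio: the share of distinct words among all the words in a text. The sketch below is only an illustration of that general idea. The article does not say how Subbalakshmi's engine measures vocabulary, and the function name and the word-splitting rule are assumptions made for this example.

```python
import re


def type_token_ratio(text: str) -> float:
    """Return distinct words divided by total words (0.0 for empty text).

    A crude, illustrative proxy for vocabulary richness. It is NOT the
    measure used by the researchers, which the article does not describe.
    """
    # Assumption: a "word" is a run of letters or apostrophes, lowercased.
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return len(set(words)) / len(words)


# Repetitive wording scores lower than varied wording.
print(type_token_ratio("the cat the cat"))   # 0.5
print(type_token_ratio("a varied sentence")) # 1.0
```

The ratio shrinks as texts get longer, so it is only meaningful when comparing passages of similar length, such as individual tweets.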
Another telling characteristic is related less to the content itself and more to when it was shared, Subbalakshmi said. On Twitter, for instance, they measure the amount of time between an account’s creation and the tweeting of fake news. Bots move fast: fake news starts to emerge from them almost immediately.
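The timing signal can be sketched as a simple subtraction of two timestamps. The example below assumes both times are already available as Python datetime objects; the function name and the sample timestamps are hypothetical, and the article does not describe how the researchers compute or threshold this value.

```python
from datetime import datetime, timezone


def hours_since_creation(account_created: datetime, posted: datetime) -> float:
    """Hours elapsed between an account's creation and one of its posts."""
    delta = posted - account_created
    if delta.total_seconds() < 0:
        raise ValueError("post predates account creation")
    return delta.total_seconds() / 3600.0


# Hypothetical timestamps, for illustration only.
created = datetime(2020, 3, 1, 12, 0, tzinfo=timezone.utc)
posted = datetime(2020, 3, 1, 12, 30, tzinfo=timezone.utc)
print(hours_since_creation(created, posted))  # 0.5
```

A small value on its own proves nothing about a post; it would be one input among several, alongside the content-based signals.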
“We feed these as some of the parameters into the engine that detects whether something's fake or not,” Subbalakshmi said. “And it has worked.”
Of course, none of these alone is necessarily an indication of fake news; any one of them might be considered a sign of good writing: accessible, evocative, or packing an emotional punch. But fake news often combines all of them.
Another tactic for unearthing fake news is to focus on the relationships between entities. We know some stories are false because we understand that the relationships they assert between entities are false.
But can AI detect fake news by identifying the fake relationships between entities? That’s what Yue Ning, Stevens Institute of Technology assistant professor of computer science, hopes the algorithm she is developing can do.
Ning and her Ph.D. students at the Schaefer School of Engineering & Science train their AI on open-source data sets such as FakeNewsNet, Celebrity, and Twitter, relying on natural language processing encoding models like Google's BERT (Bidirectional Encoder Representations from Transformers).
Created by Arizona State University, FakeNewsNet’s stories are labeled either true or false. “So you can train your model to learn what is real and what is fake,” Ning said. “My team has been focused on detecting fake news by looking at the content. For example, the writing style, the grammar, the word usage.”
They’re also incorporating knowledge graphs into the algorithm. A knowledge graph organizes information about connections between pairs of entities. “We believe that sometimes when a news article is fake, it’s not only because of the way they write, but because they make up some fake relations between entities—between, for example, celebrities, or politicians. We think using knowledge graphs as a way to organize relationships among entities can improve the fake-news detection performance.”
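A knowledge graph of this kind is often represented as a set of (subject, relation, object) triples. The sketch below shows, under that assumption, how relations claimed in an article could be compared against known ones. The entity names, the `Triple` type, and the function are all hypothetical; this is not Ning's algorithm, which the article does not detail.

```python
from typing import Iterable, List, NamedTuple, Set


class Triple(NamedTuple):
    """One edge of a knowledge graph: subject --relation--> object."""
    subject: str
    relation: str
    obj: str


def unsupported_triples(claimed: Iterable[Triple], known: Set[Triple]) -> List[Triple]:
    """Return the claimed relations that do not appear in the known graph."""
    return [t for t in claimed if t not in known]


# Hypothetical entities and relations, for illustration only.
known_graph = {
    Triple("Person A", "works_for", "Company X"),
    Triple("Person B", "married_to", "Person C"),
}
article_claims = [
    Triple("Person A", "works_for", "Company X"),
    Triple("Person A", "married_to", "Person C"),
]
print(unsupported_triples(article_claims, known_graph))
```

A relation missing from the graph is not proof of fabrication, since no graph is complete. In a detection model, such a mismatch would more plausibly serve as one feature alongside writing style, grammar, and word usage.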
These fake relationships are the hallmarks of conspiracy theories: ideas that propagate falsehoods by asserting that plots, or secret plans or agreements (especially by political powers), are afoot when other explanations are more likely. Fake news fuels such theories, so much so that fake news and conspiracy theories are almost interchangeable terms. Because of these similarities, algorithms designed to detect fake news could also be used to identify conspiracy theories.
Despite its seemingly high-tech foundation, mass data gathering actually requires a lot of human labor, Subbalakshmi said, so many researchers rely on institutions with the capacity to create massive open-source data sets. This can lead to a time lag in the kinds of fake news they’re able to analyze. Fabrications connected to some of 2020’s biggest fake-news-generating topics await analysis in the future.
Ning pointed out that their focus is less on the specific kind of fake news than on developing new technologies to find it in the first place. Nevertheless, they’ve had their algorithms comb through one public data set for COVID news and posts. “We focus on a different problem, which is, we are detecting hate speech against Asians in COVID-19. So it's not necessarily related to misinformation or fake news. But I think if we do have some COVID-19 related misinformation data, we will be able to apply the technologies we developed on that specific data and find out if there is any correlation between geo locations and fake news outbreak or fake news spreading, or a correlation between political opinions and the spreading of fake news.”