Lassonde Professor is fighting clickbait and Twitter bots with Artificial Intelligence

Users often encounter fraudulent content online, including misinformation and marketing scams. This is not only frustrating for users, but can also have major consequences at both the global and the personal level, ranging from financial and political damage to cultural and personal disagreements and divides.

Motivated to protect internet users, Uyen Trang Nguyen, an associate professor in the Department of Electrical Engineering & Computer Science at York University’s Lassonde School of Engineering, is developing Artificial Intelligence (AI) systems to prevent the spread of fraudulent content. Specifically, Professor Nguyen is designing systems to strategically detect and target clickbait and Twitter bots, two techniques commonly used to spread misinformation and marketing scams. These systems are developed with a subfield of AI called Machine Learning (ML), which trains computers to extract patterns and knowledge from specific data and learn from it, similar to the way humans read an instruction manual before completing an unfamiliar task.

“I was inspired to start this work because I see the issues that are caused by false information on the internet,” says Professor Nguyen. “Families have serious disagreements, there is interference with our elections, I want to provide protection against these threats.”

To combat the spread of fraudulent content and misinformation, Professor Nguyen is developing a system that analyzes the relationships between words in an article or on a webpage to detect clickbait. This system uses a combination of methods that have not been used for clickbait detection before: a neural network that mimics the brain’s ability to recognize patterns and regularities in data, coupled with human semantic knowledge of language to understand the relationships between words. While analyzing an article or webpage, the system relies on a graph that represents the semantic relationships between words and uses this information to correlate the title of the article or webpage with its content. If the title and content do not match, the item is labelled as clickbait.

An example of how Professor Nguyen’s clickbait detection model distinguishes non-clickbait from clickbait.
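
The title-versus-content check described above can be illustrated with a small sketch. This is not Professor Nguyen’s model: it replaces the neural network with a simple overlap score, and the semantic graph, the stopword list, the function names and the 0.5 threshold are all hypothetical choices made for illustration.

```python
import re

# Hypothetical toy "semantic graph": each word maps to words treated as related.
# A real system would rely on a far larger lexical resource and a trained network.
SEMANTIC_GRAPH = {
    "car": {"vehicle", "automobile", "driver"},
    "recall": {"defect", "safety", "repair"},
    "doctor": {"physician", "medicine", "hospital"},
}

# Small illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "for", "on",
             "this", "that", "with", "will"}


def tokenize(text):
    """Lowercase the text and keep alphabetic words that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def related(a, b):
    """Two words are related if they are equal or linked in the graph."""
    return (a == b
            or b in SEMANTIC_GRAPH.get(a, set())
            or a in SEMANTIC_GRAPH.get(b, set()))


def title_content_match(title, content):
    """Fraction of title words that have a related word in the content."""
    title_words = tokenize(title)
    content_words = set(tokenize(content))
    if not title_words:
        return 0.0
    matched = sum(1 for t in title_words
                  if any(related(t, c) for c in content_words))
    return matched / len(title_words)


def is_clickbait(title, content, threshold=0.5):
    """Label as clickbait when the title is poorly supported by the content."""
    return title_content_match(title, content) < threshold


honest = ("Car maker announces safety recall",
          "The automobile company will repair a defect in 10,000 vehicles "
          "after a safety review.")
bait = ("Doctors hate this one weird trick",
        "Sign up today to win a free phone and gift cards.")

print(is_clickbait(*honest))  # title words are supported by the body
print(is_clickbait(*bait))    # title words have no support in the body
```

In this sketch the graph lets "car" match "automobile" and "recall" match "defect", so a title can be supported by content that never repeats its exact words.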

Focusing on the spread of disinformation and misinformation on Twitter, Professor Nguyen is also developing a complex system to detect Twitter bots. Although Twitter bots can be used for convenient, automated updates of blogs and news, they are also used to spread malicious content. Professor Nguyen’s proposed system combines natural language processing with a recurrent neural network. The two work together to analyze tweet content: natural language processing allows the system to understand text the way humans do, while the recurrent neural network helps the system identify language patterns used by bots. Using these methods, the system can distinguish a Twitter bot from a legitimate Twitter account.
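
To give a sense of how a recurrent neural network reads a tweet, the sketch below runs the forward pass of a very small recurrent network over a tweet’s tokens and turns the final hidden state into a score between 0 and 1. It is an illustration only, not Professor Nguyen’s system: the class name `TinyRNN`, the layer sizes and the seeded pseudo-random embeddings are hypothetical, and the weights are untrained, so the scores carry no meaning until the network is fitted to labelled bot and human tweets.

```python
import math
import random
import re

HIDDEN_SIZE = 8  # hypothetical sizes, chosen only to keep the example small
EMBED_SIZE = 6


def tokenize(tweet):
    """Split a tweet into lowercase tokens, keeping URLs, mentions and hashtags whole."""
    return re.findall(r"https?://\S+|[@#]?\w+", tweet.lower())


def embed(token):
    """Map a token to a fixed pseudo-random vector (a stand-in for learned embeddings)."""
    rng = random.Random(token)  # seeding with the token makes this deterministic
    return [rng.uniform(-1, 1) for _ in range(EMBED_SIZE)]


def make_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class TinyRNN:
    """Untrained Elman-style recurrent network: h_t = tanh(W_in x_t + W_rec h_{t-1})."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.w_in = make_matrix(HIDDEN_SIZE, EMBED_SIZE, rng)
        self.w_rec = make_matrix(HIDDEN_SIZE, HIDDEN_SIZE, rng)
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(HIDDEN_SIZE)]

    def step(self, hidden, x):
        """Combine the current token vector with the previous hidden state."""
        return [
            math.tanh(
                sum(w * xi for w, xi in zip(self.w_in[i], x))
                + sum(w * hj for w, hj in zip(self.w_rec[i], hidden))
            )
            for i in range(HIDDEN_SIZE)
        ]

    def bot_score(self, tweet):
        """Read the tweet token by token and squash the final state into (0, 1)."""
        hidden = [0.0] * HIDDEN_SIZE
        for token in tokenize(tweet):
            hidden = self.step(hidden, embed(token))
        logit = sum(w * h for w, h in zip(self.w_out, hidden))
        return 1.0 / (1.0 + math.exp(-logit))


model = TinyRNN(seed=0)
print(model.bot_score("Win a FREE phone!!! http://spam.example #giveaway"))
```

Because each hidden state depends on the previous one, the network’s output reflects the order of the words as well as which words appear, which is what allows a trained recurrent network to pick up on language patterns.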

Using these proposed systems to detect clickbait and Twitter bots, network administrators at companies such as Google or Twitter would be able to slow down or prevent the spread of fraudulent content before it reaches more internet users. Professor Nguyen is also developing an added feature to make these systems easier to use: explainability, which allows the systems to explain the reasoning behind their decisions. “It’s hard for people to trust artificial intelligence – it’s a computer, not a person,” says Professor Nguyen. “I want to make sure these systems can explain what they are doing, so we can build trust in AI.”

Professor Nguyen is working on additional improvements to her artificial intelligence systems, including a feature that will permit her Twitter-bot detection system to distinguish between harmful and harmless bots. She is also applying machine learning methods to develop a system that can support financial institutions by detecting money laundering transactions.