
Zining Zhu Helps Developers and Users Navigate 'Adversarial Helpfulness' in Artificial Intelligence

A Stevens computer science researcher and his team are exploring the unreliability of AI models

Artificial intelligence platforms such as ChatGPT and Notion show great promise in facilitating research and problem-solving. However, even when their answers seem accurate, they can be alarmingly unreliable.

Zining Zhu, assistant professor in the Department of Computer Science at Stevens Institute of Technology, has taken on the challenge of understanding the reasons for this fallibility and offering advice to avoid being misled. The project team also includes Stevens Ph.D. student Shashidhar Reddy Javaji ’27, who earned his master’s degree in computer science last year from the University of Massachusetts; Rohan Ajwani, a recent University of Toronto graduate with a master’s degree in computer engineering; and Frank Rudzicz, an associate computer science professor at the Vector Institute.

Their groundbreaking paper on the accuracy and limitations of large language models (LLMs), "LLM-Generated Black-Box Explanations Can Be Adversarially Helpful," is being published and presented as a poster at a machine learning workshop at the 2024 Conference on Neural Information Processing Systems (NeurIPS) in Vancouver.

Artificial intelligence isn't always as smart as it seems

Large language models can be powerful tools for problem-solving and knowledge generation. The trouble is, their correct and incorrect explanations can be equally convincing, leading people—and other LLMs—to trust these mistakes. Zhu and his team describe this phenomenon as "adversarial helpfulness."

"One day, I made a typo when entering a problem into an LLM, and to my surprise, it still explained the problem smoothly," Zhu recalled. "That’s when I realized that the 'helpfulness' of these tools could work against us. Further testing proved this effect was prevalent. It’s alarming—whether you’re a professional making high-stakes decisions, a researcher attempting to solve a scientific challenge, or a child seeking to learn about the world."

Assistant Professor Zining Zhu

Javaji then documented how disturbingly often LLMs create misleading but believable explanations by reframing questions, projecting unwarranted confidence and cherry-picking evidence to support incorrect answers.

"I am thrilled to work on this innovative and impactful research," Javaji said, "leveraging our cutting-edge resources at Stevens to advance our understanding of LLM limitations and develop more reliable AI systems for a better future."

Zhu and his team continued their investigation by designing a task involving graph analysis, a known challenge for these models. They asked the AI to find alternative paths in graphs. Even as the models struggled with these basic assignments, they confidently produced incorrect answers, further exposing the limits of their logical reasoning.
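
The paper describes the full task setup; purely as a rough illustration of the kind of check involved, the Python sketch below builds a toy graph and tests whether a model-proposed "alternative path" actually follows real edges. The graph, the proposed path and the helper function are invented for illustration, not taken from the study.

```python
# Hypothetical illustration: verifying a model-proposed "alternative path"
# in a small undirected graph. The graph and path are invented examples,
# not data from the Stevens study.

def is_valid_path(adjacency, path, start, goal):
    """Return True if `path` walks from start to goal along real edges."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    return all(b in adjacency.get(a, set()) for a, b in zip(path, path[1:]))

# A toy graph: nodes A through E with a few edges.
graph = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A", "D"},
    "D": {"B", "C", "E"},
    "E": {"D"},
}

# Suppose an LLM, asked for a path from A to E avoiding B, answers A -> C -> E.
llm_answer = ["A", "C", "E"]  # stated confidently, but C-E is not an edge
print(is_valid_path(graph, llm_answer, "A", "E"))  # False: the path is invalid
```

A model can state an invalid path like this fluently and confidently, which is exactly the failure mode the team observed.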

The researchers also discovered that "black-box" explanations, the common practice of giving an LLM only the problem and an answer and asking it to justify that answer without revealing any reasoning process, further clouded the integrity of the responses.
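
The study's actual prompts appear in the paper; as a hedged sketch of the distinction only, the snippet below contrasts a black-box explanation prompt, which supplies just the question and an answer, with one that asks the model to show its reasoning first. Both templates are illustrative assumptions, not the researchers' prompts.

```python
# Illustrative prompt templates only; these are not the prompts from the paper.

def black_box_prompt(question: str, answer: str) -> str:
    # The model sees only the question and a (possibly wrong) answer,
    # and is asked to justify it with no visible reasoning process.
    return f"Question: {question}\nAnswer: {answer}\nExplain why this answer is correct."

def reasoning_prompt(question: str) -> str:
    # The model must show its work before committing to an answer,
    # making flawed logic easier for a reader to spot.
    return f"Question: {question}\nThink through the problem step by step, then state your answer."

# A black-box prompt can invite a fluent defense of a wrong answer (17 * 24 is 408, not 418).
print(black_box_prompt("What is 17 * 24?", "418"))
```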

Building trust for a safer future

As AI tools evolve to function like enhanced search engines that can deliver comprehensive explanations, how can developers and users counteract adversarial helpfulness?

"Ask LLMs to explain multiple perspectives," Zhu recommended. "Seek transparency in the AI’s decision-making process. Above all, fact-check the outputs."

This isn’t just an academic exercise. As AI tools become increasingly integral to education, professional decision-making, and daily life, ensuring their accuracy and reliability is crucial.

"Our research aligns with Stevens Institute’s mission to address critical societal challenges through technological innovation," Zhu noted. "I hope this work contributes to making AI tools safer and more effective for everyone."

