Towards Interpretable and Controllable AI


Department of Computer Science

Location: Gateway North 303 (with virtual option)

Speaker: Zining Zhu, Ph.D. Candidate, Department of Computer Science, University of Toronto


In recent years, AI systems based on deep neural networks have become prevalent. However, their strong capabilities raise concerns about trustworthiness: Can we understand their internal mechanisms and control their behavior for beneficial purposes? My current work aims to put the probing of AI models on a more solid footing and to control AI systems so that they generate natural language explanations. Research towards interpretable and controllable AI can help us build safe and helpful AI assistants.



Zining Zhu is a Ph.D. candidate at the University of Toronto. He is interested in understanding the mechanisms and abilities of neural network AI systems, with the long-term goal of building reliable, interpretable, and trustworthy AI. His work includes developing methods to steer AI systems toward correct decisions and helpful content. He looks forward to empowering real-world applications with advances in AI technology.

Zoom Link: