Towards LLM Privacy and Safety Grounding


Department of Computer Science

Location: Gateway North 303 and Zoom

Speaker: Haoran Li, Postdoctoral Fellow, The Hong Kong University of Science and Technology (HKUST)

ABSTRACT

The rise of large language models (LLMs) brings new privacy and safety challenges that go beyond sensitive pattern detection. This talk advances LLM privacy and safety grounding, aiming to align model behavior with contextual norms and legal principles. We identify key challenges facing current research, including the absence of rule-grounded datasets, reasoning frameworks, and policy-compliant models. We then propose solutions that integrate contextual integrity theory, retrieval-augmented generation, and reinforcement learning to address these challenges. By developing benchmarks and reasoning pipelines for context-aware privacy and safety assessment, we demonstrate existing LLMs' limitations in privacy and safety grounding. Lastly, we implement safety-grounding models for legal compliance, agentic pipelines, and general policies. By grounding LLMs in explicit policies rather than implicit patterns, this work lays the foundation for policy-aligned and generalizable AI safety mechanisms.

BIOGRAPHY


Haoran Li is currently a postdoctoral fellow at The Hong Kong University of Science and Technology (HKUST) and a visiting scholar at the Digital Life Initiative (DLI) at Cornell Tech. He received his B.Eng. and Ph.D. degrees in Computer Science from HKUST, where he was advised by Professor Yangqiu Song. His research focuses on the privacy and safety of large language models (LLMs), including attacks and defenses related to jailbreaks, prompt injections, information leakage, and backdoors. His recent work explores contextualized safety and privacy to safeguard foundation models while aligning them with policies, regulations, and social norms. Haoran is a recipient of the Jockey Club STEM Early Career Research Fellowship, and his research on contextualized privacy received the Outstanding Paper Award at EMNLP 2024.

At any time, photography or videography may be occurring on Stevens’ campus. Resulting footage may include the image or likeness of event attendees. Such footage is Stevens’ property and may be used for Stevens’ commercial and/or noncommercial purposes. By registering for and/or attending this event, you consent and waive any claim against Stevens related to such use in any media. See Stevens' Privacy Policy for more information.