* co-first author, † corresponding author
Preprint
-
Turning Shields into Swords: Leveraging Safety Policies for LLM Safety Testing.
Xiaoyue Lu*, Xianglin Yang*, Haijun Liu, Kuntai Cai, Meiyi An, Jun Yuan, Yan XIAO, Jin Song Dong.
Preprint 2025.
-
Beyond Manuals and Tasks: Instance-Level Context Learning for LLM Agents.
Kuntai Cai, Juncheng Liu, Xianglin Yang, Zhaojie Niu, Xiaokui Xiao, Xing Chen.
Preprint 2025.
-
Sparse Layer Sharpness-Aware Minimization for Efficient Model Fine-Tuning.
Yifei Cheng, Xianglin Yang, Li Shen.
Preprint 2025.
-
TraceAegis: Securing LLM-based Agents via Hierarchical and Behavioral Anomaly Detection.
Jiahao Liu, Bonan Ruan, Xianglin Yang†, Zhiwei Lin, Yan Liu, Yang Wang, Tao Wei, Zhenkai Liang.
Preprint 2025.
-
Turning Bias into Bugs: Bandit-Guided Style Manipulation Attacks on LLM Judges.
Xianglin Yang, Bryan Hooi, Gelei Deng, Tianwei Zhang, Jin Song Dong.
Preprint 2025.
-
Make Your Guard Learn to Think: Defending Against Jailbreak Attacks with Safety Chain-of-Thought.
Xianglin Yang, Gelei Deng, Jieming Shi, Tianwei Zhang, Jin Song Dong.
Preprint 2025.
-
Neural Surveillance: Live-Update Visualization of Latent Training Dynamics.
Xianglin Yang, Jin Song Dong.
Preprint 2024.
2025
-
When Audio and Text Disagree: Benchmarking Text Bias in Large Audio-Language Models under Cross-Modal Inconsistencies.
Cheng Wang, Gelei Deng, Xianglin Yang, Han Qiu, Tianwei Zhang.
EMNLP 2025.
2023
-
DeepDebugger: An Interactive Time-Travelling Debugging Approach for Deep Classifiers.
Xianglin Yang, Yun Lin, Yifan Zhang, Linpeng Huang, Jin Song Dong, Hong Mei.
ESEC/FSE 2023.
-
Thompson Sampling with Less Exploration is Fast and Optimal.
Tianyuan Jin, Xianglin Yang, Xiaokui Xiao, Pan Xu.
ICML 2023.
2022
-
Debugging and Explaining Metric Learning Approaches: An Influence Function Based Perspective.
Ruofan Liu, Yun Lin, Xianglin Yang, Jin Song Dong.
NeurIPS 2022.
-
Temporality Spatialization: A Scalable and Faithful Time-Travelling Visualization for Deep Classifier Training.
Xianglin Yang, Yun Lin, Ruofan Liu, Jin Song Dong.
IJCAI 2022.
[code]
[website]
-
Inferring Phishing Intention via Webpage Appearance and Dynamics: A Deep Vision Based Approach.
Ruofan Liu, Yun Lin, Xianglin Yang, Siang Hwee Ng, Dinil Mon Divakaran, Jin Song Dong.
USENIX Security 2022.
[code]
[website]
-
DeepVisualInsight: Time-Travelling Visualization for Spatio-Temporal Causality of Deep Classification Training.
Xianglin Yang*, Yun Lin*, Ruofan Liu, Zhenfeng He, Chao Wang, Jin Song Dong, Hong Mei.
AAAI 2022 [oral presentation, 4.5%].
[paper]
[video]
[code]
[website]