COPSD

Constitutional On-Policy Safe Distillation

Safer distillation. Less safety tax.

Ming Wen [1,2,3,*], Yuxuan Liu [3,4,*], Kun Yang [3,4], Yunhao Feng [3], Zhuoer Xu [3], Yuhao Sun [3], Shiwen Cui [3], Xiang Zheng [5], Xingjun Ma [1,2,†], Yu-Gang Jiang [1,†]
[1] Institute of Trustworthy Embodied AI, Fudan University   |   [2] Shanghai Innovation Institute   |   [3] Ant Group   |   [4] Zhejiang University   |   [5] City University of Hong Kong
* Equal contribution. † Corresponding authors.

Why COPSD?

Augmented OPSD within Safety Boundaries: eliminates geometric leakage, preventing response collapse and entropy collapse.
Comprehensive 12-Benchmark Evaluation: across 12 benchmarks, our 4B student model surpasses the safety performance of the 235B Oracle.
Native VERL & vLLM Implementation: a high-throughput, cluster-ready production pipeline with accelerated token rollouts.
Anti-Over-Refusal & Context-Aware Data: open-sourced multimodal splits covering subtle real-world environments mitigate hyper-conservatism.

Two Stages

Cross-SFT first calibrates the teacher. Then constitution-conditioned OPSD distills safer behavior without collapsing expressiveness.

Figure: Overview of the COPSD framework. Stage 1: Cross-SFT cold-start. Stage 2: Constitution-conditioned on-policy distillation.
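
To make Stage 2 more concrete, here is a minimal sketch of a generic on-policy distillation objective. It is an illustration under stated assumptions, not the COPSD implementation: this page does not specify the loss, so the sketch assumes the per-token reverse KL divergence often used in on-policy distillation, and it assumes the teacher scores the student's own sampled tokens with the constitution in its context. The function names (`log_softmax`, `reverse_kl`, `on_policy_distill_loss`) are hypothetical.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def reverse_kl(student_logits, teacher_logits):
    """KL(student || teacher) at a single token position."""
    s = log_softmax(student_logits)
    t = log_softmax(teacher_logits)
    return sum(math.exp(ls) * (ls - lt) for ls, lt in zip(s, t))


def on_policy_distill_loss(student_steps, teacher_steps):
    """Mean per-token reverse KL over one student-sampled rollout.

    student_steps[i]: student logits at position i (prompt only).
    teacher_steps[i]: teacher logits at position i for the SAME
        student-sampled prefix. Assumption: the teacher additionally
        sees the constitution in its context.
    """
    if len(student_steps) != len(teacher_steps):
        raise ValueError("student and teacher must cover the same positions")
    if not student_steps:
        raise ValueError("rollout must contain at least one token")
    total = sum(reverse_kl(s, t) for s, t in zip(student_steps, teacher_steps))
    return total / len(student_steps)
```

The loss is zero when the student already matches the teacher at every position and grows as the two distributions diverge. Because the rollout is sampled from the student rather than taken from a fixed dataset, the teacher's signal lands on the states the student actually visits.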

Less Text, More Signal

BibTeX

@article{wen2026copsd,
  title        = {Constitutional On-Policy Safe Distillation},
  author       = {Ming Wen and Yuxuan Liu and Kun Yang and Yunhao Feng and Zhuoer Xu and Yuhao Sun and Shiwen Cui and Xiang Zheng and Xingjun Ma and Yu-Gang Jiang},
  journal      = {arXiv preprint arXiv:2606.03089},
  year         = {2026},
  url          = {https://arxiv.org/abs/2606.03089}
}