THOUGHTTERMINATOR: Benchmarking, Calibrating, and Mitigating Overthinking in Reasoning Models Paper • 2504.13367 • Published 24 days ago • 24
Are You Getting What You Pay For? Auditing Model Substitution in LLM APIs Paper • 2504.04715 • Published Apr 7 • 13
The Hidden Risks of Large Reasoning Models: A Safety Assessment of R1 Paper • 2502.12659 • Published Feb 18 • 7
Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents Paper • 2408.07060 • Published Aug 13, 2024 • 43
Compressing Sentence Representation for Semantic Retrieval via Homomorphic Projective Distillation Paper • 2203.07687 • Published Mar 15, 2022
Protecting Language Generation Models via Invisible Watermarking Paper • 2302.03162 • Published Feb 6, 2023
Weak-to-Strong Jailbreaking on Large Language Models Paper • 2401.17256 • Published Jan 30, 2024 • 16