Seeing Isn't Knowing: Do VLMs Know When Not to Answer Spatial Questions (and Why)? Paper • 2605.30557 • Published May 28 • 12
Is Position Bias in Dense Retrievers Built In or Learned from Data? Paper • 2605.26578 • Published May 26 • 20
QUACK: Questioning, Understanding, and Auditing Communicated Knowledge in Multimodal Social Deduction Agents Paper • 2605.27068 • Published May 26 • 24
CoSPlay: Cooperative Self-Play at Test-Time with Self-Generated Code and Unit Test Paper • 2605.23491 • Published May 22 • 9
GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents Paper • 2604.07429 • Published Apr 8 • 123
An Efficient Heterogeneous Co-Design for Fine-Tuning on a Single GPU Paper • 2603.16428 • Published Mar 17 • 51
SocialOmni: Benchmarking Audio-Visual Social Interactivity in Omni Models Paper • 2603.16859 • Published Mar 17 • 248