arxiv:2606.24636
Xinyu Mao
hector-mao
AI & ML interests
Multimodal Large Language Models, Vision Language Models
Recent Activity
authored a paper 4 days ago
CineCap: Structured Reasoning with Spatio-Temporal Anchors for Cinematographic Video Captioning
updated a model 5 days ago
hector-mao/CineCap-GRPO-8B
Organizations
None yet