- Where Do Deep-Research Agents Go Wrong? Span-Level Error Localization in Agent Trajectories
  Paper • 2606.02060 • Published • 12
- MMG2Skill: Can Agents Distill In-the-Wild Guides into Self-Evolving Skills?
  Paper • 2606.01993 • Published • 5
- NJU-LINK/DR3-Eval
  Viewer • Updated • 100 • 2.16k • 2
- TVIR: Building Deep Research Agents Towards Text--Visual Interleaved Report Generation
  Paper • 2606.02320 • Published • 7
Recent Activity
Papers
MMG2Skill: Can Agents Distill In-the-Wild Guides into Self-Evolving Skills?
TVIR: Building Deep Research Agents Towards Text--Visual Interleaved Report Generation
models 0
None public yet
datasets 12
NJU-LINK/TELBench
Updated • 49 • 1
NJU-LINK/TVIR-Bench
Viewer • Updated • 100 • 31
NJU-LINK/CoVEBench
Viewer • Updated • 626 • 427 • 1
NJU-LINK/WebCompass
Viewer • Updated • 933 • 21.8k • 6
NJU-LINK/ViDiC-1K
Updated • 333 • 5
NJU-LINK/DR3-Eval
Viewer • Updated • 100 • 2.16k • 2
NJU-LINK/CodeTraceBench
Viewer • Updated • 4.32k • 3.07k • 2
NJU-LINK/OmniVideoBench
Viewer • Updated • 1k • 2.87k • 5
NJU-LINK/camerabench_binary
Viewer • Updated • 7.83k • 21
NJU-LINK/MT-Video-Bench
Updated • 102 • 4