Jeff
JiayuJeff
AI & ML interests
None yet
Recent Activity
upvoted a collection 30 minutes ago
awesome-agentic-benchmarks
upvoted a paper about 9 hours ago
GeoBrowse: A Geolocation Benchmark for Agentic Tool Use with Expert-Annotated Reasoning Traces
upvoted a paper about 10 hours ago
PlanBench-XL: Evaluating Long-Horizon Planning of LLM Tool-Use Agents in Large-Scale Tool Ecosystems
Organizations
None yet