
Jim White PRO

jimwhite

AI & ML interests

None yet

Organizations

jimwhite's collections (6)

LiquidAI/LFM2.5-1.2B-Instruct

Text Generation • 1B • Updated 17 days ago • 145k • 622
Token-Level LLM Collaboration via FusionRoute

Paper • 2601.05106 • Published Jan 8 • 40
ryokamoi/Qwen-2.5-7B-FoVer-PRM-old

Text Generation • 8B • Updated Apr 7 • 16 • 1

Turn-PPO: Turn-Level Advantage Estimation with PPO for Improved Multi-Turn RL in Agentic LLMs

Paper • 2512.17008 • Published Dec 18, 2025 • 11
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

Paper • 2601.05242 • Published Jan 8 • 234
ryokamoi/Qwen-2.5-7B-FoVer-PRM-old

Text Generation • 8B • Updated Apr 7 • 16 • 1
Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability

Paper • 2601.18778 • Published Jan 26 • 43

DeepCode: Open Agentic Coding

Paper • 2512.07921 • Published Dec 8, 2025 • 35
nvidia/Nemotron-Pretraining-Code-v2

Viewer • Updated Dec 22, 2025 • 836M • 6.08k • 127
BEAVER: An Efficient Deterministic LLM Verifier

Paper • 2512.05439 • Published Dec 5, 2025 • 36
codefuse-ai/C2LLM-7B

Feature Extraction • 8B • Updated Jan 21 • 40 • 10

Verified Agents

VeriCoT: Neuro-symbolic Chain-of-Thought Validation via Logical Consistency Checks

Paper • 2511.04662 • Published Nov 6, 2025 • 36
Agentic Rubrics as Contextual Verifiers for SWE Agents

Paper • 2601.04171 • Published Jan 7 • 13

Coding Benchmarks

From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence

Paper • 2511.18538 • Published Nov 23, 2025 • 306
SWE-Compass: Towards Unified Evaluation of Agentic Coding Abilities for Large Language Models

Paper • 2511.05459 • Published Nov 7, 2025 • 5
SWE-EVO: Benchmarking Coding Agents in Long-Horizon Software Evolution Scenarios

Paper • 2512.18470 • Published Dec 20, 2025 • 12
DeepResearchEval: An Automated Framework for Deep Research Task Construction and Agentic Evaluation

Paper • 2601.09688 • Published Jan 14 • 128

josancamon/kg-gen-MINE-evaluation-dataset

Viewer • Updated Sep 26, 2025 • 101 • 252 • 5
zilliz/semantic-highlight-bilingual-v1

Token Classification • 0.6B • Updated Jan 15 • 11.1k • 97
DeepResearchEval: An Automated Framework for Deep Research Task Construction and Agentic Evaluation

Paper • 2601.09688 • Published Jan 14 • 128

