
A Structured Benchmark for AI Paper Review
Refine won 90.4% of 1,349 head-to-head matches against single-shot LLM reviewers and scaffolded review systems on 150 economics preprints.
