AI Agents · TOOL
About
A benchmark that challenges AI agents to rebuild complete programs from scratch, given only compiled binaries and documentation. It tests whether language models can reverse-engineer a program and implement a working codebase that reproduces the original program's behavior.
Tags
ai, benchmarking, code-generation, reverse-engineering, language-models, evaluation, programming
Tech Stack
Python