VIBE
AI Agents | Open Source | TOOL | 1d ago | 737

About

A benchmark that challenges AI agents to rebuild complete programs from scratch, given only the compiled binaries and their documentation. It tests whether language models can reverse-engineer a program and implement a working codebase that reproduces the original program's behavior.
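
The listing does not describe how the benchmark scores a rebuilt program. As a rough illustration of what "reproduces the original program's behavior" could mean in practice, the sketch below runs a reference command and a candidate command on the same inputs and reports the fraction of inputs where exit code and stdout agree. The function names (`run`, `behavior_matches`) and the comparison criteria are assumptions made for illustration, not the benchmark's actual harness.

```python
import subprocess
import sys
from typing import List, Sequence, Tuple


def run(cmd: Sequence[str], stdin_text: str, timeout: float = 10.0) -> Tuple[int, str]:
    """Run a command with the given stdin and return (exit code, stdout)."""
    proc = subprocess.run(
        list(cmd),
        input=stdin_text,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout


def behavior_matches(
    reference_cmd: Sequence[str],
    candidate_cmd: Sequence[str],
    test_inputs: List[str],
) -> float:
    """Fraction of inputs on which both commands give the same exit code and stdout.

    Hypothetical scoring rule: a real harness might also compare stderr,
    files written, or timing.
    """
    if not test_inputs:
        return 0.0
    matches = sum(
        run(reference_cmd, text) == run(candidate_cmd, text) for text in test_inputs
    )
    return matches / len(test_inputs)


if __name__ == "__main__":
    # Stand-ins for "original binary" and "rebuilt program": two Python
    # one-liners that both upper-case their stdin.
    reference = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read().upper())"]
    candidate = [sys.executable, "-c", "import sys; print(sys.stdin.read().upper(), end='')"]
    print(behavior_matches(reference, candidate, ["abc", "Hello\n", ""]))
```

Treating the original as a black box and comparing only observable outputs fits the setup described above, where the agent never sees the source code.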

Tags

ai, benchmarking, code-generation, reverse-engineering, language-models, evaluation, programming

Tech Stack

Python
