Agent Almanac · est. 2026 · independent

Benchmarks that tell you what to deploy.

Periodic measurement of frontier and open AI agents on real-world tasks. Each issue locks the variables, runs the agents, publishes the numbers, and ships an open reproduction package the same day.

Latest issue

Latest · Issue I · Code Uniformity · Q2 2026 · Published 2026-05-10 · Methodology v0.1.0

AI code is the outlier in human codebases. AI code is the norm in AI codebases.

48 public OSS repositories, 5 languages, 3 strata of AI authorship intensity. The within-repo AI-vs-human uniformity gap reverses sign across the AI ratio range, and the relationship is roughly linear (Pearson r = 0.58).

Repos sampled: 48
Languages: 5
Functions analyzed: 12,254
Pearson r: +0.58
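
For readers who want to see what the headline statistic measures, here is a minimal sketch of a Pearson correlation between a per-repo AI ratio and a per-repo uniformity gap. It is not the issue's pipeline. The names `ai_ratio` and `uniformity_gap` are hypothetical, as is the assumption that the gap is a single number per repository (AI-function uniformity minus human-function uniformity). The values are made-up toy numbers, not data from the issue.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy values, NOT the issue's data: one entry per repository.
# ai_ratio: assumed share of AI-authored functions in the repo.
# uniformity_gap: assumed (AI uniformity - human uniformity) within the repo.
ai_ratio = [0.05, 0.20, 0.50, 0.80, 0.95]
uniformity_gap = [-0.30, -0.10, 0.05, 0.15, 0.35]

r = pearson_r(ai_ratio, uniformity_gap)
# The gap "reverses sign" if it is negative in some repos and positive in others.
sign_reverses = min(uniformity_gap) < 0 < max(uniformity_gap)
print(f"r = {r:+.2f}, sign reverses: {sign_reverses}")
```

A positive r here only says that the gap tends to rise with the AI ratio. Checking the sign change separately keeps the two claims, direction and reversal, distinct.
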
Read before you trust

Every number challengeable.

Task selection, scoring, statistical handling, hardware tiers, frameworks, and modalities: the rules for each are in writing before any run completes.