CoDraft

04 Mar
What 7,308 Agent Runs Taught Me About Writing Better Skills

SkillsBench tested 84 tasks across 7,308 agent runs and found that curated skills boost performance by 16 percentage points, but 19% of tasks got worse. It makes a useful backdrop for what I have learnt from writing skills myself.