Test Generation Quality
Evaluates each tool's ability to generate meaningful, comprehensive tests for a set of 10 JavaScript/TypeScript functions of varying complexity.
Methodology
Ten functions were provided, ranging from simple utilities to complex async state machines. Each tool was asked to generate comprehensive test suites. The results were scored on edge case coverage, assertion quality, test readability, and mutation testing score (the percentage of injected mutants that the generated tests catch).
| Tool | Edge Cases (%, higher is better) | Mutation Score (%, higher is better) | Readability (/10, higher is better) | False Passes (count, lower is better) |
|---|---|---|---|---|
| Codium / Qodo | 88 | 82 | 8.5 | 1 |
| Claude Code | 85 | 78 | 9 | 0 |
| Cursor | 75 | 70 | 8.2 | 2 |
| GitHub Copilot | 72 | 65 | 7.8 | 3 |
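
To make the mutation score column more concrete, here is a minimal sketch of how such a score is commonly computed: small bugs ("mutants") are injected into the code under test, and the score is the percentage of mutants that make at least one test fail. The `clamp` function, the hand-written mutants, and both suites are hypothetical illustrations, not part of the benchmark, and the benchmark's actual mutation tooling is not stated in the source.

```typescript
// Hypothetical function under test (not one of the 10 benchmark functions).
type Clamp = (value: number, min: number, max: number) => number;

const clamp: Clamp = (value, min, max) => Math.min(Math.max(value, min), max);

// Hand-written mutants: each one introduces a single deliberate bug.
// Real mutation tools generate these automatically; this list is illustrative.
const mutants: Clamp[] = [
  (value, _min, max) => Math.min(value, max),              // lower bound dropped
  (value, min, _max) => Math.max(value, min),              // upper bound dropped
  (_value, min, _max) => min,                              // always returns min
  (value, min, max) => Math.min(Math.max(value, min), max - 1), // off-by-one on max
];

// A suite is modelled as a predicate: true when every assertion passes.
type Suite = (impl: Clamp) => boolean;

// Weak suite: checks a single in-range value only.
const weakSuite: Suite = (impl) => impl(5, 0, 10) === 5;

// Stronger suite: also exercises both boundaries.
const strongSuite: Suite = (impl) =>
  impl(5, 0, 10) === 5 && impl(-3, 0, 10) === 0 && impl(42, 0, 10) === 10;

// Mutation score = killed mutants / total mutants, as a percentage.
// A mutant is "killed" when the suite fails against it.
function mutationScore(suite: Suite, candidates: Clamp[]): number {
  const killed = candidates.filter((mutant) => !suite(mutant)).length;
  return (killed / candidates.length) * 100;
}

console.log(mutationScore(weakSuite, mutants));   // 25
console.log(mutationScore(strongSuite, mutants)); // 100
```

Both suites pass against the correct `clamp`, yet the weak suite lets three of the four mutants survive. That gap between "the tests pass" and "the tests would catch a bug" is what a mutation score is meant to expose.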