Test Generation Quality

Testing

This benchmark evaluates each tool's ability to generate meaningful, comprehensive tests for a set of 10 JavaScript/TypeScript functions of varying complexity.

Methodology

Ten functions were provided, ranging from simple utilities to complex async state machines. Each tool was asked to generate a comprehensive test suite for every function. The suites were scored on edge case coverage, mutation score (the percentage of injected mutants the tests detect), test readability, and the number of false passes.
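To make the mutation metric concrete, here is a minimal sketch of how a mutation score can be computed. Everything in it is an assumption for illustration: the `clamp` function, the four hand-written mutants, and the two toy suites are hypothetical and are not part of the benchmark, whose actual functions and mutation tooling are not specified here.

```typescript
// Hypothetical function under test; not one of the 10 benchmark functions.
type Clamp = (value: number, min: number, max: number) => number;

const clamp: Clamp = (value, min, max) => Math.min(Math.max(value, min), max);

// Hand-written mutants, each with one small behavioral change.
// Real mutation tools generate these automatically.
const mutants: Clamp[] = [
  (value, min, max) => Math.min(Math.max(value, min), max - 1), // off-by-one upper bound
  (value, _min, max) => Math.min(value, max),                   // lower bound dropped
  (value, min, _max) => Math.max(value, min),                   // upper bound dropped
  (value, min, max) => Math.max(Math.min(value, min), max),     // min and max swapped
];

// A suite is modeled as a function that returns true when every check passes.
type Suite = (fn: Clamp) => boolean;

// Weak suite: checks a single in-range value only.
const weakSuite: Suite = (fn) => fn(5, 0, 10) === 5;

// Stronger suite: also checks values below and above the range, and at the boundary.
const strongSuite: Suite = (fn) =>
  fn(5, 0, 10) === 5 &&
  fn(-3, 0, 10) === 0 &&
  fn(42, 0, 10) === 10 &&
  fn(10, 0, 10) === 10;

// Mutation score: percentage of mutants for which the suite fails (mutants "killed").
function mutationScore(suite: Suite, candidates: Clamp[]): number {
  const killed = candidates.filter((mutant) => !suite(mutant)).length;
  return (killed / candidates.length) * 100;
}

console.log(mutationScore(weakSuite, mutants));   // 25
console.log(mutationScore(strongSuite, mutants)); // 100
```

Both suites pass against the original `clamp`, but only the stronger one notices when the bounds logic is altered. That gap is what a mutation score is meant to expose, and it is why the metric is reported separately from edge case coverage.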

Tool	Edge Cases (%)	Mutation Score (%)	Readability (/10)	False Passes (count)
Codium / Qodo	88	82	8.5	1
Claude Code	85	78	9	0
Cursor	75	70	8.2	2
GitHub Copilot	72	65	7.8	3