Category weight: 1.0x

Code Quality

Evaluates readability, idiomatic patterns, naming conventions, and adherence to language best practices.

Best Score: 0.0
Avg Score: 0.0
Tests: 3

[Chart: Performance Over Time, All Models]

Model Rankings

Rank | Model | Category score | vs. best | Tokens | Total
1 | Claude Sonnet 4.6 | 99.0 | BEST | 10.9k | 10.9k
2 | Grok | 94.3 | -4.7 pts | 168.5k | 168.5k
3 | Claude Opus 4.8 | 93.5 | -5.5 pts | 27.3k | 27.3k
4 | GPT-5.5 | 93.0 | -6.0 pts | 41.4k | 41.4k

Test Breakdown

Idiomatic Python

Write Pythonic code using generators, comprehensions, and context managers

Claude Sonnet 4.6: 99.0
Grok: 94.3
Claude Opus 4.8: 93.5
GPT-5.5: 93.0
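
For orientation, the kind of code this test description points at might look like the sketch below. It is not the benchmark's actual prompt or any model's submission; the function names (read_scores, temp_score_file, gaps_to_best) and the "name,score" file layout are assumptions made for illustration. The sample values are the category scores listed on this page.

```python
import tempfile
from collections.abc import Iterator
from contextlib import contextmanager
from pathlib import Path


def read_scores(path: Path) -> Iterator[tuple[str, float]]:
    """Generator: lazily yield (model, score) pairs from a 'name,score' file."""
    # The with-block closes the file even if parsing a line raises.
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                name, raw_score = line.rsplit(",", 1)
                yield name.strip(), float(raw_score)


@contextmanager
def temp_score_file(lines: list[str]) -> Iterator[Path]:
    """Context manager: write lines to a temporary file, remove it on exit."""
    with tempfile.TemporaryDirectory() as directory:
        path = Path(directory) / "scores.csv"
        path.write_text("\n".join(lines), encoding="utf-8")
        yield path


def gaps_to_best(scores: dict[str, float]) -> dict[str, float]:
    """Dict comprehension: each model's distance from the top score."""
    best = max(scores.values())
    return {model: round(score - best, 1) for model, score in scores.items()}


# Hypothetical input built from the scores shown on this page.
SAMPLE = [
    "Claude Sonnet 4.6,99.0",
    "Grok,94.3",
    "Claude Opus 4.8,93.5",
    "GPT-5.5,93.0",
]

with temp_score_file(SAMPLE) as path:
    scores = dict(read_scores(path))

gaps = gaps_to_best(scores)
```

The generator keeps file reading lazy, the context managers handle cleanup, and the comprehension replaces an explicit accumulation loop.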

TypeScript Best Practices

Use strict types, discriminated unions, and proper error narrowing

Claude Sonnet 4.6: 99.0
Grok: 94.3
Claude Opus 4.8: 93.5
GPT-5.5: 93.0

Clean Architecture Patterns

Implement repository pattern with proper dependency inversion

Claude Sonnet 4.6: 99.0
Grok: 94.3
Claude Opus 4.8: 93.5
GPT-5.5: 93.0
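
As a rough illustration of what "repository pattern with proper dependency inversion" can mean, here is a minimal Python sketch. It is not taken from the benchmark; every name in it (ScoreRecord, ScoreRepository, InMemoryScoreRepository, RankingService) is hypothetical. The point it shows is that the service depends only on an abstract interface, and the concrete storage is injected from outside.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class ScoreRecord:
    """Hypothetical domain object: one model's score."""
    model: str
    score: float


class ScoreRepository(Protocol):
    """Abstraction the service depends on (the 'inverted' dependency)."""

    def add(self, record: ScoreRecord) -> None: ...

    def all(self) -> list[ScoreRecord]: ...


class InMemoryScoreRepository:
    """Concrete storage; a database-backed class could replace it unchanged."""

    def __init__(self) -> None:
        self._records: list[ScoreRecord] = []

    def add(self, record: ScoreRecord) -> None:
        self._records.append(record)

    def all(self) -> list[ScoreRecord]:
        # Return a copy so callers cannot mutate internal state.
        return list(self._records)


class RankingService:
    """High-level logic: knows the interface, not the storage details."""

    def __init__(self, repository: ScoreRepository) -> None:
        self._repository = repository

    def ranked_models(self) -> list[str]:
        ordered = sorted(
            self._repository.all(), key=lambda record: record.score, reverse=True
        )
        return [record.model for record in ordered]


# Usage: the concrete repository is chosen here and injected into the service.
repository = InMemoryScoreRepository()
repository.add(ScoreRecord("Grok", 94.3))
repository.add(ScoreRecord("Claude Sonnet 4.6", 99.0))
ranking = RankingService(repository).ranked_models()
```

Because RankingService only sees ScoreRepository, tests can pass an in-memory implementation while production code passes a persistent one.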