Marketing / Landing
Value proposition, narrative sections, hero, CTA-driven.
A spec for AI agents to audit web interfaces against nine evidence-grounded dimensions, with severity, confidence, and tolerance baked into the math. Every page that publishes against it — including this one — should be able to defend its own score.
UI should help users understand structure, hierarchy, state, and action with minimal cognitive effort.
Good UI emerges from consistent type, color, spacing, state, and component systems.
Optical correctness and user perception take priority over rigid numeric purity.
The Agent must distinguish real usability/quality failures from design taste differences.
When evidence is weak, lower confidence, reduce score impact, or output an advisory instead of a hard defect.
Eight scored dimensions divide a fixed hard-score budget. Two points remain reserved for future rule expansion. The shape of the bar tells you where the spec believes risk concentrates — visual quality, consistency, and color outweigh information architecture.
Every observed defect runs through the same formula: a rule's max deduction, scaled by severity, scaled by the agent's confidence. A low-confidence advisory still leaves a faint signal — never zero, never overwhelming.
A P0 critical defect detected with high confidence consumes the rule's full max. Lower the confidence to Low and the same observation falls to 0.45 pts — visible, never decisive.
Before any rule fires, the agent classifies the page into one primary archetype. The rules apply differently across archetypes — what counts as "first-screen clarity" on a marketing page is not what counts on a dashboard.
Value proposition, narrative sections, hero, CTA-driven.
KPI cards, charts, tables, filters, status-rich UI.
Data entry, setup, wizard, transactional flow.
Reading-heavy, article-like, documentation, knowledge pages.
Dense controls, management UI, structured lists/tables.
Chart-first, exploration-first, visual analytic environments.
Every rule declares its class — OBJECTIVE defects, HEURISTIC guidance, or ADVISORY observations — alongside its evidence type and a confidence band. The right column shows max deduction; advisory rules are dashed.
A six-step procedure the agent follows to build a structured issue report, with a sandbox layer that keeps the audit from breaking the page it's measuring.
Classify archetype, identify CTAs, repeated groups, comparable cards, chart regions, semantic actions.
Capture 1440 / 1024 / 375 px, both themes, plus state shots: hover, focus, active, loading, empty.
Computed styles, bounding rects, overflow metrics, pseudo-state availability, theme-aware tokens.
Intercept navigation; prefer state-triggering over route-changing; capture before/after; revert when possible.
For each applicable rule: result, severity, confidence, class, evidence, context, systemic vs local.
Issue table, summary, systemic issues, coverage & confidence meta-indicators, advisory notes, fixes.
A standard is also a list of laws it declines to write. This spec explicitly avoids treating the following as universal hard defects — they may still surface as advisories in the right context, never as global tests.
"A fixed F-pattern or Z-pattern is required."
SCAN PATH · D1.3"Closer must always be lighter."
FOREGROUND · D2.3"Alpha-based text is inherently wrong."
TEXT COLOR · D4.1"A 12-column grid is mandatory."
ALIGNMENT · D5.4"Primary CTAs must be pill-shaped."
RADIUS · D6.4"All native buttons require cursor: pointer."
FEEDBACK · D8.1"Every LIVE label must animate."
ANIMATION · D8.3"Pure black in dark mode is a fatal failure."
THEME · D7.4"Dense optical-centering analysis on every icon."
ALIGNMENT · D5.4"Language-specific micro-typography rules apply globally."
TYPOGRAPHY · D3The full rule register, by dimension. Filter by class to see how the spec is built — objective defects anchor the score, heuristics guide it, advisories shape the conversation.
| ID | Rule | Dimension | Class | Confidence | Max |
|---|
The spec recommends 70 as the default pass threshold — a deliberately ambitious bar that absorbs minor drift while disqualifying clearly broken pages.
UI DESIGN SPEC · AI-EXECUTABLE STANDARD
9 DIMENSIONS · 50 RULES · 98 HARD POINTS · 3-TIER CONFIDENCE
VISUALIZED AS A SINGLE-PAGE ARTIFACT