AI-EXECUTABLE UI REVIEW STANDARD NINE DIMENSIONS · 98 POINTS

A standard
for evaluating UI.

A spec for AI agents to audit web interfaces against nine evidence-grounded dimensions, with severity, confidence, and tolerance baked into the math. Every page that publishes against it — including this one — should be able to defend its own score.

↳ 01

Comprehensibility over decoration

UI should help users understand structure, hierarchy, state, and action with minimal cognitive effort.

↳ 02

Systems over isolated decisions

Good UI emerges from consistent type, color, spacing, state, and component systems.

↳ 03

Perceptual clarity over mathematical purity

Optical correctness and user perception take priority over rigid numeric purity.

↳ 04

Objective defects over stylistic preference

The Agent must distinguish real usability/quality failures from design taste differences.

↳ 05

High-confidence reporting over over-reporting

When evidence is weak, lower confidence, reduce score impact, or output an advisory instead of a hard defect.

§ 02
THE BUDGET

98 points. Allocated.

Eight scored dimensions divide a fixed hard-score budget. Two points remain reserved for future rule expansion. The shape of the bar tells you where the spec believes risk concentrates — visual quality, consistency, and color outweigh information architecture.

SCORE BUDGET 98 / 100 pts RESERVED · 2 pts

D1 · 11

D2 · 9

D3 · 9

D4 · 14

D5 · 10

D6 · 14

D7 · 17

D8 · 14

— · 2

100

D111

Interaction Quality

§ 03
THE MATH

How a deduction computes.

Every observed defect runs through the same formula: a rule's max deduction, scaled by severity, scaled by the agent's confidence. A low-confidence advisory still leaves a faint signal — never zero, never overwhelming.

↳ MAX DEDUCTION

3.0

↳ SEVERITY

↳ CONFIDENCE

↳ FORMULA

3.0 × 1.0 × 1.0 =

↳ DEDUCTION

3.00 pts

A P0 critical defect detected with high confidence consumes the rule's full max. Lower the confidence to Low and the same observation falls to 0.45 pts — visible, never decisive.

§ 04
PAGE ARCHETYPES

Six kinds of page.

Before any rule fires, the agent classifies the page into one primary archetype. The rules apply differently across archetypes — what counts as "first-screen clarity" on a marketing page is not what counts on a dashboard.

ARCHETYPE 01

Marketing / Landing

Value proposition, narrative sections, hero, CTA-driven.

D1.6 CTA D3.3 Emphasis D7.5 Media

ARCHETYPE 02

Dashboard / Analytics

KPI cards, charts, tables, filters, status-rich UI.

D6.2 Rows D8.6 Charts D7.2 Empty

ARCHETYPE 03

Form / Workflow

Data entry, setup, wizard, transactional flow.

D4.5 States D8.4 Click D8.7 Loading

ARCHETYPE 04

Content / Docs

Reading-heavy, article-like, documentation, knowledge pages.

D3.2 Body D1.5 Chunking D5.5 Whitespace

ARCHETYPE 05

Settings / CRUD / Admin

Dense controls, management UI, structured lists/tables.

D6.1 Same-role D6.3 States D5.2 Hierarchy

ARCHETYPE 06

Data Visualization-heavy

Chart-first, exploration-first, visual analytic environments.

D4.6b Color-only D8.6 Charts D7.4 Theme

§ 05
THE NINE DIMENSIONS

What gets measured, and how.

Every rule declares its class — OBJECTIVE defects, HEURISTIC guidance, or ADVISORY observations — alongside its evidence type and a confidence band. The right column shows max deduction; advisory rules are dashed.

§ 06
WORKFLOW

From page to report.

A six-step procedure the agent follows to build a structured issue report, with a sandbox layer that keeps the audit from breaking the page it's measuring.

STEP 0

Semantic Mapping

Classify archetype, identify CTAs, repeated groups, comparable cards, chart regions, semantic actions.

DOMSCREENSHOT

STEP 1

Screenshot Collection

Capture 1440 / 1024 / 375 px, both themes, plus state shots: hover, focus, active, loading, empty.

VIEWPORTSSTATES

STEP 2

DOM & Style Extraction

Computed styles, bounding rects, overflow metrics, pseudo-state availability, theme-aware tokens.

COMPUTEDRECTS

STEP 3

Interaction Sandbox

Intercept navigation; prefer state-triggering over route-changing; capture before/after; revert when possible.

SANDBOXBEFORE/AFTER

STEP 4

Rule Evaluation

For each applicable rule: result, severity, confidence, class, evidence, context, systemic vs local.

SEVERITYCONFIDENCE

STEP 5

Output Generation

Issue table, summary, systemic issues, coverage & confidence meta-indicators, advisory notes, fixes.

REPORTFIXES

§ 07
REFUSALS

What this spec refuses to enforce.

A standard is also a list of laws it declines to write. This spec explicitly avoids treating the following as universal hard defects — they may still surface as advisories in the right context, never as global tests.

×01

"A fixed F-pattern or Z-pattern is required."

SCAN PATH · D1.3

×02

"Closer must always be lighter."

FOREGROUND · D2.3

×03

"Alpha-based text is inherently wrong."

TEXT COLOR · D4.1

×04

"A 12-column grid is mandatory."

ALIGNMENT · D5.4

×05

"Primary CTAs must be pill-shaped."

RADIUS · D6.4

×06

"All native buttons require cursor: pointer."

FEEDBACK · D8.1

×07

"Every LIVE label must animate."

ANIMATION · D8.3

×08

"Pure black in dark mode is a fatal failure."

THEME · D7.4

×09

"Dense optical-centering analysis on every icon."

ALIGNMENT · D5.4

×10

"Language-specific micro-typography rules apply globally."

TYPOGRAPHY · D3

§ 08
RULE INDEX

Fifty rules, one table.

The full rule register, by dimension. Filter by class to see how the spec is built — objective defects anchor the score, heuristics guide it, advisories shape the conversation.

ID	Rule	Dimension	Class	Confidence	Max

§ 09
PASS THRESHOLD

Six grade bands. One default.

The spec recommends 70 as the default pass threshold — a deliberately ambitious bar that absorbs minor drift while disqualifying clearly broken pages.

90 — 98

Excellent

B+

80 — 89

Good

70 — 79

Acceptable

60 — 69

Below standard

40 — 59

Poor

00 — 39

Failing

Built to its own standard. Audited continuously. Available to be challenged.

UI DESIGN SPEC · AI-EXECUTABLE STANDARD
9 DIMENSIONS · 50 RULES · 98 HARD POINTS · 3-TIER CONFIDENCE
VISUALIZED AS A SINGLE-PAGE ARTIFACT

SELF-AUDIT

98 / 98

GRADE A