AI Commerce Index
A 0–100 index shown prominently to merchants, with the underlying Readiness and Visibility components always displayed alongside it rather than folded into a single opaque number.
The point of publishing this page is simple: the score needs to be explainable. Burnish separates what can move immediately inside the catalog from what takes time to change across AI engines.
A single blended score that jumps the moment a catalog fix is applied would be misleading: Readiness moves immediately, but AI engines take time to reflect the change. Burnish keeps the split visible so merchants know what changed now and what will show up later.
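As an illustration, here is a minimal sketch of how the two components might roll up into the 0–100 index. The 60/40 split and the function names are assumptions for illustration, not Burnish's published weighting:

```ts
// Sketch only: the 60/40 weighting and names are illustrative assumptions,
// not Burnish's published formula.
interface ComponentScores {
  readiness: number;  // 0–100, computed from catalog state
  visibility: number; // 0–100, measured from sampled AI responses
}

function aiCommerceIndex({ readiness, visibility }: ComponentScores): number {
  const READINESS_WEIGHT = 0.6;  // assumed split
  const VISIBILITY_WEIGHT = 0.4;
  return Math.round(READINESS_WEIGHT * readiness + VISIBILITY_WEIGHT * visibility);
}

// Both components stay visible alongside the composite.
console.log(aiCommerceIndex({ readiness: 82, visibility: 47 })); // 68
```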
Readiness is calculated directly from catalog state. It moves when merchants improve fields, structure, and data quality inside Shopify.
| Factor | Weight |
|---|---|
| Metafield completeness | 20% |
| Description quality | 15% |
| Alt text coverage | 10% |
| Structured data / schema.org | 15% |
| Category / taxonomy correctness | 10% |
| Title optimization | 10% |
| Google Merchant Center readiness | 10% |
| Variant consistency | 5% |
| llms.txt presence | 5% |
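Given the weights above, Readiness reduces to a weighted sum over per-factor scores. A minimal sketch, assuming each factor is first normalized to a 0–1 coverage value (how each factor's coverage is computed is not specified here):

```ts
// Weights taken from the table above; each factor score is assumed to be
// normalized to 0–1 (e.g. fraction of products passing that check).
const READINESS_WEIGHTS: Record<string, number> = {
  metafieldCompleteness: 0.20,
  descriptionQuality: 0.15,
  altTextCoverage: 0.10,
  structuredData: 0.15,
  taxonomyCorrectness: 0.10,
  titleOptimization: 0.10,
  merchantCenterReadiness: 0.10,
  variantConsistency: 0.05,
  llmsTxtPresence: 0.05,
};

function readinessScore(factors: Record<string, number>): number {
  let total = 0;
  for (const [factor, weight] of Object.entries(READINESS_WEIGHTS)) {
    total += weight * (factors[factor] ?? 0); // a missing factor scores as 0
  }
  return Math.round(total * 100); // scale to 0–100
}
```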
Visibility is measured through sampled shopper-style prompts and model responses. It reflects how AI engines actually surface the merchant, not how good the catalog looks in isolation.
1. Generate shopper-style prompts grounded in the merchant catalog, category, and competitors.
2. Query official APIs across ChatGPT, Perplexity, Gemini, and Claude, depending on tier.
3. Multi-sample responses to reduce non-deterministic noise.
4. Detect mentions against catalog-grounded brand and product context.
5. Normalize results into per-engine visibility measurements with confidence intervals.
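A minimal sketch of that sampling loop. The injected dependencies stand in for Burnish's real prompt generator, per-engine API clients, and mention detector; their signatures are assumptions:

```ts
type Engine = "chatgpt" | "perplexity" | "gemini" | "claude";

// Illustrative stand-ins for Burnish internals; these signatures are assumed.
interface VisibilityDeps {
  queryEngine: (engine: Engine, prompt: string) => Promise<string>;
  mentionsMerchant: (response: string) => boolean; // catalog-grounded detector
}

async function sampleVisibility(
  engine: Engine,
  prompts: string[],
  deps: VisibilityDeps,
  samplesPerPrompt = 3, // multi-sample each prompt to damp non-determinism
): Promise<{ mentions: number; samples: number }> {
  let mentions = 0;
  let samples = 0;
  for (const prompt of prompts) {
    for (let i = 0; i < samplesPerPrompt; i++) {
      const response = await deps.queryEngine(engine, prompt);
      if (deps.mentionsMerchant(response)) mentions++;
      samples++;
    }
  }
  return { mentions, samples };
}
```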
Visibility is reported with confidence intervals so the merchant sees signal quality, not just a naked score.
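For example, given mention counts from a sampler like the one above, a Wilson score interval gives a defensible 95% band. Wilson is a standard choice for proportions from small samples; whether Burnish uses this exact interval is an assumption:

```ts
// 95% Wilson score interval for a mention proportion. A common choice for
// small-sample proportions; Burnish's exact interval is not specified.
function wilsonInterval(mentions: number, samples: number, z = 1.96) {
  const p = mentions / samples;
  const z2 = z * z;
  const denom = 1 + z2 / samples;
  const center = (p + z2 / (2 * samples)) / denom;
  const half =
    (z / denom) * Math.sqrt((p * (1 - p)) / samples + z2 / (4 * samples * samples));
  return { low: Math.max(0, center - half), high: Math.min(1, center + half) };
}

// 12 mentions across 60 sampled responses → roughly a 0.12–0.32 band.
console.log(wilsonInterval(12, 60));
```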
Measurements are tied to model versions so merchants know when provider changes require a fresh baseline.
The methodology is designed to support explainability, including the ability to trace back into sampled queries and response evidence.
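A plausible shape for a stored measurement that supports both model versioning and trace-back into query and response evidence. All field names here are assumptions for illustration, not Burnish's schema:

```ts
// Illustrative record shape: field names are assumed, not Burnish's schema.
interface VisibilityMeasurement {
  engine: "chatgpt" | "perplexity" | "gemini" | "claude";
  modelVersion: string;     // e.g. "gpt-4o-2024-08-06"; a change triggers re-baselining
  measuredAt: string;       // ISO timestamp
  mentions: number;
  samples: number;
  interval95: { low: number; high: number };
  evidence: Array<{
    prompt: string;          // the sampled shopper-style query
    responseExcerpt: string; // passage where the mention was (or wasn't) detected
    mentioned: boolean;
  }>;
}
```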
Burnish is designed to publish brand-disambiguation benchmark results and refresh them over time. The production benchmark is still being finalized, so this page intentionally sets the framework first and the measured number second.
Join the first cohort if you want to see the methodology mature alongside the product.