Step 1: Crawler Access Check
Open your browser and visit yourstore.com/robots.txt. This is the file that tells AI crawlers whether they are allowed to read your content. Search the file for five specific bot names: GPTBot, ClaudeBot, PerplexityBot, OAI-SearchBot, and Google-Extended. If any of these appears in a User-agent group that carries a Disallow rule covering your content, or if the wildcard group (User-agent: *) disallows those paths, fix it immediately: you are actively blocking the AI systems that could be citing you right now.
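The manual check above can be approximated with Python's standard-library robots.txt parser. This is a minimal sketch, not a definitive audit: the function name blocked_bots, the sample file contents, and the yourstore.com URL are illustrative assumptions, and the parser's matching rules may differ slightly from how each real crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# The five bot names listed in this step.
AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "OAI-SearchBot", "Google-Extended"]


def blocked_bots(robots_txt: str, url: str = "https://yourstore.com/") -> list[str]:
    """Return the AI bots that this robots.txt text forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_BOTS if not parser.can_fetch(bot, url)]


# Hypothetical robots.txt: GPTBot is blocked, everything else is allowed.
sample = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

print(blocked_bots(sample))
```

Paste the text of your own robots.txt in place of sample, and pass the URL of one of your content pages, to see which of the five names would be refused.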
Next, check your server logs or CDN analytics for evidence of AI crawler visits in the last 7 days. Most CDN dashboards (Cloudflare, Vercel, Fastly) show bot traffic by user agent. Look for the same bot names, with one exception: Google-Extended is a robots.txt control token rather than a separate crawler, so it does not show up in logs under its own user agent. If your robots.txt correctly allows all five but you see zero visits from the other four, something else is blocking access, typically a WAF rule, a bot-protection setting, or a CDN challenge page that serves a CAPTCHA to automated requests. This is more common than you think, especially on Shopify stores using aggressive bot-protection apps.
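If you have raw access logs rather than a dashboard, a short script can do the counting. The sketch below assumes each log line contains the request's user-agent string somewhere in it; the sample lines are invented for illustration, and count_bot_hits is a hypothetical helper name. Google-Extended is left out on purpose, because it is a robots.txt token and not a user-agent string that crawlers send.

```python
from collections import Counter

# Bot names expected to appear in user-agent strings.
LOG_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "OAI-SearchBot"]


def count_bot_hits(log_lines):
    """Count log lines mentioning each bot name (case-insensitive substring match)."""
    hits = Counter({bot: 0 for bot in LOG_BOTS})
    for line in log_lines:
        lowered = line.lower()
        for bot in LOG_BOTS:
            if bot.lower() in lowered:
                hits[bot] += 1
    return hits


# Made-up log lines, only to show the input shape.
sample_log = [
    '203.0.113.5 "GET /guides/a HTTP/1.1" 200 "Mozilla/5.0; compatible; GPTBot/1.0"',
    '203.0.113.9 "GET /faq HTTP/1.1" 200 "Mozilla/5.0; compatible; ClaudeBot/1.0"',
    '198.51.100.7 "GET / HTTP/1.1" 200 "Mozilla/5.0 (Macintosh)"',
]

hits = count_bot_hits(sample_log)
never_seen = [bot for bot in LOG_BOTS if hits[bot] == 0]
print(dict(hits))
print("No visits from:", never_seen)
```

Feed it only the lines from the last 7 days. Any bot that lands in never_seen while robots.txt allows it is a candidate for the WAF or bot-protection problem described above.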
PASS: All 5 crawlers are allowed in robots.txt AND you have evidence of crawler visits in the last 7 days. FAIL: Any crawler is blocked OR you see zero AI crawler visits despite correct robots.txt configuration. Read our full robots.txt guide for AI crawlers for the exact syntax and common hosting-specific fixes.
If AI crawlers cannot access your pages, nothing else in this audit matters. Schema, content quality, authority signals: all are irrelevant if the bots that power ChatGPT, Claude, and Perplexity never see your content. This is the single most common reason stores with excellent content get zero AI citations.
Step 2: Schema Audit
Select your top 10 content pages: your best guides, comparison posts, and FAQ pages. For each one, check four specific schema types. (1) Article schema with an author of type Person, including datePublished and dateModified fields. (2) FAQPage schema that matches the visible FAQ section on the page; every question displayed on the page should appear in the structured data. (3) BreadcrumbList schema showing the page's position in your site hierarchy. (4) Person schema for the author with a LinkedIn sameAs URL proving they are a real, findable human.
Validate each page using the Google Rich Results Test. Paste the URL, run the test, and confirm that each schema type renders without errors or warnings. Pay attention to "missing recommended field" warnings: these are not hard failures, but they reduce your structured data's effectiveness for AI retrieval systems that use schema signals to assess content quality and authorship.
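Before running ten pages through the validator, a quick presence check can tell you which schema types are missing outright. The sketch below uses only the standard library and assumes the page embeds its structured data as JSON-LD in script tags of type application/ld+json; the class and function names are illustrative, and the sample page is invented. It only reports whether each @type appears somewhere in the markup. It does not validate fields, so it does not replace the Rich Results Test.

```python
import json
from html.parser import HTMLParser

REQUIRED_TYPES = {"Article", "FAQPage", "BreadcrumbList", "Person"}


class JsonLdCollector(HTMLParser):
    """Collect the parsed contents of every JSON-LD script tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Malformed JSON-LD is skipped here; the validator will flag it.


def collect_types(node, found):
    """Walk nested JSON-LD and record every @type value, including nested ones."""
    if isinstance(node, dict):
        type_value = node.get("@type")
        if isinstance(type_value, str):
            found.add(type_value)
        elif isinstance(type_value, list):
            found.update(t for t in type_value if isinstance(t, str))
        for value in node.values():
            collect_types(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_types(item, found)


def missing_schema_types(html):
    """Return the required schema types that do not appear in the page's JSON-LD."""
    collector = JsonLdCollector()
    collector.feed(html)
    found = set()
    for block in collector.blocks:
        collect_types(block, found)
    return REQUIRED_TYPES - found


# Invented page: Article with a nested Person author, but no FAQ or breadcrumbs.
sample_html = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Article",
 "headline": "Example guide",
 "author": {"@type": "Person", "name": "Jane Doe"},
 "datePublished": "2024-01-01", "dateModified": "2024-02-01"}
</script>
</head><body></body></html>
"""

print(sorted(missing_schema_types(sample_html)))
```

A page counts toward the PASS threshold only when this returns an empty set and the validator also reports no errors.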
PASS: 8 out of 10 pages have all four schema types validating without errors. FAIL: Fewer than 5 out of 10 pages have complete schema. Read our schema for AI citations guide for the exact JSON-LD patterns, and use the Store SEO Grader to automate this check across your entire site in seconds.
Step 3: Content Structure Check
Return to your top 10 content pages and evaluate each against five structural criteria that determine whether AI can extract quotable, citable passages from your content. (1) Does the title match a question buyers actually ask? Not a clever headline, not a keyword-stuffed string, but a real question format that maps to how people query AI. (2) Is the answer in the first sentence? AI retrieval systems extract the opening lines as candidate citations. If your answer is buried in paragraph three after an anecdote, it will not be selected.
(3) Are claims specific (numbers, dates, names) rather than hedged? "Many stores find that" is not citable. "78 percent of stores that added FAQ schema saw citations within 30 days" is citable. AI surfaces concrete, verifiable claims because they can attribute them confidently. (4) Is there a FAQ section with 5 or more questions? FAQ sections are the most frequently cited content format because the question-answer structure maps directly to how people query AI. (5) Can a paragraph be quoted out of context and still make complete sense? Each paragraph should be a self-contained unit of meaning, with no "as mentioned above" or "building on the previous point" that makes the passage meaningless when extracted.
PASS: 7 out of 10 pages score 4 or higher out of 5. FAIL: Fewer than 5 out of 10 pages score 4 or higher. Read our citability content guide for the full framework of writing that AI retrieval selects. Use the Headline Analyzer to verify your title format matches question-intent patterns.
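Two parts of this step are mechanical enough to script: spotting the back-reference phrases from criterion (5), and turning ten page scores into a verdict. The sketch below is illustrative only. The helper names are hypothetical, the phrase list contains just the two examples quoted above, and the BORDERLINE label is an assumption for the 5-or-6-page case, which the PASS and FAIL thresholds leave undefined.

```python
# Phrases that make a paragraph depend on surrounding context.
BACK_REFERENCES = ("as mentioned above", "building on the previous point")


def context_dependent_paragraphs(paragraphs):
    """Return the paragraphs that contain a back-reference phrase."""
    return [
        p for p in paragraphs
        if any(phrase in p.lower() for phrase in BACK_REFERENCES)
    ]


def step3_verdict(page_scores, strong=4):
    """Apply the Step 3 thresholds to per-page scores (each 0 to 5, 10 pages assumed)."""
    strong_pages = sum(1 for score in page_scores if score >= strong)
    if strong_pages >= 7:
        return "PASS"
    if strong_pages < 5:
        return "FAIL"
    return "BORDERLINE"  # Assumed label: the text defines no verdict for 5 or 6 pages.


paragraphs = [
    "Flat-footed runners need firm arch support.",
    "As mentioned above, cushioning matters less than stability.",
]
print(context_dependent_paragraphs(paragraphs))
print(step3_verdict([5, 4, 4, 4, 4, 4, 4, 2, 1, 3]))
```

The other four criteria still need a human read; the phrase check only catches the most obvious violations of criterion (5).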
Step 4: Authority Signal Check
AI retrieval systems do not cite anonymous content. They cite content from identifiable humans at identifiable organizations because attribution requires a source name. Check every content page on your site for five authority signals. (1) Named author: not "Staff," not "Admin," not "Team," but a real person's name. (2) Author bio page on your site that this name links to, showing who they are and why they are qualified to write about this topic. (3) Visible publication date so AI can assess recency. (4) Organization or About page establishing who publishes this site. (5) Author has published 5 or more pages on this topic on your domain, demonstrating sustained expertise rather than a one-off opinion.
These signals are not vanity features. They are the trust markers AI retrieval uses to decide between two pages that answer the same question equally well. When two sources have identical content quality, the one with identifiable authorship, organizational backing, and demonstrated expertise wins the citation. Every time. This is E-E-A-T (Experience, Expertise, Authoritativeness, Trustworthiness) applied specifically to AI selection criteria.
PASS: All 5 authority signals are present consistently across your content pages. FAIL: Any signal is missing site-wide. Read our E-E-A-T for AI search guide for implementation patterns that work for ecommerce stores where the "author" is often the store owner or category buyer rather than a traditional journalist.
Step 5: Topic Depth Check
Count your content pages per topic cluster. A topic cluster is a group of interlinked pages covering different facets of a single subject: not random blog posts, but structured coverage where each page answers a distinct question within one domain. How many distinct topics do you cover with 10 or more pages? Be honest here. A collection of loosely related posts is not a cluster. A cluster has a pillar page, supporting articles, FAQ content, comparisons, and tool pages all addressing the same core subject with internal links connecting them.
AI cites from deep domains. When an AI retrieval system evaluates whether your store is authoritative enough to cite on "best running shoes for flat feet," it does not look only at that one page; it assesses how much your domain covers running shoes overall. A store with 40 pages on running shoes (pillar guide, brand comparisons, condition-specific recommendations, FAQ hubs, sizing guides, price-point breakdowns) will be cited over a store with 3 articles, even if those 3 articles are individually well-written. Depth signals domain authority. Thin clusters signal that you are not a primary source.
PASS: At least 2 clusters with 15 or more pages each. FAIL: No cluster exceeds 10 pages. Use the Niche Authority Score tool to benchmark your cluster depth against competitors currently earning citations. Check our topic clusters guide for the hub-and-spoke structure that AI retrieval rewards. The Competitor Content Counter shows exactly how many pages cited competitors have in each topic area, so you know the depth target to hit.
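If you keep a simple inventory of pages and the topic each belongs to, the cluster count and verdict can be tallied in a few lines. This is a rough sketch under stated assumptions: the URL-to-topic mapping is something you build by hand, cluster_verdict is a hypothetical helper, and BORDERLINE is an assumed label for inventories that meet neither the PASS nor the FAIL condition as written.

```python
from collections import Counter


def cluster_verdict(page_topics):
    """Apply the Step 5 thresholds to a mapping of page URL -> topic name."""
    sizes = Counter(page_topics.values())
    deep_clusters = [topic for topic, count in sizes.items() if count >= 15]
    if len(deep_clusters) >= 2:
        return "PASS"
    if all(count <= 10 for count in sizes.values()):
        return "FAIL"  # No cluster exceeds 10 pages.
    return "BORDERLINE"  # Assumed label for the in-between case.


# Invented inventory: 16 running-shoe pages, 15 trail-gear pages, 3 others.
inventory = {f"/running-shoes/{i}": "running shoes" for i in range(16)}
inventory.update({f"/trail-gear/{i}": "trail gear" for i in range(15)})
inventory.update({f"/misc/{i}": "misc" for i in range(3)})

print(Counter(inventory.values()))
print(cluster_verdict(inventory))
```

The count says nothing about whether the pages are interlinked or cover distinct questions, so treat a PASS here as necessary rather than sufficient.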
Step 6: Live Citation Test
This is the reality check that turns the audit from theory into evidence. Search 10 of your target queries (the questions your ideal customers ask that relate to products you sell) across three AI surfaces: ChatGPT (with browsing enabled), Perplexity, and Google (check the AI Overview at the top of results). For each query across each surface, record four things. (1) Does an AI answer appear at all? Not every query triggers AI; if it does not, that query is not an AI citation opportunity right now. (2) Are you cited? Look for your domain in the sources, footnotes, or linked references. (3) Who IS cited instead? Record the exact domains and page URLs that earned the citation you want. (4) What do their cited pages do that yours do not? Visit those pages and check their schema, content structure, author attribution, FAQ sections, and cluster depth.
This step connects the previous five checks to observable reality. If competitors are being cited and you are not, the specific differences between their pages and yours will map directly to one or more of steps 1 through 5. Maybe they have FAQPage schema and you do not. Maybe their content opens with the answer and yours buries it. Maybe they have 30 pages on the topic and you have 4. The live citation test tells you WHERE you stand; the previous five steps tell you WHY and exactly what to fix.
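Ten queries across three surfaces is 30 observations, which is easier to reason about as structured records than as notes. The sketch below shows one possible shape for that log and a summary of who is winning the citations. The CitationResult fields, the summarize helper, and every domain in the sample data are illustrative assumptions; the results themselves still have to be collected by hand.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class CitationResult:
    query: str
    surface: str                      # e.g. "ChatGPT", "Perplexity", "Google"
    ai_answer_shown: bool             # item (1): did an AI answer appear at all?
    cited_domains: list[str] = field(default_factory=list)  # items (2) and (3)


def summarize(results, my_domain):
    """Summarize how often my_domain is cited and which domains are cited instead."""
    opportunities = [r for r in results if r.ai_answer_shown]
    won = [r for r in opportunities if my_domain in r.cited_domains]
    competitors = Counter(
        domain
        for r in opportunities
        for domain in r.cited_domains
        if domain != my_domain
    )
    return {
        "opportunities": len(opportunities),
        "cited": len(won),
        "top_competitors": competitors.most_common(3),
    }


# Invented observations for a single query.
results = [
    CitationResult("best running shoes for flat feet", "ChatGPT", True,
                   ["rival-a.example", "rival-b.example"]),
    CitationResult("best running shoes for flat feet", "Perplexity", True,
                   ["rival-a.example", "yourstore.com"]),
    CitationResult("best running shoes for flat feet", "Google", False),
]

print(summarize(results, "yourstore.com"))
```

The domains at the top of top_competitors are the pages to inspect for item (4).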
Read our guide on queries that trigger AI answers to identify which of your target queries are citation opportunities. Use the Content Gap Analyzer to systematically compare your content coverage against the stores that are earning the citations you want.
Scoring and Priority
Score your store from 0 to 6, one point for each step that passes. This score determines your priority actions and realistic timeline for earning AI citations.
Score 0-2: Invisible. AI cannot cite you right now, regardless of content quality. Your immediate priority is fixing crawler access and adding schema markup โ these are the technical prerequisites that gate everything else. Week 1: fix robots.txt, remove WAF blocks on AI bots, add Article and FAQPage schema to all content pages. Do not invest in new content until these foundations are in place. Content without crawlability and schema is invisible content.
Score 3-4: Partially visible. AI can find and read your content, but it is not selecting you over competitors. Your content either lacks the structural features AI retrieval prefers (FAQ sections, answer-first prose, specific claims) or your domain lacks the depth to signal authority. Weeks 2 through 4: add FAQ sections with schema to every content page, restructure your opening paragraphs to lead with the answer, replace hedged language with specific claims, and publish 10 to 15 pages to deepen your strongest cluster. These changes compound: each one makes every other page on your domain marginally more citable.
Score 5-6: Citation-ready. Your technical foundation, content structure, and authority signals are in place. Your priority shifts from fixing to scaling: increase content velocity, build additional topic clusters, and track citations monthly to identify new query opportunities as they emerge. You are in the small minority of ecommerce stores that are genuinely eligible for consistent AI citations. Now it is about volume and coverage. Use the Complete Checklist for the full execution order. Read the AEO Playbook for the methodology behind sustained citation growth at scale.
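The scoring bands above reduce to a small lookup. The sketch below assumes you supply six pass/fail results in step order; citation_tier is a hypothetical helper name. Step 6 has no explicit PASS or FAIL threshold in this audit, so how you decide its result is your own call.

```python
def citation_tier(step_passed):
    """Map six pass/fail results (Steps 1 to 6) to a score and readiness tier."""
    if len(step_passed) != 6:
        raise ValueError("expected one result for each of the 6 steps")
    score = sum(1 for passed in step_passed if passed)
    if score <= 2:
        tier = "Invisible"
    elif score <= 4:
        tier = "Partially visible"
    else:
        tier = "Citation-ready"
    return score, tier


# Example: crawler access, schema, and authority signals pass; the rest fail.
print(citation_tier([True, True, False, True, False, False]))
```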