Why Ecommerce Stores Need a noindex Audit
A noindex directive tells search engines not to include a page in their index. For ecommerce stores with thousands of URLs (faceted navigation, filtered search results, thank-you pages, duplicate variants), accidental or misconfigured noindex tags silently kill organic traffic. A single misplaced meta tag in a template can deindex an entire product category overnight.
This checklist covers the 12 most common noindex failure points across ecommerce sites. Each item has a binary pass/fail criterion so any developer, SEO, or store operator can run the audit without ambiguity. Work through the list in order; the first four items are the highest-impact and most frequently misconfigured.
The 12-Item noindex Audit Checklist
**1. Homepage is indexable.** Fetch the homepage HTML and search the <head> for <meta name="robots" content="noindex"> or any X-Robots-Tag: noindex HTTP header. PASS: Neither is present. FAIL: Either is present.
**2. Core category pages are indexable.** Check your top 10 highest-revenue category URLs with a crawl tool or browser plugin. PASS: No noindex tag or header on any of them. FAIL: One or more returns noindex.
**3. In-stock product pages are indexable.** Pull a sample of 50 active, in-stock product URLs. PASS: Zero noindex directives found. FAIL: Any in-stock product carries noindex.
**4. Out-of-stock product pages have an intentional noindex decision.** Decide deliberately: either keep out-of-stock pages indexed with a restock notice, or noindex them. PASS: Every out-of-stock page matches the documented policy. FAIL: Out-of-stock status is inconsistently applied or unreviewed.
**5. Faceted navigation and filter URLs are consistently handled.** Filtered URLs (e.g., /shoes?color=red&size=10) should either be noindexed or canonicalized to the base category. PASS: All filter parameter URLs have either noindex or a canonical pointing to the non-filtered version. FAIL: Filter URLs are indexed without canonicals, creating duplicate content at scale.
**6. Internal search result pages are noindexed.** Pages at /search?q= or equivalent expose low-quality, duplicate content to crawlers. PASS: All internal search result URLs return a noindex directive. FAIL: Any /search? URL is crawlable and indexable.
**7. Cart, checkout, and account pages are noindexed.** These pages have no organic search value and can expose session-specific content. PASS: /cart, /checkout, /account, /order-confirmation carry noindex and remain crawlable so the directive can be read (see item 10). FAIL: Any of these pages is indexable, or a robots.txt Disallow is the only thing keeping it out of search.
**8. Thank-you and order confirmation pages are noindexed.** PASS: Post-purchase URLs (/thank-you, /order-confirmed) carry a noindex directive. FAIL: These pages appear as indexed in Google Search Console's Page indexing report (formerly the Coverage report).
**9. Staging or preview environments are fully noindexed.** If a staging subdomain (staging.example.com) is publicly accessible, every page must carry noindex. PASS: A crawl of the staging domain finds noindex on every URL. FAIL: Any staging URL is indexable.
**10. The robots.txt Disallow does not substitute for noindex on pages that need to be deindexed.** Pages blocked by robots.txt cannot be crawled but may still be indexed from external links, and a crawler that cannot fetch a page never sees its noindex. PASS: Any page that must be removed from the index carries a noindex directive and is not blocked by robots.txt. FAIL: Disallow is the only protection on a page that needs to be deindexed, or a Disallow rule prevents the noindex from being read.
**11. noindex is not present inside <body> or on a URL that redirects.** A robots meta tag belongs in the <head>; a tag placed in the <body> is invalid markup and crawlers are not guaranteed to honor it. A noindex on a URL that redirects is generally never evaluated, because the crawler follows the redirect to the destination. PASS: All noindex meta tags sit inside the <head> of the final destination URL. FAIL: Any noindex tag is placed in the <body>, or the final URL after a redirect carries a directive that conflicts with your policy.
**12. Google Search Console's Page indexing report shows no unexpected noindex URLs.** Export the URLs listed under the "Excluded by 'noindex' tag" reason in GSC (the report was formerly called Coverage). PASS: Every URL on that list is intentionally noindexed per your policy document. FAIL: Any high-value product, category, or blog URL appears on that list.
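The per-page checks in items 1 to 3 and item 11 can be sketched in a few lines of Python. This is a minimal illustration, not a full crawler: it assumes you have already fetched the HTML and response headers, and the names `RobotsMetaParser`, `audit_page`, and `passes_indexable_check` are hypothetical helpers introduced here.

```python
from html.parser import HTMLParser

# "none" is shorthand for "noindex, nofollow", so it also blocks indexing.
BLOCKING = {"noindex", "none"}


class RobotsMetaParser(HTMLParser):
    """Collects robots meta directives and records whether they sit in head or body."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.head_directives = set()
        self.body_directives = set()

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif tag == "meta":
            attr = {name: (value or "") for name, value in attrs}
            if attr.get("name", "").strip().lower() in ("robots", "googlebot"):
                found = {d.strip().lower() for d in attr.get("content", "").split(",")}
                target = self.body_directives if self.in_body else self.head_directives
                target.update(found)


def audit_page(html, headers):
    """Return which noindex signals are present for one fetched page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    header_tokens = set()
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            # Treat "googlebot: noindex" the same as a bare "noindex".
            tokens = value.lower().replace(":", ",").split(",")
            header_tokens.update(t.strip() for t in tokens)
    return {
        "meta_noindex": bool(parser.head_directives & BLOCKING),
        "meta_noindex_in_body": bool(parser.body_directives & BLOCKING),
        "header_noindex": bool(header_tokens & BLOCKING),
    }


def passes_indexable_check(html, headers):
    """PASS for items 1 to 3 means no noindex signal anywhere on the page."""
    return not any(audit_page(html, headers).values())
```

Feeding it the raw HTML rather than a rendered DOM is a deliberate simplification; a directive injected by JavaScript would not be caught by this sketch.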
How to Run This Checklist Efficiently
Use a site crawler (Screaming Frog, Sitebulb, or any equivalent tool) to export the meta robots and X-Robots-Tag values for every URL in your sitemap plus every URL found by following internal links. Filter the export to show all URLs returning noindex. Cross-reference that list against your revenue-generating pages.
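The filter-and-cross-reference step can be sketched as follows. The column names `url`, `meta_robots`, and `x_robots_tag` are assumptions for illustration; real crawler exports use their own headers, so rename them to match your tool.

```python
import csv
import io


def noindexed_urls(crawl_csv_text):
    """Return the set of URLs whose meta robots or X-Robots-Tag contains noindex."""
    flagged = set()
    for row in csv.DictReader(io.StringIO(crawl_csv_text)):
        meta = (row.get("meta_robots") or "").lower()
        header = (row.get("x_robots_tag") or "").lower()
        if "noindex" in meta or "noindex" in header:
            flagged.add(row["url"])
    return flagged


def revenue_pages_at_risk(crawl_csv_text, revenue_urls):
    """Intersect the noindexed URLs with the pages that generate revenue."""
    return sorted(noindexed_urls(crawl_csv_text) & set(revenue_urls))
```

Any URL this returns is a candidate FAIL for items 1 to 3 and should be checked against the policy document before it is treated as a bug.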
For the HTTP header checks (items 1 to 3 and 9), use a bulk header checker or the crawler's response headers report. For GSC-specific checks (item 12), export directly from the Page indexing report (formerly Index Coverage) rather than relying on crawl estimates. Running both a crawler and GSC closes the gap between what your server sends and what Google actually sees.
Document your noindex policy before running the audit. Without a written policy defining which URL types are intentionally noindexed, pass/fail criteria are subjective. A one-page policy document covering facets, out-of-stock products, and utility pages turns this checklist from a one-time task into a repeatable governance process.
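One way to keep the policy unambiguous is to express it as data that the audit can evaluate. The sketch below is an assumption-laden example: the path patterns come from the URLs named in this checklist, the rule that any remaining query string marks a filter URL is a simplification, and `POLICY`, `expected_directive`, and `audit` are hypothetical names.

```python
import re
from urllib.parse import urlsplit

# Ordered rules: the first matching pattern wins.
POLICY = [
    (re.compile(r"^/(cart|checkout|account|order-confirmation|thank-you|order-confirmed)(/|$)"), "noindex"),
    (re.compile(r"^/search(/|$)"), "noindex"),
]


def expected_directive(url):
    """Map a URL to the directive the written policy expects."""
    parts = urlsplit(url)
    for pattern, directive in POLICY:
        if pattern.search(parts.path):
            return directive
    if parts.query:
        # Assumption: any other query string is a facet or filter URL.
        return "noindex_or_canonical"
    return "index"


def audit(url, has_noindex):
    """Compare the observed noindex state with the policy."""
    expected = expected_directive(url)
    if expected == "index":
        return "FAIL" if has_noindex else "PASS"
    if expected == "noindex":
        return "PASS" if has_noindex else "FAIL"
    # Filter URLs pass with noindex; otherwise the canonical must be checked.
    return "PASS" if has_noindex else "REVIEW"
```

Keeping the rules in one ordered list means a policy change is a one-line edit, and the same list can be reviewed by non-developers.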
The Most Common Failures on Ecommerce Sites
Items 5 and 6 (faceted navigation and internal search pages) are the most frequent failures on large ecommerce stores. A site with 200 product filters can generate tens of thousands of unique URLs, each a potential noindex misconfiguration. Platforms that build filter URLs dynamically often lack a global rule to noindex them, requiring a custom implementation at the template level.
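The pass criterion for item 5 can be written as a small check. This is a sketch under stated assumptions: the base category is taken to be the same URL with its query string removed, trailing-slash differences are not normalized, and `base_category_url` and `filter_url_passes` are hypothetical helper names.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def base_category_url(url):
    """Strip the query string and fragment to get the unfiltered category URL."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def filter_url_passes(url, canonical, has_noindex):
    """Item 5: a filter URL needs noindex or a canonical to the base category."""
    if not urlsplit(url).query:
        return True  # Not a filter URL, so item 5 does not apply.
    if has_noindex:
        return True
    if not canonical:
        return False
    # Resolve relative canonicals such as "/shoes" against the page URL.
    return urljoin(url, canonical) == base_category_url(url)
```

A self-referencing canonical on a filter URL fails this check, which matches the FAIL criterion of filter URLs being indexed without a canonical to the non-filtered version.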
Item 10 (confusing robots.txt Disallow with noindex) is the most dangerous misunderstanding. Store operators who Disallow /checkout/ in robots.txt and rely on nothing else are protected from crawling, not from indexation. If an external site links to a checkout page, Google can index the URL from that link signal alone without ever crawling it. Adding a noindex tag only helps once the Disallow rule is lifted, because a crawler that cannot fetch the page never sees the tag.
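The Disallow-versus-noindex conflict can be detected with Python's standard `urllib.robotparser`. The sketch below assumes you already have the robots.txt text and know whether each page carries noindex; `crawl_blocked` and `item_10_status` are hypothetical names. Note that the standard-library parser does simple prefix matching and does not replicate Google's wildcard handling, so treat its verdict as an approximation.

```python
from urllib.robotparser import RobotFileParser


def crawl_blocked(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text disallows fetching the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


def item_10_status(robots_txt, url, has_noindex):
    """Classify a page that is supposed to stay out of the index."""
    blocked = crawl_blocked(robots_txt, url)
    if blocked and has_noindex:
        return "FAIL: noindex present but hidden behind Disallow"
    if blocked:
        return "FAIL: Disallow is the only protection"
    if has_noindex:
        return "PASS"
    return "FAIL: page is crawlable and indexable"
```

Run it over the URL types your policy marks as noindex; only pages that are both crawlable and noindexed come back as PASS.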
Actionable Next Step After the Audit
Prioritize fixes by revenue impact. Any FAIL on items 1 to 3 (homepage, categories, products) requires same-day remediation. FAILs on items 5 to 8 require a developer sprint within the current two-week cycle. FAILs on items 9 to 12 are important but carry less immediate revenue risk and can be scheduled in the next cycle.
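The triage rule above can be encoded so that a list of failed items comes out in fix order. This is a minimal sketch with hypothetical names (`TIERS`, `triage`, `fix_order`); because the text does not assign item 4 to a tier, the code flags it for a manual decision rather than guessing.

```python
# Tiers as stated in the text: (item numbers, sort rank, deadline label).
TIERS = [
    (range(1, 4), 0, "same day"),
    (range(5, 9), 1, "current sprint"),
    (range(9, 13), 2, "next cycle"),
]


def triage(item):
    """Return (rank, deadline label) for a failed checklist item."""
    for items, rank, label in TIERS:
        if item in items:
            return rank, label
    # Item 4 has no stated tier, so it is surfaced for a manual call.
    return 3, "unassigned: decide manually"


def fix_order(failed_items):
    """Sort failed items by tier first, then by item number."""
    return sorted(failed_items, key=lambda item: (triage(item)[0], item))
```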
After each fix is deployed, re-crawl the affected URL set and revalidate against the checklist criteria before closing the ticket. Submit affected URLs for re-indexing via Google Search Console's URL Inspection tool where pages have been incorrectly excluded. Set a calendar reminder to run the full 12-item audit quarterly, or after any major platform upgrade that touches template files or robots.txt.