noindex Checklist: 12 Items Every Ecommerce Store Should Audit

Why Ecommerce Stores Need a noindex Audit

A noindex directive tells search engines not to include a page in their index. For ecommerce stores with thousands of URLs (faceted navigation, filtered search results, thank-you pages, duplicate variants), accidental or misconfigured noindex tags silently kill organic traffic. A single misplaced meta tag in a template can deindex an entire product category as search engines recrawl its pages.

This checklist covers the 12 most common noindex failure points across ecommerce sites. Each item has a binary pass/fail criterion so any developer, SEO, or store operator can run the audit without ambiguity. Work through the list in order; the first four items are the highest-impact and most frequently misconfigured.

The 12-Item noindex Audit Checklist

**1. Homepage is indexable.** Fetch the homepage HTML and search the <head> for <meta name="robots" content="noindex"> or any X-Robots-Tag: noindex HTTP header. PASS: Neither is present. FAIL: Either is present.

**2. Core category pages are indexable.** Check your top 10 highest-revenue category URLs with a crawl tool or browser plugin. PASS: No noindex tag or header on any of them. FAIL: One or more returns noindex.

**3. In-stock product pages are indexable.** Pull a sample of 50 active, in-stock product URLs. PASS: Zero noindex directives found. FAIL: Any in-stock product carries noindex.

**4. Out-of-stock product pages have an intentional noindex decision.** Decide deliberately: either keep out-of-stock pages indexed with a restock notice, or noindex them. PASS: Every out-of-stock page matches the documented policy. FAIL: Out-of-stock status is inconsistently applied or unreviewed.

**5. Faceted navigation and filter URLs are consistently handled.** Filtered URLs (e.g., /shoes?color=red&size=10) should either be noindexed or canonicalized to the base category. PASS: All filter parameter URLs have either noindex or a canonical pointing to the non-filtered version. FAIL: Filter URLs are indexed without canonicals, creating duplicate content at scale.

**6. Internal search result pages are noindexed.** Pages at /search?q= or equivalent expose low-quality, duplicate content to crawlers. PASS: All internal search result URLs return a noindex directive. FAIL: Any /search? URL is crawlable and indexable.

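**7. Cart, checkout, and account pages are noindexed.** These pages have no organic search value and may contain user-specific content. PASS: /cart, /checkout, /account, and /order-confirmation each return a noindex directive and stay crawlable, at least until they have dropped out of the index, so search engines can actually read that directive (see item 10). FAIL: Any of these pages is indexable, or a robots.txt Disallow is the only safeguard.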

**8. Thank-you and order confirmation pages are noindexed.** PASS: Post-purchase URLs (/thank-you, /order-confirmed) carry a noindex directive. FAIL: Any of these URLs lacks the directive or appears as indexed in Google Search Console's Page indexing report (formerly the Coverage report).

**9. Staging or preview environments are fully noindexed.** If a staging subdomain (staging.example.com) is publicly accessible, every page must carry noindex. PASS: A crawl of the staging domain finds noindex on every URL. FAIL: Any staging URL is indexable.

**10. The robots.txt Disallow does not substitute for noindex on pages that need to be deindexed.** Pages blocked by robots.txt cannot be crawled but may still be indexed from external links. PASS: Every page that must be removed from the index carries a noindex directive and is not blocked by Disallow, so crawlers can fetch the page and read the directive. FAIL: Disallow is the only protection on a page that needs to be deindexed.

**11. noindex is not present inside <body> or on a redirecting URL.** Robots meta tags belong in the <head>; a tag that ends up in the <body> is not guaranteed to be honored consistently. A noindex on a URL that redirects is not evaluated either, because the crawler follows the redirect and processes the destination instead. PASS: All noindex meta tags sit inside the <head> of the final destination URL. FAIL: Any noindex tag is placed in the <body>, or the final URL after a redirect contains a conflicting directive.

**12. Google Search Console's Page indexing report shows no unexpected noindex URLs.** Export the URLs listed under the reason "Excluded by 'noindex' tag" in GSC. PASS: Every URL on that list is intentionally noindexed per your policy document. FAIL: Any high-value product, category, or blog URL appears on that list.
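Several of the checks above (items 1 to 3, 9, and 11) reduce to the same question: does a given response carry noindex, and where? The following is a minimal sketch using only the Python standard library. The function name `audit_noindex` and its return keys are illustrative assumptions, it expects headers as a plain dict (so repeated X-Robots-Tag headers are not modeled), and it treats `none` as equivalent to noindex.

```python
from html.parser import HTMLParser


class _RobotsMetaParser(HTMLParser):
    """Collects robots meta directives, split by whether they sit inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.head_directives = []  # directives found inside <head>
        self.body_directives = []  # directives found anywhere else (item 11)

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "meta":
            attr = dict(attrs)
            if (attr.get("name") or "").lower() in ("robots", "googlebot"):
                tokens = [t.strip().lower() for t in (attr.get("content") or "").split(",")]
                target = self.head_directives if self.in_head else self.body_directives
                target.extend(tokens)

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def audit_noindex(headers, html):
    """Report where, if anywhere, a response declares noindex."""
    parser = _RobotsMetaParser()
    parser.feed(html)

    header_value = ""
    for key, value in headers.items():
        if key.lower() == "x-robots-tag":
            header_value = value.lower()
    # "googlebot: noindex" style values: keep only the part after the last colon.
    header_tokens = {t.split(":")[-1].strip() for t in header_value.split(",") if t.strip()}

    blocking = {"noindex", "none"}
    return {
        "meta_noindex": bool(blocking & set(parser.head_directives)),
        "header_noindex": bool(blocking & header_tokens),
        "misplaced_meta": bool(blocking & set(parser.body_directives)),
    }
```

For an indexable page (items 1 to 3), all three values should be False; for a staging page (item 9), either `meta_noindex` or `header_noindex` should be True; `misplaced_meta` being True is an item 11 failure.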

How to Run This Checklist Efficiently

Use a site crawler (Screaming Frog, Sitebulb, or any equivalent tool) to export the meta robots and X-Robots-Tag values for every URL in your sitemap plus every URL found by following internal links. Filter the export to show all URLs returning noindex. Cross-reference that list against your revenue-generating pages.
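The cross-referencing step can be sketched in a few lines of Python. The column names `url`, `meta_robots`, and `x_robots_tag` are assumptions for illustration; real crawler exports use their own headers, so map them accordingly.

```python
import csv
import io


def noindexed_revenue_urls(crawl_csv, revenue_urls):
    """Return revenue URLs whose crawl row reports noindex in either column."""
    # Normalize trailing slashes so /shoes and /shoes/ compare equal.
    wanted = {u.rstrip("/") for u in revenue_urls}
    hits = []
    for row in csv.DictReader(io.StringIO(crawl_csv)):
        meta = (row.get("meta_robots") or "").lower()
        header = (row.get("x_robots_tag") or "").lower()
        if ("noindex" in meta or "noindex" in header) and row["url"].rstrip("/") in wanted:
            hits.append(row["url"])
    return hits
```

Any URL this returns is a revenue page that the crawl saw as noindexed, which is the list to review first.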

For the HTTP header checks (items 1–3 and 9), use a bulk header checker or the crawler's response headers report. For GSC-specific checks (item 12), export directly from the Page indexing report (formerly Index Coverage) rather than relying on crawl estimates. Running both a crawler and GSC closes the gap between what your server sends and what Google actually sees.

Document your noindex policy before running the audit. Without a written policy defining which URL types are intentionally noindexed, pass/fail criteria are subjective. A one-page policy document covering facets, out-of-stock products, and utility pages turns this checklist from a one-time task into a repeatable governance process.
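One way to make such a policy testable is to express it as ordered URL rules. The sketch below is an assumption about how a policy might be encoded, not a prescribed format: the patterns and the `color` and `size` filter parameters are hypothetical examples drawn from the URLs mentioned in this checklist. Out-of-stock status cannot be read from a URL, so that part of the policy would need product data instead.

```python
import re

# Hypothetical policy: first matching pattern wins; True means "should be indexable".
POLICY = [
    (re.compile(r"^/search"), False),
    (re.compile(r"^/(cart|checkout|account|order-confirmation|thank-you)"), False),
    (re.compile(r"\?.*\b(color|size)="), False),  # facet and filter parameters
    (re.compile(r"^/"), True),                    # everything else
]


def expected_indexable(path_and_query):
    """Look up what the written policy says about this URL."""
    for pattern, indexable in POLICY:
        if pattern.search(path_and_query):
            return indexable
    return True


def verdict(path_and_query, has_noindex):
    """PASS when the observed noindex state matches the policy, else FAIL."""
    return "PASS" if expected_indexable(path_and_query) != has_noindex else "FAIL"
```

With the policy in this form, the pass/fail decision no longer depends on who runs the audit.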

The Most Common Failures on Ecommerce Sites

Items 5 and 6, faceted navigation and internal search pages, are the most frequent failures on large ecommerce stores. A site with 200 product filters can generate tens of thousands of unique URLs, each a potential noindex misconfiguration. Platforms that build filter URLs dynamically often lack a global rule to noindex them, requiring a custom implementation at the template level.

Item 10, confusing robots.txt Disallow with noindex, is the most dangerous misunderstanding. Store operators who Disallow /checkout/ in robots.txt are protected only from crawling, not from indexation, and any noindex tag on those pages goes unread because the crawler never fetches them. If an external site links to a checkout page, Google can index the URL from that link signal alone without ever crawling it.
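A quick way to spot this conflict is to test the URLs you want deindexed against robots.txt. The sketch below uses Python's standard `urllib.robotparser`; the function name is illustrative, and the standard-library parser does not replicate every matching rule Google applies (wildcards in particular), so treat the result as a first pass.

```python
from urllib.robotparser import RobotFileParser


def disallow_only_urls(robots_txt, urls_needing_deindex, user_agent="Googlebot"):
    """Return URLs that crawlers cannot fetch, so a noindex on them would go unread."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls_needing_deindex if not parser.can_fetch(user_agent, u)]


robots = "User-agent: *\nDisallow: /checkout/\n"
blocked = disallow_only_urls(
    robots,
    ["https://example.com/checkout/step-1", "https://example.com/thank-you"],
)
```

Every URL in `blocked` fails item 10 as long as it still needs to be removed from the index.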

Actionable Next Step After the Audit

Prioritize fixes by revenue impact. Any FAIL on items 1–3 (homepage, categories, products) requires same-day remediation. FAILs on items 5–8 require a developer sprint within the current two-week cycle. FAILs on items 9–12 are important but carry less immediate revenue risk and can be scheduled in the next cycle.
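If audit results are tracked in a script or ticketing export, the tiers above can be applied mechanically. This is a small sketch with hypothetical names; note that the tiers as written do not assign item 4, so the code labels it rather than guessing.

```python
# (item numbers, sort rank, label) taken from the tiers described above.
TIERS = [
    (range(1, 4), 0, "same day"),
    (range(5, 9), 1, "current two-week sprint"),
    (range(9, 13), 2, "next cycle"),
]


def triage(failed_items):
    """Sort failed checklist item numbers by urgency and label each with its tier."""
    ranked = []
    for item in failed_items:
        for items, rank, label in TIERS:
            if item in items:
                ranked.append((rank, item, label))
                break
        else:
            # Not covered by any tier in the text (item 4, for example).
            ranked.append((len(TIERS), item, "unassigned in the text"))
    return [(item, label) for _, item, label in sorted(ranked)]
```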

After each fix is deployed, re-crawl the affected URL set and revalidate against the checklist criteria before closing the ticket. Submit affected URLs for re-indexing via Google Search Console's URL Inspection tool where pages have been incorrectly excluded. Set a calendar reminder to run the full 12-item audit quarterly, or after any major platform upgrade that touches template files or robots.txt.

Frequently asked questions

What is the fastest way to find all noindexed pages on my ecommerce store?

Run a full site crawl using a tool like Screaming Frog with JavaScript rendering enabled, then filter the results to URLs whose meta robots or X-Robots-Tag value contains 'noindex'. Cross-reference the output with the URLs listed under "Excluded by 'noindex' tag" in Google Search Console's Page indexing report. The combination catches both server-rendered and client-rendered noindex directives that a single source misses.

Should out-of-stock product pages be noindexed?

There is no universal rule. Pages for temporarily out-of-stock products that will restock should stay indexed, because removing and reindexing them repeatedly loses accumulated link equity and ranking history. Pages for permanently discontinued products with no replacement should be noindexed and eventually redirected or removed. The critical requirement is a documented, consistently applied policy, not the choice itself.

Does adding noindex to a page in robots.txt work the same as a meta tag?

No. robots.txt cannot noindex a page; it can only block crawling. A page blocked by Disallow in robots.txt can still appear in Google's index if external sites link to it. To guarantee a page is removed from the index, the noindex directive must be in the page's HTTP response header or <head> meta tag, and the page must be crawlable so Google can read that directive.

How do I noindex filtered category URLs without noindexing the base category page?

Apply the noindex directive conditionally at the template level: if the URL contains any query parameters associated with facets or filters, inject the noindex meta tag. The base category URL, which has no such parameters, never meets that condition and remains indexable. Alternatively, use canonical tags pointing all filtered variants to the base URL, though a canonical is only a hint that search engines may override, whereas noindex is a directive.
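The conditional logic can be sketched as a small helper that a template calls when rendering the <head>. The parameter names in `FILTER_PARAMS` are hypothetical; substitute the facet parameters your platform actually generates.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical facet parameters; replace with the ones your store uses.
FILTER_PARAMS = {"color", "size"}


def robots_meta_for(url):
    """Return a noindex meta tag for filtered URLs, or an empty string otherwise."""
    params = parse_qs(urlsplit(url).query, keep_blank_values=True)
    if FILTER_PARAMS & set(params):
        return '<meta name="robots" content="noindex">'
    return ""
```

Matching on an explicit parameter list, rather than on any query string at all, keeps tracking parameters and pagination from being noindexed by accident.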

Can a noindex directive be accidentally introduced by a third-party app or plugin?

Yes. Apps that add SEO headers, preview functionality, or A/B testing scripts sometimes inject noindex directives, especially on staging environments that get promoted to production. Item 9 of this checklist exists for exactly this reason. After installing any new app that modifies HTTP headers or injects code into the <head>, run a spot crawl of your top 20 URLs to confirm no noindex was introduced.

Written by

Matt is the founder of RunOctopus. He built All Angles Creatures from zero to page-1 rankings in reptile feeder insects in under 60 days using exactly this method, turning a hard, entrenched niche into RunOctopus's proof store for programmatic SEO and AI search citation.
