How to Use This Duplicate Content Audit Checklist
Duplicate content on an ecommerce store wastes crawl budget, splits link equity across redundant URLs, and suppresses rankings for pages that should dominate search results. This checklist covers the 12 most common duplication failure points specific to ecommerce, from faceted navigation to manufacturer descriptions, each with a concrete pass/fail test you can run today.
Work through each item using a combination of Google Search Console, a site-crawl tool such as Screaming Frog or Sitebulb, and manual URL inspection. Flag every failing item as a remediation task with a priority level: critical (directly harms rankings), moderate (wastes crawl budget), or low (clean-up only).
Checklist Items 1–4: URL Structure and Canonicalization
**1. Canonical tags on all product and category pages.** PASS: Every indexable product and category URL contains a self-referencing canonical tag in the `<head>`. FAIL: Any page is missing a canonical, has a canonical pointing to a 404, or has conflicting canonical signals (e.g., canonical says URL-A but sitemap lists URL-B).
**2. HTTP vs. HTTPS duplication.** PASS: All HTTP versions of URLs return a 301 redirect to the HTTPS equivalent; no HTTP page is indexable. FAIL: Both HTTP and HTTPS versions of any page return a 200 status and are accessible to crawlers.
**3. Trailing-slash consistency.** PASS: Exactly one of `/category/shoes` and `/category/shoes/` returns a 200 and is treated as the canonical version; the other 301-redirects to it. FAIL: Both versions return 200 and serve the same content without a canonical relationship.
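**4. WWW vs. non-WWW duplication.** PASS: One preferred host (www or non-www) is used consistently in canonicals, internal links, and the XML sitemap; the other redirects to it with a 301 sitewide. Search Console no longer offers a preferred-domain setting, so the redirect itself is the signal to verify. FAIL: Both `www.example.com` and `example.com` return 200 responses with identical content.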
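Items 2 through 4 can be tested together, since each one asks whether a non-preferred variant of a URL redirects to the preferred one. The following is a minimal sketch, not a full crawler: the function names `url_variants` and `audit_variants` are hypothetical, the preferred URL is assumed to include a path, and the `responses` mapping is assumed to come from your own crawl export or an HTTP client with redirect-following disabled. It also assumes a single-hop 301, so a redirect chain would be flagged for review.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def url_variants(preferred: str) -> list[str]:
    """List the scheme, host, and trailing-slash variants of a preferred URL."""
    parts = urlsplit(preferred)
    host = parts.netloc
    bare_host = host[4:] if host.startswith("www.") else host
    path = parts.path or "/"
    paths = {path}
    if path != "/":
        # Add the opposite trailing-slash form of the path.
        paths.add(path.rstrip("/") if path.endswith("/") else path + "/")
    variants = []
    for scheme in ("http", "https"):
        for variant_host in (bare_host, "www." + bare_host):
            for variant_path in sorted(paths):
                variants.append(
                    urlunsplit((scheme, variant_host, variant_path, parts.query, ""))
                )
    # Everything except the preferred URL itself should redirect.
    return [v for v in variants if v != preferred]


def audit_variants(
    preferred: str,
    responses: dict[str, tuple[Optional[int], Optional[str]]],
) -> list[tuple[str, Optional[int], Optional[str]]]:
    """Return variants that do not 301 directly to the preferred URL.

    `responses` maps a URL to (status code, Location header), collected
    without following redirects. Missing entries count as failures.
    """
    failures = []
    for variant in url_variants(preferred):
        status, location = responses.get(variant, (None, None))
        if status != 301 or location != preferred:
            failures.append((variant, status, location))
    return failures
```

An empty list from `audit_variants` corresponds to a PASS on items 2, 3, and 4 for that URL; any returned tuple identifies the variant to fix along with the status it actually returned.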
Checklist Items 5–7: Faceted Navigation and Filters
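**5. Faceted navigation URL handling.** PASS: Filter and sort parameters (e.g., `?color=red&size=M`) are handled by one consistent method: blocked via `robots.txt`, set to `noindex`, or consolidated under a canonical pointing to the base category URL. Do not combine a `robots.txt` block with either of the other two, because a crawler that cannot fetch the page never sees its `noindex` or canonical tag. FAIL: Parameter-generated URLs are indexable, return 200, and produce pages with near-identical content to the base category.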
**6. Sort-order parameter exclusion.** PASS: URLs containing sort parameters such as `?sort=price_asc` are excluded from indexing via canonical or `noindex` tags. FAIL: Sort-order variants are crawlable and indexed, fragmenting ranking signals for the base category page.
**7. Pagination canonicalization.** PASS: Paginated category pages (`/category/shoes?page=2`) either carry a self-referencing canonical (acceptable) or point back to page 1 (only appropriate when page 1 contains all significant content). FAIL: Paginated pages are indexed without canonical tags and compete with the root category URL for the same keyword cluster.
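For sites that choose the canonical option in items 5 through 7, the check reduces to two steps: work out which URL a parameterized page should point to, then compare that with the canonical tag the page actually serves. The sketch below illustrates this under stated assumptions: the parameter names in `FACET_PARAMS` are hypothetical placeholders for whatever your platform emits, `page` is deliberately kept so paginated URLs stay self-referencing, and pages handled via `noindex` or `robots.txt` would need a separate check.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical filter and sort parameter names; replace with your own.
FACET_PARAMS = {"color", "size", "sort"}


class CanonicalParser(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels and attr.get("href"):
            self.canonicals.append(attr["href"])


def expected_canonical(url: str) -> str:
    """Strip facet and sort parameters, keeping others such as `page`."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in FACET_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def check_facet_page(url: str, html: str) -> str:
    """Compare the served canonical with the expected consolidated URL."""
    parser = CanonicalParser()
    parser.feed(html)
    if len(parser.canonicals) != 1:
        return f"FAIL: expected one canonical tag, found {len(parser.canonicals)}"
    if parser.canonicals[0] != expected_canonical(url):
        return f"FAIL: canonical points to {parser.canonicals[0]}"
    return "PASS"
```

Comparing exact strings is intentionally strict; it will flag relative canonicals and host or trailing-slash mismatches, which is usually what an audit wants to surface.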
Checklist Items 8–10: Product Page Content
**8. Manufacturer or supplier description reuse.** PASS: Every product description is rewritten in original language; no block of text longer than two sentences matches the manufacturer's published copy verbatim. FAIL: Product descriptions are copy-pasted from supplier feeds or manufacturer spec sheets, duplicating content that appears on dozens or hundreds of other retail sites.
**9. Product variant pages (size, color, etc.).** PASS: Each product variant either (a) has a unique URL with a canonical pointing to the master product page, or (b) is served on a single URL with variant selection handled via JavaScript without producing separate indexable URLs. FAIL: Size or color variants generate separate indexable URLs with identical titles, descriptions, and body copy.
**10. Thin or templated product pages.** PASS: Every product page contains at least one unique content element beyond the product name, price, and image, such as a unique description, specifications table, or user-generated review content. FAIL: A significant portion of product pages share the same boilerplate description template with only the product name swapped in.
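The two-sentence threshold in item 8 can be approximated in code by counting the longest run of consecutive sentences that a product description shares with the manufacturer's copy. This is a rough sketch rather than a plagiarism detector: the function names are hypothetical, sentence splitting uses a simple punctuation rule, and matching is exact after lowercasing and whitespace normalization, so lightly reworded copy will not be caught.

```python
import re


def sentences(text: str) -> list[str]:
    """Split text on sentence-ending punctuation and normalize each piece."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [re.sub(r"\s+", " ", p).strip().lower() for p in parts if p.strip()]


def longest_shared_run(ours: str, theirs: str) -> int:
    """Length of the longest run of consecutive sentences found in both texts."""
    a, b = sentences(ours), sentences(theirs)
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous sentence pair.
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def passes_item_8(ours: str, theirs: str, max_run: int = 2) -> bool:
    """PASS when no verbatim block exceeds `max_run` sentences."""
    return longest_shared_run(ours, theirs) <= max_run
```

Running `passes_item_8` across a product feed paired with the supplier feed gives a quick shortlist of descriptions to rewrite first; a manual read is still needed for paraphrased copy.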
Checklist Items 11–12: Site Architecture and Cross-Domain Issues
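**11. Subdomain or staging site duplication.** PASS: Staging and development subdomains (e.g., `staging.example.com`) are password-protected or, at minimum, fully blocked via `robots.txt` disallow so they are not accessible to crawlers. Password protection is the stronger option, because a disallowed URL can still be indexed without its content if other sites link to it. Regional subdomains (e.g., `uk.example.com`) that are meant to rank should stay crawlable; audit them against the canonical and content items above instead of blocking them. FAIL: Any non-production version of the site is publicly crawlable and returns 200 responses for pages that mirror the live store.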
**12. Syndicated or cross-posted content.** PASS: Any product descriptions, blog posts, or category copy that appears on partner sites, marketplaces, or press outlets carries a canonical tag pointing back to the original URL on the primary store domain, or the partner's copy is set to `noindex`. FAIL: The same content exists on an external domain without a canonical attribution or `noindex`, and the external version ranks ahead of or competes directly with the store's own URL.
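The staging test in item 11 combines two observations: the status code a staging URL returns and whether its `robots.txt` permits crawling. A small sketch using Python's standard-library `urllib.robotparser` is shown below. The function names are hypothetical, the status code and `robots.txt` body are assumed to have been fetched separately, and the standard-library parser may not interpret wildcard rules the same way Google does, so treat the result as a first-pass signal.

```python
from urllib.robotparser import RobotFileParser


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Check whether the given robots.txt body allows `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def staging_check(status: int, robots_txt: str, url: str) -> str:
    """Classify a staging URL as PASS, FAIL, or REVIEW for item 11."""
    if status in (401, 403):
        # Authentication or access control keeps crawlers out entirely.
        return "PASS"
    if not is_crawlable(robots_txt, url):
        return "PASS"
    if status == 200:
        # Publicly reachable and not disallowed: the failing condition.
        return "FAIL"
    return "REVIEW"
```

For example, `staging_check(200, "User-agent: *\nDisallow: /", "https://staging.example.com/category/shoes")` passes because the whole host is disallowed, while the same call with an empty `robots.txt` fails.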
Prioritizing Fixes After the Audit
Sort failures into three tiers. Address critical fixes (missing canonicals, indexable staging sites, and HTTP/HTTPS duplication) immediately, because they directly dilute ranking signals for revenue-driving pages. Schedule moderate fixes (faceted navigation, variant page handling, and supplier descriptions) within the next sprint cycle, as they compound crawl waste over time.
Batch low-priority fixes (trailing-slash inconsistencies on low-traffic pages and syndicated blog content) into a quarterly cleanup. Document every fix with before/after URLs and re-crawl the affected sections within 30 days to confirm resolution. Use Google Search Console's URL Inspection tool to verify that corrected canonicals are recognized by Google's indexing system.
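The tiering above can be kept in a small lookup so that audit output is always sorted the same way. The sketch below is one possible encoding, not a prescribed one: the mapping covers only the items this section names explicitly, any other failing item is labeled `unassigned` for manual triage, and treating every trailing-slash failure as low priority is a simplification, since the text limits that tier to low-traffic pages.

```python
# Tier per checklist item number, taken from the tiers described above.
TIER_BY_ITEM = {
    1: "critical",   # missing canonicals
    2: "critical",   # HTTP/HTTPS duplication
    11: "critical",  # indexable staging sites
    5: "moderate",   # faceted navigation
    8: "moderate",   # supplier descriptions
    9: "moderate",   # variant page handling
    3: "low",        # trailing-slash inconsistencies (low-traffic pages)
    12: "low",       # syndicated blog content
}

TIER_ORDER = {"critical": 0, "moderate": 1, "low": 2, "unassigned": 3}


def remediation_queue(failed_items: list[int]) -> list[tuple[str, int]]:
    """Sort failing item numbers by tier, then by item number."""
    tagged = [(TIER_BY_ITEM.get(item, "unassigned"), item) for item in failed_items]
    return sorted(tagged, key=lambda pair: (TIER_ORDER[pair[0]], pair[1]))
```

Calling `remediation_queue([12, 5, 2, 7, 1])` places items 1 and 2 first, then 5, then 12, with item 7 last as `unassigned` until someone assigns it a tier.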