What is faceted navigation?
Faceted navigation lets users filter products by multiple attributes at once (color, size, brand, price range, material). It's essential for user experience on any e-commerce site with more than a few hundred products.
The problem? Each filter combination can generate a unique URL. A site with 10 filterable attributes, each with 10 options, could theoretically create over 10 billion unique URLs. Even a modest filter setup can produce hundreds of thousands of pages that Google tries to crawl.
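The arithmetic behind that figure is easy to check. The sketch below assumes each attribute is either left unset or set to exactly one option (multi-select filters would produce even more combinations); the function name is illustrative.

```python
def count_filter_urls(attributes: int, options_per_attribute: int) -> int:
    """Count URL variants when each attribute is unset or set to one option."""
    # Each attribute contributes (options + 1) states: one per option,
    # plus "filter not applied". The total includes the unfiltered page.
    return (options_per_attribute + 1) ** attributes


print(f"{count_filter_urls(10, 10):,}")  # 25,937,424,601
print(f"{count_filter_urls(5, 10):,}")   # 161,051
```

Even the five-attribute case lands in the hundreds of thousands, which is why a modest filter setup is enough to cause crawl problems.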
Google's own documentation calls faceted navigation “the most common source of overcrawl issues.” In most cases, the problem could have been avoided by following best practices.
The trade-off every e-commerce site faces
Most teams end up at one extreme or the other. Neither approach captures the revenue opportunity.
Block everything
- ×Miss long-tail searches like “blue velvet sofas under £2000”
- ×Leave revenue on the table from high-intent queries
- ×Competitors with better filter pages outrank you
- ×No visibility in AI search tools that expand queries
Index everything
- ×Crawl budget wasted on redundant filter combinations
- ×Duplicate content dilutes ranking signals
- ×Internal link equity spread too thin
- ×Thin pages with no unique content indexed
The solution: index filter combinations that have search demand and provide unique value. Block everything else.
When to index a filter combination
A filtered page should only be indexed if it meets all of these criteria. If any fails, block the URL or canonicalise it to its parent category.
Search demand exists
Keyword research shows people actually search for this combination. 'Blue twin comforters' has volume; 'blue size-8 cotton blend t-shirts sorted by price' doesn't.
Unique user intent
The filtered page serves a meaningfully different need than its parent category. 'Women's running shoes' differs from 'shoes'; 'shoes sorted by newest' doesn't.
Sufficient products
The filter returns enough products to be useful (typically 10+). Empty filter results should return 404s, and near-empty results should be kept out of the index rather than indexed as thin pages.
Conversion potential
The query indicates purchase intent. 'Leather office chairs' signals buying mode; 'office chair reviews' might not.
Unique content opportunity
You can add valuable content beyond the filtered product list: buying guides, size charts, material comparisons, FAQs.
Stable over time
The page represents a durable category, not a temporary state. 'In stock items' changes constantly; 'women's winter boots' is stable.
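The six criteria above can be expressed as a single all-or-nothing check. This is a rough sketch, not a definitive scoring model: the field names and the search-volume threshold are assumptions, and only the 10-product minimum comes from the guidance above.

```python
from dataclasses import dataclass


@dataclass
class FilterCombination:
    """Facts about one filter combination (all field names are illustrative)."""
    monthly_searches: int   # from your keyword research
    distinct_intent: bool   # serves a different need than the parent category
    product_count: int      # products the filter returns
    purchase_intent: bool   # the query signals buying mode
    can_add_content: bool   # room for guides, size charts, FAQs
    stable: bool            # durable category, not a temporary state


MIN_MONTHLY_SEARCHES = 50  # assumed threshold; tune to your own keyword data
MIN_PRODUCTS = 10          # the "typically 10+" guideline


def should_index(combo: FilterCombination) -> bool:
    """Return True only if every criterion passes."""
    return all([
        combo.monthly_searches >= MIN_MONTHLY_SEARCHES,
        combo.distinct_intent,
        combo.product_count >= MIN_PRODUCTS,
        combo.purchase_intent,
        combo.can_add_content,
        combo.stable,
    ])
```

A combination that fails `should_index` is one to block or canonicalise, whichever criterion it failed.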
How to handle each filter type
Different filters have different SEO value. Here's the general guidance; always validate with your own keyword data.
Often worth indexing
Brand + Category
“Nike running shoes”, “Herman Miller office chairs”
Material + Product type
“Leather sofas”, “Cotton bedding”
Specific size combinations
“Women's boots size 6”, “Twin comforter sets”
Style + Category
“Mid-century modern furniture”, “Minimalist desk lamps”
Colour + high-demand product
“Blue velvet sofas”, “White kitchen cabinets”
Use case filters
“Outdoor dining furniture”, “Gaming monitors”
Block or noindex
Sort parameters
?sort=price-low, ?sort=newest: never create unique pages
Session/tracking IDs
?sessionid=abc123: creates infinite URL variations
Price range filters
?price=50-100: too dynamic, little search demand
Availability filters
?in_stock=true: temporary states that change constantly
3+ filter combinations
Too specific, too thin, rarely searched
Pagination with filters
?color=red&page=5: canonical to first page
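The "block or noindex" rules above lend themselves to a simple URL classifier. The sketch below assumes your filters are query parameters named as in the examples (`sort`, `sessionid`, `price`, `in_stock`, `page`); the function and its return labels are hypothetical. Anything it returns as a "candidate" still needs validating against keyword data.

```python
from urllib.parse import parse_qsl, urlsplit

# Parameter names are assumptions taken from the examples above.
NEVER_INDEX = {"sort", "sessionid", "price", "in_stock"}
MAX_INDEXABLE_FILTERS = 2  # three or more filters are treated as too thin


def classify_filter_url(url: str) -> str:
    """Return 'block', 'canonical-to-first-page', or 'candidate'."""
    params = dict(parse_qsl(urlsplit(url).query, keep_blank_values=True))
    if NEVER_INDEX.intersection(params):
        return "block"
    # Everything except the pagination parameter counts as a filter.
    filters = [name for name in params if name != "page"]
    if len(filters) > MAX_INDEXABLE_FILTERS:
        return "block"
    if "page" in params and filters:
        return "canonical-to-first-page"
    return "candidate"
```

For example, `classify_filter_url("/shoes?color=red&page=5")` returns `"canonical-to-first-page"`, while `classify_filter_url("/shoes?sort=price-low")` returns `"block"`.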
Technical implementation strategies
There are several ways to control how search engines handle filtered URLs. Use the right tool for each scenario.
robots.txt blocking
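Prevents crawling entirely. Best for parameters you never want crawled: sort orders, session IDs, 3+ filter combinations. Most efficient for preserving crawl budget. Bear in mind that robots.txt blocks crawling rather than indexing, so a blocked URL can still appear in the index (without its content) if other pages link to it.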
User-agent: *
Disallow: /*?*sort=
Disallow: /*?*sessionid=
Disallow: /*/filters/
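Wildcard rules are easy to get subtly wrong, so it helps to test them against sample URLs before deploying. Python's built-in `urllib.robotparser` does not handle `*` wildcards, so the sketch below uses a small hand-written matcher instead. It is a simplification that checks Disallow patterns only and ignores Allow rules and rule precedence; both function names are illustrative.

```python
import re


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern ('*' wildcard, optional '$' anchor)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with "any characters".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(path_and_query: str, disallow_patterns: list[str]) -> bool:
    """True if any Disallow pattern matches from the start of the path."""
    return any(
        robots_pattern_to_regex(p).match(path_and_query)
        for p in disallow_patterns
    )


# '/*?sort=' only matches when sort is the first parameter...
print(is_disallowed("/shoes?color=red&sort=price-low", ["/*?sort="]))   # False
# ...whereas '/*?*sort=' matches it anywhere in the query string.
print(is_disallowed("/shoes?color=red&sort=price-low", ["/*?*sort="]))  # True
```

The second form is the safer choice when a parameter can appear in any position.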
Canonical tags
Consolidates ranking signals to a preferred URL. Use when filtered pages can be crawled but shouldn't rank independently. Point low-value filters to their parent category.
<!-- On /shoes?color=red -->
<link rel="canonical" href="https://example.com/shoes" />
noindex, follow
Allows crawling (for link equity) but prevents indexation. Use sparingly; it still consumes crawl budget. Over time, Google may reduce crawling of noindexed pages.
<meta name="robots" content="noindex, follow">
URL fragments (JavaScript filtering)
Content after # is ignored by search engines. Use for presentation-only filters that shouldn't create separate URLs at all.
/shoes#color=red&size=10 // Google only sees /shoes
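You can see the same behaviour with a standard URL parser: the fragment is a separate component that is never sent to the server. A minimal sketch using Python's `urllib.parse`; the helper name is illustrative.

```python
from urllib.parse import urldefrag


def crawlable_url(url: str) -> str:
    """Return the URL with its #fragment removed."""
    return urldefrag(url).url


print(crawlable_url("https://example.com/shoes#color=red&size=10"))
# https://example.com/shoes
```

Every fragment variation therefore collapses to the same page, which is what makes fragments safe for presentation-only filters.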
Self-referencing canonicals
For high-value filter pages you want to rank. The page's canonical tag points to itself, signalling it deserves independent indexation.
<!-- On /shoes/womens-running -->
<link rel="canonical" href="https://example.com/shoes/womens-running" />
404 for empty results
Google explicitly recommends returning 404 status codes when filter combinations produce no results. Don't redirect or serve soft 404s.
// If filter returns 0 products
return res.status(404)
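The same rule can be written independently of any web framework. The sketch below only picks the status code; the function name is an assumption, and wiring it into your own request handler is left to you.

```python
from http import HTTPStatus


def status_for_filter_results(product_count: int) -> HTTPStatus:
    """Pick the HTTP status for a filtered listing page."""
    if product_count == 0:
        # A real 404: not a redirect, and not a 200 page that merely
        # says "no results" (which would be a soft 404).
        return HTTPStatus.NOT_FOUND
    return HTTPStatus.OK
```

The page body can still show a helpful "no products match" message; what matters to crawlers is the status code sent with it.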
URL structure best practices
How you structure filter URLs affects both crawlability and user experience.
Good practices
- Use consistent parameter ordering (alphabetical)
- Standard & separator for parameters
- Clean, readable paths for high-value filters (/shoes/womens-running)
- Normalise URLs server-side to prevent duplicates
- Use static paths for index-worthy filters
Avoid these mistakes
- ×Random parameter ordering creating duplicate URLs
- ×Non-standard separators (commas, semicolons, brackets)
- ×Infinite URL depth with no crawl limits
- ×Allowing both /shoes?color=red and /shoes?colour=red
- ×Encoding the same filter in multiple URL formats
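Server-side normalisation is the practice that prevents most of these mistakes. Here is one way it might look, assuming query-parameter filters with case-insensitive names and values; the alias and tracking-parameter lists are hypothetical and would need adapting to your site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical alias and tracking-parameter lists; adapt to your own site.
PARAM_ALIASES = {"colour": "color"}
DROP_PARAMS = {"sessionid", "utm_source", "utm_medium", "utm_campaign"}


def normalise_filter_url(url: str) -> str:
    """Return one canonical spelling and ordering for a filtered URL."""
    parts = urlsplit(url)
    params = []
    for name, value in parse_qsl(parts.query, keep_blank_values=True):
        name = PARAM_ALIASES.get(name.lower(), name.lower())
        if name not in DROP_PARAMS:
            # Assumes filter values are case-insensitive on your site.
            params.append((name, value.lower()))
    params.sort()              # consistent alphabetical parameter order
    query = urlencode(params)  # joins with the standard '&' separator
    # The fragment is dropped because it never reaches the server.
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path, query, ""))


print(normalise_filter_url("https://Example.com/shoes?size=10&Colour=Red"))
# https://example.com/shoes?color=red&size=10
```

A typical use is to compare the requested URL with its normalised form and issue a 301 redirect whenever the two differ.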
The Wayfair model: tiered URL depth
Wayfair uses path-based filtering with clear depth limits: /sb1/twin-comforters (one filter, indexed), /sb2/blue-twin-comforters (two filters, indexed), /filters/blue-twin-cotton-comforters (three+ filters, blocked via robots.txt). This captures moderate-specificity queries without index bloat.
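A tiered scheme like this is straightforward to reproduce. The sketch below mirrors the path prefixes described above, but it is an illustration rather than Wayfair's actual implementation: the function name, the slug format, and the handling of the unfiltered category are all assumptions.

```python
def tiered_filter_path(filters: list[str], category: str) -> tuple[str, bool]:
    """Return (path, indexable) for a category with the given filters applied."""
    slug = "-".join([*filters, category])
    depth = len(filters)
    if depth == 0:
        return f"/{category}", True  # assumed path for the unfiltered category
    if depth <= 2:
        return f"/sb{depth}/{slug}", True
    # Three or more filters live under /filters/, which robots.txt disallows.
    return f"/filters/{slug}", False


print(tiered_filter_path(["blue", "twin"], "comforters"))
# ('/sb2/blue-twin-comforters', True)
```

Because depth is encoded in the path prefix, a single robots.txt rule covers every over-specific combination.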
Filter pages need more than filtered products
A filtered page that just shows a subset of products is thin content. Google increasingly devalues pages that don't provide unique value beyond what the parent category offers.
For filter pages you want to rank, add content that helps users make decisions:
- Unique title and H1 that match the search query
- Descriptive copy explaining what makes this category special
- Buying guides or feature explanations
- FAQs addressing common questions about the filter type
- Related category links helping users refine or expand their search
This is where most teams get stuck. Creating unique content for thousands of filter combinations isn't feasible manually.
How Similar AI approaches this differently
Rather than trying to manage faceted navigation (blocking some URLs, canonicalising others, and hoping you got the trade-offs right), Similar AI takes a different approach.
The research engine identifies which filter combinations have actual search demand. Instead of indexing dynamic filtered URLs, the page creation agent builds dedicated category pages for those high-value combinations.
Each page gets unique content generated by AI: not just a product grid, but helpful copy, relevant links, and optimised metadata. The result is that you capture long-tail demand without the crawl budget problems of traditional faceted navigation.
Common faceted navigation mistakes
These are the issues we see most often when auditing e-commerce sites.
Relying on rel="nofollow" to control crawling
Google treats nofollow as a hint, not a directive. If Google finds the URL another way, it may still crawl and index it. Use robots.txt for reliable blocking.
Canonical tags pointing to noindexed pages
This creates conflicting signals. Google doesn't know if you want the page indexed or not. If the canonical target is noindexed, the whole cluster may drop from the index.
Inconsistent URL parameter handling
If /shoes?color=red and /shoes?colour=red both exist, you've created duplicate content. Normalise spelling and parameter order server-side.
Blocking everything and hoping for the best
You're leaving long-tail traffic on the table. Competitors who create dedicated pages for 'women's running shoes size 7' will outrank your blocked filter URLs.
Letting JavaScript render filtered URLs
If your client-side filtering generates URLs, search engines will try to crawl them. Use URL fragments (#) for JavaScript filters, or ensure AJAX filtering doesn't create new URLs.
Forgetting about internal linking
If you link to filtered URLs from your main navigation, you're signalling importance. Link only to canonical versions of high-value categories.
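Some of these mistakes can be caught automatically during an audit. As one example, the sketch below flags canonical tags that point to noindexed pages. It assumes you already have crawl data as a dictionary keyed by URL; that data shape and the field names are hypothetical.

```python
def find_canonical_conflicts(pages: dict[str, dict]) -> list[str]:
    """Return URLs whose canonical target is marked noindex."""
    conflicts = []
    for url, page in pages.items():
        # A page with no canonical recorded is treated as self-canonical.
        target = page.get("canonical", url)
        if target != url and pages.get(target, {}).get("noindex", False):
            conflicts.append(url)
    return conflicts


crawl = {
    "/shoes": {"noindex": True},
    "/shoes?color=red": {"canonical": "/shoes"},
    "/sofas": {},
    "/sofas?material=leather": {"canonical": "/sofas"},
}
print(find_canonical_conflicts(crawl))  # ['/shoes?color=red']
```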
Frequently asked questions
What happened to Google's URL Parameters tool?
Google deprecated the URL Parameters tool in Google Search Console in 2022. The recommendation now is to handle parameter management server-side through robots.txt, canonical tags, and proper URL structure.
Should I use path-based or query parameter filtering?
For filters you want indexed, path-based URLs (/shoes/womens-running) look cleaner and signal standalone pages. For filters you don't want indexed, query parameters (?sort=price) make blocking easier. Many sites use a hybrid: paths for high-value filters, parameters for low-value ones.
How many filter combinations is too many?
There's no magic number, but if you're creating more than 2-3x your actual product count in indexable URLs, you're likely over-indexing. A site with 10,000 products shouldn't have 500,000 indexed filter pages.
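That rule of thumb is a one-line ratio check. A small sketch, using the upper end of the 2-3x guideline as the default limit; the function name is illustrative.

```python
def is_over_indexed(indexable_urls: int, product_count: int,
                    max_ratio: float = 3.0) -> bool:
    """True if indexable URLs exceed max_ratio times the product count."""
    return indexable_urls / product_count > max_ratio


print(500_000 / 10_000)                  # 50.0
print(is_over_indexed(500_000, 10_000))  # True
print(is_over_indexed(25_000, 10_000))   # False
```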
What about AI search tools like ChatGPT and Perplexity?
AI search tools expand user queries into multiple searches, making long-tail filter combinations more valuable. A question like 'best office chair for back pain under $500' might trigger searches for multiple specific filter combinations. Having dedicated pages for these increases your visibility.
How do I know which filter combinations have search demand?
Use keyword research tools to check volume for filter combinations. Look at Search Console data for impressions on existing filter URLs. Check competitor rankings; if they have dedicated pages ranking for filter combinations, there's demand.
Can I use AJAX filtering without creating SEO problems?
Yes, if done correctly. Use URL fragments (#) instead of query parameters for AJAX filters, so no new crawlable URLs are created. If you use history.pushState() to give users friendly, shareable URLs, treat those URLs like any other filtered URL and point their canonical tags at the page you want indexed. The key is preventing URL proliferation.
What this looks like in practice
Visual Comfort, a premium lighting retailer, used Similar AI to capture demand from product searches they were missing.
“The Similar AI platform's ability to swiftly align with our changing site experience is invaluable. The extra analytical power and proactive insights provided by Similar AI have been essential for our lean team.”
Jennifer Skeen
VP of eCommerce, Visual Comfort
Stop managing faceted navigation. Start capturing demand.
Similar AI identifies the category pages your site is missing and creates them with the content and structure search engines reward.