Crawl Budget and Faceted Navigation

Webpresence

Hi, we have an ecommerce website with faceted navigation for the various options available.

Google has 3.4 million of our webpages indexed, many of which are over 90% duplicate content.

Due to our low domain authority (15/100), Google is only crawling around 4,500 webpages per day, which we would like to increase.

We know that, in order not to waste crawl budget, we should use robots.txt to disallow parameter URLs (e.g. ?option=, ?search=, etc.). This makes sense, as it would resolve many of the duplicate content issues and force Google to crawl only the main category pages, product pages, etc.
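To get a feel for how many URLs such a rule would touch, something like the following could be run over a URL export or a server log. It is only a minimal sketch: the parameter names `option` and `search` come from the examples above, while `FACET_PARAMS`, `is_facet_url`, `split_urls`, and the example.com URLs are hypothetical names made up for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# Parameter names taken from the examples above; extend the set for a real site.
FACET_PARAMS = {"option", "search"}


def is_facet_url(url: str) -> bool:
    """Return True if the URL's query string uses any faceted-navigation parameter."""
    query = urlsplit(url).query
    # keep_blank_values=True so that "?option=" (empty value) still counts.
    params = parse_qs(query, keep_blank_values=True)
    return any(name in FACET_PARAMS for name in params)


def split_urls(urls):
    """Partition URLs into (facet_urls, clean_urls), preserving input order."""
    facet, clean = [], []
    for url in urls:
        (facet if is_facet_url(url) else clean).append(url)
    return facet, clean


if __name__ == "__main__":
    # Hypothetical sample; replace with URLs from a crawl export or log file.
    sample = [
        "https://example.com/shoes",
        "https://example.com/shoes?option=red",
        "https://example.com/results?search=boots",
        "https://example.com/shoes?page=2",
    ]
    facet, clean = split_urls(sample)
    print(len(facet), "parameter URLs;", len(clean), "other URLs")
```

Comparing the parameter URLs against the landing pages that actually receive clicks would show how much traffic a blanket disallow might put at risk.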

However, according to Google Search Console, these pages are getting a significant amount of organic traffic each month.

Is it worth disallowing these parameter URLs in robots.txt and hoping that this solves our crawl budget issues, thus helping Google to index and rank our most important webpages in less time?

Or is there a better solution?

Many thanks in advance.

Lee.

jcnotfound2083

Hello, I have been in a similar situation. What I did was disallow the URLs with parameters in robots.txt and place the following two HTML tags (only on the pages with parameters):

This expressly tells Google not to index these pages. I still have some errors, but I expect they will disappear in a few months.

Regards
