Massive URL blockage by robots.txt

moneywise_test

Hello people,

In May there was a dramatic increase in URLs blocked by robots.txt, even though we don't have that many URLs or crawl errors. You can view the attachment to see how the number went up. The thing is, the company hasn't touched the robots.txt file since 2012. What might be causing the problem? Can this result in any penalties? Can indexation be lowered because of this?


CleverPhD

Even though there are fewer pages indexed than blocked, you still have a significant increase in indexed pages as well. That is a good thing! You technically have more pages indexed than before. It looks like you possibly relaunched the site or something? More pages blocked could be an indexing problem, or it might be a good thing. It all depends on which pages are being blocked.

If you relaunched the site and used this great new whiz-bang CMS that created an online catalog that gave your users 54 ways to sort your product catalog, then the number of "pages" could increase with each sort. Just imagine: sort your widgets by color, or by size, or by price, or by price and size, or by size and color, or by color and price. You get the idea. Very quickly you have a bunch of duplicate versions of a single page. If your SEO was on his or her toes, they would account for this using a canonical approach, or possibly a meta noindex, or a change to robots.txt, etc. That would be good, as you are not going to confuse Google with all the different versions of the same page.
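To put rough numbers on that, here is a minimal sketch in Python. The sort keys, the `sort` query parameter, and the example.com URL are all made up for illustration, and I'm assuming the order of the keys matters, so "price then size" is a different URL from "size then price". It lists every sorted view of one catalog page and shows that they all collapse back to the same canonical URL once the sort parameter is dropped.

```python
from itertools import permutations
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical catalog page and sort keys (not taken from the real site).
BASE = "https://example.com/widgets"
SORT_KEYS = ("color", "size", "price")


def sorted_variants(base=BASE, keys=SORT_KEYS):
    """Return one URL per possible sort order of the given keys."""
    urls = []
    for n in range(1, len(keys) + 1):
        for combo in permutations(keys, n):
            query = urlencode({"sort": ",".join(combo)})
            urls.append(f"{base}?{query}")
    return urls


def canonical(url, ignored=("sort",)):
    """Drop the parameters that only reorder content, keep everything else."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in ignored]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


variants = sorted_variants()
print(len(variants))                      # 15 crawlable URLs...
print({canonical(u) for u in variants})   # ...for one real page
```

Three sort keys already give 15 URLs for a single page, so a catalog with a few hundred products can easily produce thousands of near-duplicate URLs.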

Ultimately, Shailendra has the approach that you need to take. Look in robots.txt, look at the code on your pages. What happened around 5/26/2013? All those things need to be looked at to try and answer your question.

Chris.Menke

Le Fras,

The robots.txt file doesn't have to change for Google to report that more URLs are being blocked by it. The robots.txt file tells the search engines not to crawl given URLs, but they may still keep those URLs in the index and display them in the search results.

So the search engines do know of the URLs that are being blocked and they are able to indicate that more are being blocked as you add pages to your site that are restricted by the robots.txt file.
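Here is a small sketch of that effect using Python's standard `urllib.robotparser`. The robots.txt rule and the example.com URLs are hypothetical, chosen only to show the mechanism: the file stays exactly the same, but the blocked count rises as new URLs appear under a disallowed path.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that stays the same in both snapshots below.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search/
"""


def count_blocked(urls, robots_txt=ROBOTS_TXT, agent="Googlebot"):
    """Count how many of the given URLs the rules forbid the agent to crawl."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return sum(1 for url in urls if not parser.can_fetch(agent, url))


# Made-up snapshot of the site before new pages were added.
site_before = [
    "https://example.com/",
    "https://example.com/widgets",
    "https://example.com/search/?q=widgets",
]

# Same site later, with new sorted search views under the blocked path.
site_after = site_before + [
    f"https://example.com/search/?q=widgets&sort={key}"
    for key in ("color", "size", "price")
]

print(count_blocked(site_before))  # 1
print(count_blocked(site_after))   # 4
```

Note that Python's parser does simple prefix matching, so it won't behave exactly like Googlebot for wildcard rules. It is only meant to illustrate the idea.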

IM_Learner

Check your robots.txt file. Are there entries that block crawling? If you can share the URL, that would be helpful.
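If it helps, a quick way to see those entries at a glance is something like the sketch below. It is a simplified reader, not a full robots.txt parser, and the sample file contents are invented. It just lists each non-empty Disallow path next to the user-agent it applies to.

```python
def disallow_rules(robots_txt):
    """Return (user_agent, path) pairs for every non-empty Disallow line."""
    rules = []
    agents = []       # user-agents of the group currently being read
    in_rules = False  # True once the current group has Allow/Disallow lines
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                agents = []
                in_rules = False
            agents.append(value)
        elif field in ("allow", "disallow"):
            in_rules = True
            if field == "disallow" and value:  # empty Disallow blocks nothing
                rules.extend((agent, value) for agent in agents)
    return rules


# Invented sample file, only for illustration.
SAMPLE = """\
User-agent: *
Disallow: /search/
Disallow:

User-agent: Googlebot
Disallow: /catalog/sort/  # sorted views
"""

for agent, path in disallow_rules(SAMPLE):
    print(f"{agent}: {path}")
```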

Regards
