Accidentally blocked Googlebot for 14 days

Bull135

Today after I noticed a huge drop in organic traffic to inner pages of my sites, I looked into the code and realized a bug in last commit cause the server to showing captcha pages to all Googlebot requests from Apr 24.

My site has more than 4,000,000 in the index. Before last code change, Googlebot are exempt from being shown the captcha requests so each inner pages are crawled and indexed perfectly with no problem.

The bug broke the whitelisting mechanism and treat requests from Google's ip addresses the same as regular users. It leads to the captcha page being crawled when Googlebot visits thousands of my site's inner pages. This makes Google thinks all my inner pages are identical to each other. Google remove all the inner pages from SERP starting from May 5th before when many of those inner pages have good rankings.

I formerly thought this was a manual or algorithm penalty but

1. I did not receive a warning message in GWT
2. The ranking for main url is good.

I tried with "Fetch as Google" in GWT and realize all Googlebot saw in the past 14 days are the same captcha page for all my inner pages.

Now, I have fixed the bug and updated the production site. I just wanted to ask:

1. How long will it take for Google to remove the "duplicated content" flag on my inner pages and show them in SERP again? From my experience, Googlebot revisits urls quite often. But once a url is flagged as "contains similar content", it could be difficult to recover, is it correct?

2. Besides waiting for Google to update its index, what else can I do right now?

Thanks in advance for your answers.

Bull135

Thanks for the info. My site has current crawl rate at 350,00 pages per day so will take 10-20 days to crawl the entire sites.

Most of organic traffic comes to 10,000 urls while others are pagination urls etc. Now all the traffic 1st inner page of each term disappeared in the results of inurl: command.

EGOL

One of my competitors made this type of error and we figured it out right away when their site dropped from the SERPs. It took them a couple weeks to figure it out and make the change. We were hoping that they never figured it out so we could rake in lots of dough. When they fixed it they were back in the SERPs at full strength within a couple of days.... . but they had 40 indexed pages instead of 4,000,000.

I think you will recover well, but might take a while if you don't have a lot of deep links.

Good luck.

TakeshiYoung

Pretty much all you can do is wait for Google to recrawl your entire site. You can try re-submitting your site in Webmaster Tools (Health -> Fetch As Google). Getting links from other sites will help speed up the crawling as well. Links from social sites like Twitter/Google+ can help with crawling also.

Welcome to the Q&A Forum

Browse the forum for helpful insights and fresh discussions about all things SEO.

Accidentally blocked Googlebot for 14 days

Got a burning SEO question?

Browse Questions

Explore more categories

Related Questions

Increase in pages crawled per day

Why blocking a subfolder dropped indexed pages with 10%?

Will blocking the Wayback Machine (archive.org) have any impact on Google crawl and indexing/SEO?

Recovering from Blocked Pages Debaucle

Remove more than 1000 crawl errors from GWT in one day?

Feedburner - Why Sending My Blog Posts A Day After I Post Them?

Robots.txt blocking site or not?

How much javascript does Googlebot read