Our Robots.txt and Reconsideration Request Journey and Success
-
We have asked a few questions related to this process on Moz and wanted to give a breakdown of our journey as it will likely be helpful to others!
A couple of months ago, we updated our robots.txt file to block several pages that we did not want indexed. At the time, we weren't checking Google Webmaster Tools (WMT) as regularly as we should have been, and a few weeks later we found that one of the pages we were blocking was apparently a dynamic page, and blocking it led to over 950,000 of our pages being blocked according to Webmaster Tools. Which page caused this is still a mystery, but we quickly removed all of the entries.
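We never did pin down the culprit, but as an illustration of how a single broad rule can over-block (the rule and URLs below are made up for the example), Python's standard-library `urllib.robotparser` shows the prefix matching in action:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one broad rule. Disallow entries match by URL
# prefix, so "/search" also blocks every dynamic URL generated under it.
rules = """User-agent: *
Disallow: /search
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

print(rp.can_fetch("*", "http://example.com/about"))            # unrelated page: allowed
print(rp.can_fetch("*", "http://example.com/search?q=widget"))  # dynamic page: blocked
print(rp.can_fetch("*", "http://example.com/search/page/999"))  # dynamic page: blocked
```

One innocuous-looking line like that, pointed at the wrong path, is all it takes to knock hundreds of thousands of pages out of the index.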
From our research, most people say that things normalize in a few weeks, so we waited. A few weeks passed and things did not normalize. We searched, we asked, and we found that the number of "blocked" pages in WMT, which had grown at a rate of a few hundred thousand a week, was now decreasing at a rate of only a thousand a week. At that rate it would take a year or more for the pages to be unblocked.
This did not change. Two months later we were still at 840,000 pages blocked.
We posted on the Google Webmaster Forum and one of the mods there said that it would just take a long time to normalize. Very frustrating indeed considering how quickly the pages had been blocked.
We found a few places on the interwebs suggesting that if you have an issue/mistake with robots.txt, you can submit a reconsideration request. This seemed to be our only hope. So, we put together a detailed reconsideration request asking for help with our blocked-pages issue.
A few days later, to our horror, we did not get a message offering help with our robots.txt problem. Instead, we received a message saying that we had been given a penalty for inbound links that violate Google's terms of use. Major backfire. We had used an SEO company years ago that posted a hundred or so blog posts for us. To our knowledge, the links didn't even exist anymore. They did....
So, we signed up for an account with removeem.com. We quickly found many of the links posted by the SEO firm, as they were easily recognizable by their anchor text. We began the process of using removem to contact the owners of the blogs. To our surprise, we got a number of removals right away! Others we had to contact a second time, and many did not respond at all. For those we could not find an email address for, we tried posting comments on the blog.
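The "easily recognizable anchor text" point generalizes: paid links tend to reuse the same keyword-rich anchor verbatim across many unrelated domains. A minimal sketch of that check (the URLs and anchors below are invented; a real run would use your backlink export):

```python
from collections import Counter

def suspicious_anchors(backlinks, threshold=3):
    """backlinks: iterable of (source_url, anchor_text) pairs.

    Returns anchors repeated at least `threshold` times, most frequent
    first. Identical commercial anchors across many sources are the
    usual tell for paid placements.
    """
    counts = Counter(anchor.strip().lower() for _, anchor in backlinks)
    return [(a, n) for a, n in counts.most_common() if n >= threshold]

links = [
    ("http://blog-a.example/post1", "cheap blue widgets"),
    ("http://blog-b.example/post7", "Cheap Blue Widgets"),
    ("http://blog-c.example/review", "cheap blue widgets"),
    ("http://news.example/story", "Example Inc"),
]
print(suspicious_anchors(links))  # → [('cheap blue widgets', 3)]
```

A branded anchor like the company name appearing once per source is normal; the repeated commercial phrase is what to chase down.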
Once we felt we had removed as many as possible, we added the rest to a disavow list and uploaded it using the disavow tool in WMT. Then we waited...
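For anyone who hasn't built one: the disavow file itself is just a plain-text list, one URL or domain per line, with `#` for comments; a `domain:` entry disavows every link from that domain. The entries below are invented examples, not our actual file:

```text
# Links we attempted to remove via removem/email on several occasions
# but received no response.
domain:spammy-article-site.example
domain:paid-links-blog.example
# Individual URLs can also be listed:
http://some-blog.example/2011/05/paid-post.html
```

Including a comment trail of your removal attempts costs nothing and documents the effort for the reviewer.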
A few days later, we already had a response. DENIED. In our request, we had specifically asked that, if it were denied, Google provide some example links. When they denied our request, they sent us an email including a sample link. It was an interesting example: we actually already had this blog in removem. The issue in this case was that our version was a custom domain name, i.e. www.domainname.com, while the version Google had was a WordPress subdomain, i.e. www.subdomain.wordpress.com.
So, we went back to the drawing board. This time we signed up for Majestic SEO and tied it in with removem. That added a few more links. We also had records from the old SEO company, which we went through to locate a number of additional links. We repeated the previous process, contacting site owners and keeping track of our progress. We also went through the "sample links" in WMT as best we could (we have a lot of them) to try to pinpoint any other potential offenders.
We removed what we could and, again, disavowed the rest. A few days later, we had a message in WMT. DENIED AGAIN! This time it was very discouraging, as there just didn't seem to be any more links to remove. The difference this time was that there was NOT an email from Google, only a message in WMT. So, while we didn't know if we would receive a response, we replied to the original email asking for more example links so we could better understand what the issue was.
Several days passed, and then we received an email back saying that THE PENALTY HAD BEEN LIFTED! This was of course very good news, and it appeared that our email to Google had been reviewed and received well.
So, the final hurdle was the reason we originally contacted Google: our robots.txt issue. We had not received any information from Google about the robots.txt issue we originally filed the reconsideration request for. We didn't know if it had simply been ignored, or if something could still be done about it. So, as a last-ditch effort, we replied to the email once more and requested help with the robots.txt issue, as we had the other times.
The weekend passed and on Monday we checked WMT again. The number of blocked pages had dropped over the weekend from 840,000 to 440,000! Success! We are still waiting and hoping that number will continue downward back to zero.
So, some thoughts:
1. Was our site manually penalized from the beginning, just without a message in WMT? Or, when we filed the reconsideration request, did the reviewer take a closer look at our site, see the old paid links, and add the penalty at that time? If the latter is the case, then...
2. Did our reconsideration request backfire? Or, was it ultimately for the best?
3. When asking for reconsideration, make your requests known. If you want example links, ask for them. It never hurts to ask! If you want to be contacted by Google via email, ask to be!
4. If you receive an email from Google, don't be afraid to respond to it. I wouldn't overdo this or spam them. Keep it to the bare minimum and don't pester them, but if you have something pertinent to say that you haven't already said, don't be afraid to ask.
Hopefully our journey will help others who have similar issues. Feel free to ask any further questions!
Thanks for reading!
TheCraig
-
Considering this thread has only 36 views, I think you should go ahead and post on YouMoz, as I think it deserves more exposure (maybe adding Pieter's point and your warning about not blindly following removem).
-
Thanks Paddy! Yeah, we debated whether to post here or on YouMoz... You are probably right.
Thanks for reading!
-
Indeed Pieter! Additionally, removem showed us a LOT of links that "needed" to be removed but didn't actually need to be. It's important to know your backlinks if at all possible and to know for yourself which ones are the spammy ones. If we had gone on what removem told us to remove, we would have removed WAY more links than we needed to.
Thanks for the response!
-
Another thing: don't trust a single tool when you have a lot of bad links. removeem.com is only one source where you can find your links.
-
Hopefully I'll never be in the situation you found yourselves in, but a great read and now I know what to expect if I ever do (touch wood).
This might have been better as a YouMoz post than a forum post, btw.