The "webmaster" disallowed all ROBOTS to fight spam! Help!!
-
One of the companies I do work for has a magento site. I am simply the SEO guy and they work the website through some developers who hold access to their systems VERY tightly. Using Google Webmaster Tools I saw that the robots.txt file was blocking ALL robots.
I immediately e-mailed out and received a long reply about foreign robots and scrappers slowing down the website. They told me I would have to provide a list of only the good robots to allow in robots.txt.
Please correct me if I'm wrong.. but isn't Robots.txt optional?? Won't a bad scrapper or bot still bog down the site? Shouldn't that be handled in httaccess or something different?
I'm not new to SEO but I'm sure some of you who have been around longer have run into something like this and could provide some suggestions or resources I could use to plead my case!
If I'm wrong.. please help me understand how we can meet both needs of allowing bots to visit the site but prevent the 'bad' ones. Their claim is the site is bombarded by tons and tons of bots that have slowed down performance.
Thanks in advance for your help!
-
Thanks for the suggestions!! I'll keep you updated.
-
You can get the list of good robots from the list at Robotstxt.org: http://www.robotstxt.org/db.html.
I'd recommend creating an edited version of the robots.txt file yourself, specifically Allowing googlebot and others. Then send that with a link to the robotstxt.org site.
You may need to get the business owners involved. IT exists to enable the business, not strap it down so it can't move.
-
What you could do is just add Allow statements for the different Googlebots and the bots of other search engines. This will probably make the developers happy so they can keep other bots out of the door (although I doubt this would work and definitely don't think that this should be the option to keep spammers away, but that says more about the quality of development ;-)).
-
Yes, there are a ton of bad bots one may want to block. Can you show us the robots.txt file? If they aren't blocking legit search engine bots, you're probably okayish. If they are actually blocking all bots, you have cause for concern.
Can you give us a screenshot from GWT?
I use a program called Screaming Frog daily. It's not malicious, off the shelf. I just want to crawl and gather meta data. I can tell it to disregard robots.txt. It will crawl a site until it hit's something password protected. There's not much any robots.txt can do about it, as it can also spoof user agents.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Pages blocked by robots
**yazılım sürecinde yapılan bir yanlışlıktı.** Sorunu hızlı bir şekilde nasıl çözebilirim? bana yardım et. ```[XTRjH](https://imgur.com/a/XTRjH)
Intermediate & Advanced SEO | | mihoreis0 -
Ranking drop for "Mobile" devices category in Google webmaster tools
Hi, Our rank dropped and we noticed it's a major drop in "Mobile" devices category, which is contributing to the overall drop. What exactly drops mobile rankings? We do not have any messages in search console. We have made few redirects and removed footer links. How these affect? Thanks,
Intermediate & Advanced SEO | | vtmoz
Satish0 -
"No Index, No Follow" or No Index, Follow" for URLs with Thin Content?
Greetings MOZ community: If I have a site with about 200 thin content pages that I want Google to remove from their index, should I set them to "No Index, No Follow" or to "No Index, Follow"? My SEO firm has advised me to set them to "No Index, Follow" but on a recent MOZ help forum post someone suggested "No Index, No Follow". The MOZ poster said that telling Google the content was should not be indexed but the links should be followed was inconstant and could get me into trouble. This make a lot of sense. What is proper form? As background, I think I have recently been hit with a Panda 4.0 penalty for thin content. I have several hundred URLs with less than 50 words and want them de-indexed. My site is a commercial real estate site and the listings apparently have too little content. Thanks, Alan
Intermediate & Advanced SEO | | Kingalan10 -
The same "About" page on multiple WordPress microsites
Hello there, I have over 10 WordPress websites that all have the same "About" page because the same company. I have concerns that this will adversely affect my sites and i'm looking for the best way to deal with this. I was either going to remove the "About" page with Google Webmaster Tools and robots.txt or use the canonical meta tag on that page. Any thoughts?
Intermediate & Advanced SEO | | SpaMedica0 -
**and spam**
Hi, The web-designer would like to make us use this CSS-class: or **, in order to keep all over the same size of the titles on a page. Is it a problem for the search engine (like spam)? I haven’t ever use this CSS-class and that’s why I can’t see if it is good or bad. Has someone already used it? Thank you in advance!**
Intermediate & Advanced SEO | | ISA-AU0 -
"nocontent" class use for Google Custom Search: SEO Ramifications?
Hi all, Have a client that uses Google Custom Search tool which is crawling, indexing and returning millions of irrelevant results for keywords that are on every page of the site. IT/Web dev. team is considering adding a class attribute to prohibit Google Custom Search from indexing bolierplate content regions. Here's the link to Google's custom search help page: http://support.google.com/customsearch/bin/answer.py?hl=en&answer=2364585 "...If your pages have regions containing boilerplate content that's not relevant to the main content of the page, you can identify it using the nocontent class attribute. When Google Custom Search sees this tag, we'll ignore any keywords it contains and won't take them into account when calculating ranking for your Custom Search engine. (We'll still follow and crawl any links contained in the text marked nocontent.) To use the nocontent class attribute, include the boilerplate content in a tag (for example, span or div) like this: Google Custom Search also notes:"Using nocontent won't impact your site's performance in Google Web Search, or our crawling of your site, in any way. We'll continue to follow any links in tagged content; we just won't use keywords to calculate ranking for your Custom Search engine."Just want to confirm if anyone can forsee any SEO implications the use of this div could create? Anyone have experience with this?Thank you!
Intermediate & Advanced SEO | | MRM-McCANN0 -
Why is noindex more effective than robots.txt?
In this post, http://www.seomoz.org/blog/restricting-robot-access-for-improved-seo, it mentions that the noindex tag is more effective than using robots.txt for keeping URLs out of the index. Why is this?
Intermediate & Advanced SEO | | nicole.healthline0