The "webmaster" disallowed all ROBOTS to fight spam! Help!!
-
One of the companies I do work for has a Magento site. I am simply the SEO guy, and they run the website through some developers who hold access to their systems VERY tightly. Using Google Webmaster Tools, I saw that the robots.txt file was blocking ALL robots.
I immediately e-mailed them and received a long reply about foreign robots and scrapers slowing down the website. They told me I would have to provide a list of only the good robots to allow in robots.txt.
Please correct me if I'm wrong, but isn't obeying robots.txt optional? Won't a bad scraper or bot still bog down the site? Shouldn't that be handled in .htaccess or something different?
I'm not new to SEO but I'm sure some of you who have been around longer have run into something like this and could provide some suggestions or resources I could use to plead my case!
If I'm wrong, please help me understand how we can meet both needs: allowing bots to visit the site while keeping the 'bad' ones out. Their claim is that the site is bombarded by tons and tons of bots that have slowed down performance.
Thanks in advance for your help!
-
Thanks for the suggestions!! I'll keep you updated.
-
You can get a list of good robots from Robotstxt.org: http://www.robotstxt.org/db.html.
I'd recommend creating an edited version of the robots.txt file yourself, specifically allowing Googlebot and others. Then send that with a link to the robotstxt.org site.
You may need to get the business owners involved. IT exists to enable the business, not strap it down so it can't move.
-
What you could do is just add Allow statements for the different Googlebots and the bots of other search engines. This will probably make the developers happy, since they can keep other bots out of the door. (I doubt this would actually work, though, and I definitely don't think robots.txt should be the way to keep spammers away, but that says more about the quality of development ;-).)
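If it helps when drafting those Allow statements, here is a minimal sketch of what such a file might look like, checked with Python's standard-library `urllib.robotparser`. The bot names and the `/catalog/product.html` path are assumptions for illustration only; swap in whichever crawlers you actually want to let through.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: allow two named search engine bots, disallow the rest.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: Bingbot
Allow: /

User-agent: *
Disallow: /
"""

# Parse the rules from a string instead of fetching them over the network.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Illustrative path; any URL path on the site is checked the same way.
TEST_PATH = "/catalog/product.html"

for bot in ("Googlebot", "Bingbot", "SomeScraper"):
    verdict = "allowed" if parser.can_fetch(bot, TEST_PATH) else "disallowed"
    print(f"{bot}: {verdict}")
```

Keep in mind this only shows what a bot that honors robots.txt would conclude. A scraper that ignores the file is not affected by it at all.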
-
Yes, there are a ton of bad bots one may want to block. Can you show us the robots.txt file? If they aren't blocking legit search engine bots, you're probably okayish. If they are actually blocking all bots, you have cause for concern.
Can you give us a screenshot from GWT?
I use a program called Screaming Frog daily. It's an off-the-shelf tool and not malicious; I just use it to crawl and gather metadata. But I can tell it to disregard robots.txt, and it will crawl a site until it hits something password protected. There's not much any robots.txt can do about that, as it can also spoof user agents.
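Since user agents can be spoofed, anything the developers do server-side should check more than the user agent string. A rough sketch of one approach, based on the reverse-then-forward DNS check that Google describes for verifying Googlebot: the function name, the injectable lookup parameters, and the suffix list here are my own assumptions, not an official implementation.

```python
import socket

# Hostname suffixes assumed here for Google's crawlers; check Google's
# current documentation before relying on this list.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if `ip` reverse-resolves to a Google hostname that
    forward-resolves back to the same IP address."""
    # The lookups are injectable so the logic can be exercised offline.
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]
    try:
        host = reverse_lookup(ip)
        # Step 1: the PTR hostname must belong to a Google domain.
        if not host.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
            return False
        # Step 2: the hostname must resolve back to the original IP,
        # otherwise the PTR record itself could be forged.
        return ip in forward_lookup(host)
    except OSError:
        # Covers socket.herror and socket.gaierror (failed lookups).
        return False
```

With the default lookups this makes live DNS queries, so the result would need caching in practice. A request that claims to be Googlebot but fails this check could then be throttled or blocked at the server instead of relying on robots.txt.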