Blocking all robots except rogerbot
-
I'm working on a site that is still under development and want to run the SEOmoz crawl test before we launch it publicly. Unfortunately, rogerbot won't crawl the site. I've set my robots.txt to disallow all bots except rogerbot.
It currently looks like this:
User-agent: *
Disallow: /

User-agent: rogerbot
Disallow:
All pages within the site are meta tagged index,follow.
Crawl report says:
Search Engine blocked by robots.txt: Yes
Am I missing something here?
-
...actually I take that back. Still reporting as blocked by robots.txt.
Going to email the team.
-
Thanks, it appears to be crawling without issue now.
-
And if that still doesn't work, email help@seomoz.org and they'll help you figure out the right way to let Roger in while excluding everyone else.
-
You've got it upside down.
Roger sees the first * and then goes "okay :(" and goes away.
Simply change it to:
User-agent: rogerbot
Disallow:

User-agent: *
Disallow: /
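If you want to sanity-check a file like this before asking for a recrawl, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The helper name `crawl_permissions` and the `ROBOTS_TXT` constant are made up for illustration, and this is not how rogerbot itself reads the file. In particular, the standard-library parser picks a group by user-agent name rather than by position, so it cannot reproduce the ordering problem described above; it only confirms what the rules are meant to allow.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt suggested above: rogerbot first, everyone else blocked.
ROBOTS_TXT = """\
User-agent: rogerbot
Disallow:

User-agent: *
Disallow: /
"""


def crawl_permissions(robots_txt, agents, url="/"):
    """Return {agent: True/False} for whether each agent may fetch `url`.

    Hypothetical helper: parses the robots.txt text locally, so nothing
    is fetched over the network.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    results = crawl_permissions(ROBOTS_TXT, ["rogerbot", "Googlebot"])
    for agent, allowed in results.items():
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

With the file above, this should report rogerbot as allowed and Googlebot as blocked. If the crawl report still says the site is blocked after that, the problem is more likely on the crawler's side, which is when emailing the help team makes sense.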