Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
What is the full user-agent for rogerbot?
-
IT is blocking AWS via a proxy in front of our server. We've tried allowing the "rogerbot" user-agent, but crawling still isn't working in my Moz Pro account. Is there a more specific user-agent we can allow in our proxy software? Thank you.
-
We are having this same issue. I was hoping someone clarified it for you.
Did you ever get it sorted out?
-
User-agent: *
User-agent: dotbot
Disallow: /
User-agent: rogerbot
Disallow: /
If you want to truly prevent robots from crawling your site, you will need to use either a password restriction or a tool similar to this:
http://www.distilnetworks.com/
As both Google and Moz say, a robots.txt file cannot guarantee that a page that is linked to stays blocked. If you want to do that, you will have to block the referral using a WAF like Distil Networks.
http://moz.com/help/guides/search-overview/crawl-diagnostics
https://moz.com/researchtools/ose/dotbot
&
https://support.google.com/webmasters/answer/6062608?rd=2
Also, blocking link-analysis user agents that are nothing but a drain on your resources is a good idea. It is simple enough to do in .htaccess with something like the rules at the bottom of this post.
Search Engine Blocked by Robots.txt
This page cannot be crawled by search engines due to the robots.txt protocol. If you're seeking to remove this page from search results, we recommend that you use meta robots (with noindex, follow values) instead of robots.txt. This will ensure that the page does not appear in the results but allows link juice to flow through the page's links and count towards the relevance/popularity of other pages on your site.
How to block DotBot from crawling your site
If you don't want dotbot crawling your site, we always respect the standard Robots Exclusion Protocol (aka robots.txt). If you would like to block dotbot, all you need to do is add our user-agent string to your robots.txt file.
If you want to ban dotbot from most areas of your site, it looks a little something like this:
User-agent: dotbot
Disallow: /admin/
Disallow: /scripts/
Disallow: /images/
Below this I have placed rules that somebody else created and says work. I do not know whether they work. I told you that Distil Networks will work, but I cannot guarantee the rules at the very bottom. I think you will not have any trouble if you set up the robots.txt as configured at the top.
If you want to ban dotbot from crawling any part of your site, add this text instead:
User-agent: dotbot
Disallow: /
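One way to check what the two dotbot rule sets above actually permit, before deploying them, is to feed them to a robots.txt parser. The following is a minimal sketch using Python's standard-library urllib.robotparser; the helper name allowed and the test paths are hypothetical, and this only models a crawler that honours robots.txt, it does not prove how any particular crawler behaves.

```python
from urllib.robotparser import RobotFileParser

# Rule sets copied from the post: a partial block and a full block for dotbot.
PARTIAL_BLOCK = """\
User-agent: dotbot
Disallow: /admin/
Disallow: /scripts/
Disallow: /images/
"""

FULL_BLOCK = """\
User-agent: dotbot
Disallow: /
"""


def allowed(robots_txt: str, agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Hypothetical paths, used only to exercise the rules.
    for name, rules in (("partial", PARTIAL_BLOCK), ("full", FULL_BLOCK)):
        for path in ("/admin/login", "/blog/post"):
            print(name, path, allowed(rules, "dotbot", path))
```

Agents that have no group of their own and no "User-agent: *" group, such as rogerbot in these two files, are treated as allowed everywhere.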
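# BEGIN
# Block by user agent with mod_rewrite (returns 403 Forbidden).
# [NC] makes each match case-insensitive. The patterns are unanchored because
# some crawlers send "Mozilla/5.0 (compatible; ...)" before their own name.
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} rogerbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} exabot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} MJ12bot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} dotbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} gigabot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} AhrefsBot [NC]
RewriteRule .* - [F]
# The same idea with SetEnvIfNoCase and Order/Allow/Deny (Apache 2.2 syntax).
SetEnvIfNoCase User-Agent .*rogerbot.* bad_bot
SetEnvIfNoCase User-Agent .*exabot.* bad_bot
SetEnvIfNoCase User-Agent .*mj12bot.* bad_bot
SetEnvIfNoCase User-Agent .*dotbot.* bad_bot
SetEnvIfNoCase User-Agent .*gigabot.* bad_bot
SetEnvIfNoCase User-Agent .*ahrefsbot.* bad_bot
SetEnvIfNoCase User-Agent .*sitebot.* bad_bot
Order Allow,Deny
Allow from all
Deny from env=bad_bot
# END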
Thomas
-
Thank you for your response but this doesn't answer my question. We aren't blocking rogerbot using robots.txt. We need to allow it through the proxy in front of our web server by using the exact user-agent (case sensitive) that is being sent by rogerbot. We've tried "rogerbot" but that isn't working. Based on the 3rd party documentation you linked to there seem to be a variety of possibilities:
rogerbot/1.0
rogerBot/1.0
RogerBot/1.0
rogerBot
RogerBot
It would be great if Moz provided clear documentation on this.
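Until the exact string is confirmed, one workaround is to match the crawler name case-insensitively and without anchoring, so that the capitalisation and any version suffix stop mattering. The sketch below illustrates that matching logic in Python. It assumes the proxy software can be configured with an equivalent case-insensitive substring or regex rule, which is not confirmed here, and the candidate strings are guesses for illustration, not documented Moz values.

```python
import re

# Case-insensitive, unanchored match on the crawler name. A version suffix
# such as "/1.0" or any trailing contact details do not affect the result.
ROGERBOT_PATTERN = re.compile(r"rogerbot", re.IGNORECASE)

# Hypothetical candidates: capitalisation variants with an optional version.
CANDIDATES = [
    "rogerbot/1.0",
    "rogerBot/1.0",
    "RogerBot/1.0",
    "rogerBot",
    "RogerBot",
]


def is_rogerbot(user_agent: str) -> bool:
    """Return True if the User-Agent header mentions rogerbot in any casing."""
    return ROGERBOT_PATTERN.search(user_agent) is not None


if __name__ == "__main__":
    for ua in CANDIDATES:
        print(ua, is_rogerbot(ua))
```

A rule like this only identifies the crawler by the name it sends. If the proxy rejects AWS address ranges before it looks at the User-Agent header, the allow rule would need to be evaluated first.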
-
Hi, I know Moz uses two crawlers: rogerbot, and dotbot, which Open Site Explorer uses.
Make sure there is no forward slash "/" after "Disallow:" (i.e. not "Disallow: /").
Moz used to have an awesome write-up on it, but it just forwards to Moz.com/help now. It could be that they have another great write-up and the URL changed. For now, here's the information:
User-agent: rogerbot
Disallow:
User-agent: dotbot
Disallow:
User Agent Analyser
Mozilla/5.0 (compatible; DotBot/1.1; http://www.opensiteexplorer.org/dotbot, help@moz.com)
http://www.useragentstring.com/Dotbot1.1_id_16014.php
https://udger.com/resources/ua-list/bot-detail?bot=rogerbot
http://www.botopedia.org/user-agent-list/crawlers/item/369-rogerbot-seomoz
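The point about the forward slash can be checked mechanically: an empty "Disallow:" allows everything, while "Disallow: /" blocks everything. Here is a small sketch with Python's standard-library urllib.robotparser that contrasts the two; the helper name can_crawl is hypothetical, and the result only reflects how a standards-following parser reads the file.

```python
from urllib.robotparser import RobotFileParser

# The allow-everything rules from this post: an empty Disallow value.
ALLOW_ALL = """\
User-agent: rogerbot
Disallow:

User-agent: dotbot
Disallow:
"""

# The same file with a stray slash after each Disallow, which blocks the site.
BLOCK_ALL = ALLOW_ALL.replace("Disallow:", "Disallow: /")


def can_crawl(robots_txt: str, agent: str, path: str = "/") -> bool:
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for agent in ("rogerbot", "dotbot"):
        print(agent, "empty Disallow:", can_crawl(ALLOW_ALL, agent))
        print(agent, "Disallow: /   :", can_crawl(BLOCK_ALL, agent))
```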
hope this helps,
Thomas