RogerBot does not respect some rules??
-
Hello;
Every week when I check my stats I notice that RogerBot has crawled 10,000 pages from my website, even pages that have a noindex tag or are disallowed in robots.txt.
Is it possible to stop him from crawling these pages? They are form pages on my site which are not indexed by Google; they have a noindex tag and they are disallowed for crawling in robots.txt.
Thanks everyone for your help!!!
-
If Roger is still not listening to you, send an email to help@seomoz.org and open a ticket with the help desk. They'll try to figure out why he's misbehaving and how to get him to listen to you again.
-
Hi Jorge,
Yes, this is possible. Rogerbot is also the user agent name of the crawler, so within your robots.txt you can let Roger know which pages you don't want him to crawl. More information about this can be found on this page about Roger himself.
Hopefully this answers your question.
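For illustration, here is a minimal Python sketch of how a rule aimed at Roger could be checked before relying on it. The /forms/ path and the example.com URLs are assumptions and are not taken from the question. The check uses the standard library's urllib.robotparser, which approximates how a well-behaved crawler reads robots.txt but may not match Rogerbot's own parser exactly. Keep in mind that a noindex tag only affects indexing; it is the Disallow rule that asks a crawler not to fetch a page.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: the /forms/ path is an assumption
# used only for illustration.
ROBOTS_TXT = """\
User-agent: rogerbot
Disallow: /forms/

User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com stands in for the real site.
    for path in ("/forms/contact", "/products/"):
        url = "https://www.example.com" + path
        print(path, is_allowed(ROBOTS_TXT, "rogerbot", url))
```

If the form pages come back as allowed for rogerbot, the Disallow line is probably sitting in a group that does not apply to him, or the path does not match the start of the URLs being crawled.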
Related Questions
-
Rogerbot crawls my site and causes error as it uses urls that don't exist
Whenever the rogerbot comes back to my site for a crawl it seems to want to crawl urls that dont exist and thus causes errors to be reported... Example:- The correct url is as follows: /vw-baywindow/cab_door_slide_door_tailgate_engine_lid_parts/cab_door_seals/genuine_vw_brazil_cab_door_rubber_68-79_10330/ But it seems to want to crawl the following: /vw-baywindow/cab_door_slide_door_tailgate_engine_lid_parts/cab_door_seals/genuine_vw_brazil_cab_door_rubber_68-79_10330/?id=10330 This format doesn't exist anywhere and never has so I have no idea where its getting this url format from The user agent details I get are as follows: IP ADDRESS: 107.22.107.114 USER AGENT: rogerbot/1.0 (http://moz.com/help/pro/what-is-rogerbot-, rogerbot-crawler+pr1-crawler-17@moz.com)
Moz Pro | | spiralsites0 -
RegEx as rule condition in brand rules possible?
Hi everybody, can i use a RegEx expression as a rule condition to tag branded/non-branded keywords? For example.. I want to tag all keywords beginning with "This" to be tagged as branded. ThisHouse ThisPhone ThisPC Seems like the brand rule is a exact match but that doesn't work for this kind of keywords, right? Thank you Regards
Moz Pro | | ValerieSchmidt0 -
.htaccess 301 redirect rules regarding pagination and stripped category base (wp)
I am an admin of a wordpress.org blog and I used to use "Yoast All in one SEO" plugin. While I was using this plugin it stripped the category base from my blog post URLs. With yoast all in one seo: Site.com/topic/subtopic/page/# Without yoast all in one seo: Site.com/category/topic/subtopic/page/# Now that I have switched to another plugin, I am trying to manage the page crawl errors, which are tremendous, somewhere around 1800, mostly due to pagination. Rather than redirecting each URL individually I would like to develop .htaccess 301 redirect rules. However, all instructions on how to create these .htaccess 301 redirect rules are regarding the suffix rather than the category base. So my question is, can .htaccess 301 redirect rules work to fix this problem? Including pagination? And if so, what would this particular .htaccess 301 redirect look like? Especially regarding pagination? And do I really have to write a 301 redirect for each pagination page?
Moz Pro | | notgwenevere0 -
You've recently updated your brand rules. We're fetching your new data, and we should have it ready for you within the hour.
Why do i always see this message when entering a certain campaign? "You've recently updated your brand rules. We're fetching your new data, and we should have it ready for you within the hour." I didnt change a thing since i started this campaign two-three weeks ago ...
Moz Pro | | alsvik0 -
Rogerbot not showing in logs
Hi All Rogerbot has recently thrown up 403 errors for all our pages - no changes had been made to the site so I asked our ISP for assistance. They wanted to have a look at what rogerbot was doing and so went to the logs but rogerbot was not listed anywhere in the logs by name - any ideas why? Regards Craig
Moz Pro | | CraigWiltshire0 -
What is the full User Agent of Rogerbot?
What's the exact string that Rogerbot send out as his UserAgent within the HTTP Request? Does it ever differ?
Moz Pro | | rightmove0 -
Is there a whitelist of the RogerBot IP Addresses?
I'm all for letting Roger crawl my site, but it's not uncommon for malicious spiders to spoof the User-Agent string. Having a whitelist of Roger's IP addresses would be immensely useful!
Moz Pro | | EricCholis1