Why can't RogerBot crawl the site https://unplag.com?
-
Hello
Please help me to solve the problem.
The On-Page Grader and Crawl Test are not working for the Unplag.com website. Both say that they can't access the URL. Yes, I've tried different variants such as unplag.com and http://unplag.com.
One more thing: RogerBot was disallowed in the robots.txt file. I removed that rule from the file a week ago, so maybe the Moz index hasn't been refreshed yet.
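To confirm that a robots.txt change really lifts the block, the rules can be evaluated offline. This is a minimal sketch using Python's standard urllib.robotparser; the two rule sets are hypothetical stand-ins for "before" and "after" and are not the actual contents of https://unplag.com/robots.txt, and is_allowed is just an illustrative helper name.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules, NOT the real contents of https://unplag.com/robots.txt.
OLD_RULES = "User-agent: rogerbot\nDisallow: /\n"
NEW_RULES = "User-agent: *\nDisallow:\n"

if __name__ == "__main__":
    for label, rules in (("old", OLD_RULES), ("new", NEW_RULES)):
        print(label, is_allowed(rules, "rogerbot", "https://unplag.com/"))
```

If the new rules allow the crawler here but the tools still fail, the block is probably somewhere other than robots.txt.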
-
Thank you. I'll try to solve the problem
-
The trouble is not with your robots.txt: the server config blocks rogerbot completely and serves a 400 for each request it makes.
If you have a user agent switcher plugin in your browser and change the user agent to rogerbot (rogerbot/1.1 (http://moz.com/help/guides/search-overview/crawl-diagnostics#more-help, rogerbot-crawler+pr2-crawler-101@moz.com)), the server returns a 400 Bad Request.
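The same check can be done without a browser plugin. This is a rough sketch with Python's standard library; build_request and fetch_status are illustrative helper names, and the user agent string is the one quoted above. Note that urlopen follows redirects, so the status reported is that of the final response.

```python
import urllib.error
import urllib.request

# User agent string as quoted in this thread.
ROGERBOT_UA = (
    "rogerbot/1.1 (http://moz.com/help/guides/search-overview/"
    "crawl-diagnostics#more-help, rogerbot-crawler+pr2-crawler-101@moz.com)"
)


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Build a GET request that identifies itself with the given user agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_status(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code the server sends for this user agent."""
    request = build_request(url, user_agent)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses are raised as exceptions; the code is what we want.
        return err.code


if __name__ == "__main__":
    # "Mozilla/5.0" is only a placeholder for an ordinary browser user agent.
    for name, agent in (("browser-like", "Mozilla/5.0"), ("rogerbot", ROGERBOT_UA)):
        print(name, fetch_status("https://unplag.com/", agent))
```

A different status for the two user agents would point to a server-side rule keyed on the user agent.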
Dirk
-
The logs are like this:
"GET / HTTP/1.0" 400 166 "-" "rogerbot/1.1 (http://moz.com/help/guides/search-overview/crawl-diagnostics#more-help, rogerbot-crawler+pr2-crawler-101@moz.com)" "-" - "https"
And of course rogerbot sometimes tries to fetch the robots file:
"GET /robots.txt HTTP/1.1" 400 166 "-" "rogerbot/1.1 (http://moz.com/help/guides/search-overview/crawl-diagnostics#more-help, rogerbot-crawler+pr2-crawler-101@moz.com)" "-" - "https"
To me it looks as if rogerbot is disallowed in robots.txt, but the file looks like this: https://unplag.com/robots.txt
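Log lines in this shape can be filtered for failing rogerbot requests with a short script. This is a sketch only: the regular expression is fitted to the two lines quoted above, it assumes the number after the status code is the response size, and LOG_PATTERN and rogerbot_errors are illustrative names.

```python
import re

# Matches: "METHOD PATH PROTOCOL" STATUS SIZE "REFERER" "USER AGENT"
# Assumption: the field after the status code is the response size in bytes.
LOG_PATTERN = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def rogerbot_errors(lines):
    """Return (path, status) for every rogerbot request answered with 4xx/5xx."""
    errors = []
    for line in lines:
        match = LOG_PATTERN.search(line)
        if not match:
            continue  # Skip lines that do not fit the assumed format.
        status = int(match.group("status"))
        if "rogerbot" in match.group("user_agent").lower() and status >= 400:
            errors.append((match.group("path"), status))
    return errors


SAMPLE = [
    '"GET / HTTP/1.0" 400 166 "-" "rogerbot/1.1 (http://moz.com/help/guides/'
    'search-overview/crawl-diagnostics#more-help, '
    'rogerbot-crawler+pr2-crawler-101@moz.com)" "-" - "https"',
    '"GET /robots.txt HTTP/1.1" 400 166 "-" "rogerbot/1.1 (http://moz.com/help/'
    'guides/search-overview/crawl-diagnostics#more-help, '
    'rogerbot-crawler+pr2-crawler-101@moz.com)" "-" - "https"',
]

if __name__ == "__main__":
    for path, status in rogerbot_errors(SAMPLE):
        print(status, path)
```

Since even /robots.txt gets a 400, the crawler never gets to read the rules, which fits a block in the server config rather than in robots.txt.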
-
thanks a lot!
-
Follow the advice from Jordan below and try to check your log files to see what the server response is when Rogerbot is trying to visit the site.
I noticed some DNS issues with your site (check http://dnscheck.pingdom.com/?domain=unplag.com): the nameservers don't seem to be OK. I also noticed that you have a 302 redirect from http to https, while this should be a 301. Probably not related to your main issue, but worth checking.
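The redirect type can be verified by looking at the first response only, without following it. This is a minimal sketch with Python's http.client, which does not follow redirects on its own; first_response and classify_redirect are illustrative names.

```python
import http.client

PERMANENT = {301, 308}
TEMPORARY = {302, 303, 307}


def classify_redirect(status: int) -> str:
    """Label an HTTP status code as a permanent or temporary redirect."""
    if status in PERMANENT:
        return "permanent"
    if status in TEMPORARY:
        return "temporary"
    return "not a redirect"


def first_response(host: str, path: str = "/", timeout: float = 10.0):
    """Send one plain-HTTP HEAD request and return (status, Location header)."""
    connection = http.client.HTTPConnection(host, timeout=timeout)
    try:
        connection.request("HEAD", path)
        response = connection.getresponse()
        return response.status, response.getheader("Location")
    finally:
        connection.close()


if __name__ == "__main__":
    status, location = first_response("unplag.com")
    print(status, classify_redirect(status), location)
```

For an http to https move that is meant to stay, the expected result is a permanent redirect with the https URL in Location.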
-
Thanks.
The last crawl was after the robots.txt change.
And I don't see any errors in the dashboard.
-
After creating a fresh test campaign for the site, I'm still seeing a 400 response being served to rogerbot from https://unplag.com/. While I'm not able to pinpoint the exact setting that is causing the site to serve that response, I'd recommend checking your server logs to verify the response that is being served.
-
It's possible that your site hasn't been crawled yet (since you changed the robots.txt). You can see in your campaign dashboard (upper right corner) when the next crawl is scheduled.
Do you see any specific error codes on your dashboard?
Dirk