Why can't RogerBot crawl https://unplag.com?
-
Hello
Please help me solve this problem.
The On-Page Grader and Crawl Test are not working for the Unplag.com website. Both say that they can't access the URL. I've tried different variants, such as unplag.com and http://unplag.com.
One more thing: RogerBot was disallowed in the robots.txt file. I removed that rule a week ago, so maybe the Moz index hasn't been refreshed yet.
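One way to confirm that an edited robots.txt no longer blocks a crawler is to run its text through Python's standard `urllib.robotparser`. This is only a rough sketch: the two rule sets below are hypothetical examples, not the actual contents of the unplag.com file, and `allows` is a helper name made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def allows(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical file contents, NOT the real unplag.com robots.txt.
OLD_RULES = "User-agent: rogerbot\nDisallow: /\n"   # rogerbot blocked everywhere
NEW_RULES = "User-agent: *\nDisallow:\n"            # nothing disallowed

print(allows(OLD_RULES, "rogerbot", "https://unplag.com/"))
print(allows(NEW_RULES, "rogerbot", "https://unplag.com/"))
```

A check like this only covers robots.txt rules. It says nothing about blocks applied elsewhere, such as in the server configuration.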
-
Thank you. I'll try to solve the problem.
-
The trouble is not with your robots.txt. Your server config blocks rogerbot completely and serves a 400 for every request it makes.
If you have a user agent switcher plugin in your browser and change the user agent to rogerbot's, "rogerbot/1.1 (http://moz.com/help/guides/search-overview/crawl-diagnostics#more-help, rogerbot-crawler+pr2-crawler-101@moz.com)", the server returns a 400 Bad Request.
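The same test can be done without a browser plugin. The sketch below uses Python's standard `urllib` to send a request with the rogerbot user agent string quoted above and report the status code; `build_request` and `status_for` are names made up for this example.

```python
import urllib.error
import urllib.request

# User agent string as quoted in this thread.
ROGERBOT_UA = (
    "rogerbot/1.1 (http://moz.com/help/guides/search-overview/"
    "crawl-diagnostics#more-help, "
    "rogerbot-crawler+pr2-crawler-101@moz.com)"
)


def build_request(url, user_agent=ROGERBOT_UA):
    """Create a request that identifies itself with the given user agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def status_for(url, user_agent=ROGERBOT_UA, timeout=10):
    """Return the HTTP status the server sends for this user agent."""
    try:
        with urllib.request.urlopen(build_request(url, user_agent),
                                    timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses arrive as exceptions; the code is what we want.
        return err.code

# Example (needs network access):
#   status_for("https://unplag.com/")                  # as rogerbot
#   status_for("https://unplag.com/", "Mozilla/5.0")   # as a generic browser
```

If the two calls return different status codes, the server is treating the crawler's user agent differently from a browser's.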
Dirk
-
The logs are like this:
"GET / HTTP/1.0" 400 166 "-" "rogerbot/1.1 (http://moz.com/help/guides/search-overview/crawl-diagnostics#more-help, rogerbot-crawler+pr2-crawler-101@moz.com)" "-" - "https"
And of course rogerbot sometimes requests the robots file:
"GET /robots.txt HTTP/1.1" 400 166 "-" "rogerbot/1.1 (http://moz.com/help/guides/search-overview/crawl-diagnostics#more-help, rogerbot-crawler+pr2-crawler-101@moz.com)" "-" - "https"
To me it looks as if rogerbot is disallowed in robots.txt, but the file looks like this: https://unplag.com/robots.txt
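For a larger log file, entries like the two above can be tallied per path and status. This is a minimal sketch that assumes every line contains the quoted request, status, size, referrer, and user agent in the order shown above; the regular expression and the `statuses_for_agent` helper are illustrative and may need adjusting for a different log format.

```python
import re
from collections import Counter

# Matches the portion of the log line shown in this thread:
# "METHOD PATH HTTP/x.y" STATUS SIZE "REFERRER" "USER AGENT"
LOG_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def statuses_for_agent(lines, needle="rogerbot"):
    """Count (path, status) pairs for requests whose user agent contains needle."""
    counts = Counter()
    for line in lines:
        match = LOG_RE.search(line)
        if match and needle.lower() in match.group("agent").lower():
            counts[(match.group("path"), int(match.group("status")))] += 1
    return counts
```

Calling `statuses_for_agent(open("access.log"))` (the file name is an assumption) would show at a glance whether every rogerbot request gets a 400 or only some paths do.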
-
Thanks a lot!
-
Follow the advice from Jordan below and check your log files to see what the server response is when Rogerbot tries to visit the site.
I noticed some DNS issues with your site (check http://dnscheck.pingdom.com/?domain=unplag.com): the nameservers don't seem to be OK. I also noticed that you have a 302 redirect from http to https, which should be a 301. This is probably not related to your main issue, but it is worth checking.
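To see which kind of redirect the http version returns, the first response can be inspected without following it. The sketch below uses Python's standard `http.client`; `first_hop` and `describe_redirect` are hypothetical helper names, and the status grouping follows the usual HTTP meaning of 301/308 as permanent and 302/303/307 as temporary.

```python
import http.client
from urllib.parse import urlsplit


def first_hop(url, timeout=10):
    """Return (status, Location header) for url without following redirects."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        conn.request("HEAD", parts.path or "/")
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def describe_redirect(status, location):
    """Summarize whether a response is a permanent or temporary redirect."""
    if status in (301, 308):
        return f"permanent redirect to {location}"
    if status in (302, 303, 307):
        return f"temporary redirect to {location}"
    return f"no redirect (status {status})"

# Example (needs network access):
#   print(describe_redirect(*first_hop("http://unplag.com/")))
```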
-
Thanks.
The last crawl was after the robots.txt change.
And I don't see any errors in the dashboard.
-
After creating a fresh test campaign for the site, I'm still seeing a 400 response being served to rogerbot from https://unplag.com/. While I'm not able to pinpoint the exact setting that is causing the site to serve that response, I'd recommend checking your server logs to verify the response that is being served.
-
It's possible that your site hasn't been crawled yet (since you changed the robots.txt). You can see in your campaign dashboard (upper right corner) when the next crawl is scheduled.
Do you see any specific error codes on your dashboard?
Dirk