Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
520 Error from crawl report with Cloudflare
-
I am getting a lot of 520 Server Errors in my crawl reports. I see this is related to Cloudflare. Since we know 520 is a Cloudflare code, maybe the Moz team can change the label from "unknown" to "Cloudflare 520". Perhaps the Moz team could also update the "how to fix" section in the reporting with suggestions on how to avoid seeing these in the report, or on how to tell whether there is a real issue that needs to be addressed. At this point I don't know.
There must be a solution that Moz can provide, like a setting in Cloudflare that will permit Rogerbot, in case Cloudflare is blocking it because it does not like its behavior or something.
It could be that Rogerbot is crawling my site on a bad day or at a time when we were deploying a massive site change. If I know when my site will be down can I pause Rogerbot?
I found this https://developers.cloudflare.com/support/troubleshooting/general-troubleshooting/troubleshooting-crawl-errors/
-
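A 520 error is a Cloudflare-specific HTTP status code. It indicates that the origin server returned an empty, unknown, or unexpected response to Cloudflare. This can happen for a variety of reasons, including:
Server crashes or restarts: The origin web server or application might be crashing, restarting, or undergoing maintenance while requests are in flight. An origin that is fully down usually shows up as a 521, 522, or 523 instead.
Firewall restrictions: The origin server might have a firewall or security plugin that is blocking or resetting requests from Cloudflare's IP addresses.
DNS issues: A DNS misconfiguration might point Cloudflare at the wrong origin IP address. An outright failure to resolve the origin normally shows up as error 1016 rather than 520.
SSL issues: There might be an issue with the SSL certificate on the origin server. These normally show up as a 525 or 526, but are still worth ruling out.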
To troubleshoot the issue, you can try the following:
Check if the origin server is up and running.
Check if the origin server has a firewall that is blocking requests from Cloudflare.
Check if the DNS is configured correctly.
Check if the SSL certificate is valid and configured correctly.
If none of these steps resolve the issue, you can reach out to Cloudflare support for further assistance.
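To help narrow down which of the checks above applies, here is a minimal Python sketch that maps Cloudflare's 52x status codes to their usual meaning. The descriptions paraphrase Cloudflare's public documentation; the names `CLOUDFLARE_5XX` and `explain_cloudflare_status` are made up for this illustration.

```python
# Short descriptions of Cloudflare's 52x errors, paraphrased from
# Cloudflare's public documentation. The names here are hypothetical.
CLOUDFLARE_5XX = {
    520: "Web server returned an unknown error (empty or unexpected response)",
    521: "Web server is down (origin refused the connection)",
    522: "Connection timed out",
    523: "Origin is unreachable",
    524: "A timeout occurred (connected, but no timely HTTP response)",
    525: "SSL handshake failed",
    526: "Invalid SSL certificate",
}


def explain_cloudflare_status(status: int) -> str:
    """Return a short description for a Cloudflare 52x status code."""
    return CLOUDFLARE_5XX.get(status, "Not a Cloudflare-specific 52x code")


if __name__ == "__main__":
    for code in (520, 522, 526):
        print(code, "-", explain_cloudflare_status(code))
```

If the crawl report only ever shows 520 and never 521 to 526, that points toward the origin sending back something Cloudflare cannot interpret, rather than a plain outage or certificate problem.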
-
@awilliams_kingston To answer your question, there is no option to pause Rogerbot manually. However, Rogerbot only crawls a website when a Site Crawl campaign is active and scheduled to run. If you want to pause Rogerbot, you can stop the active campaign or schedule the next crawl to start at a later time.
To schedule a Site Crawl, go to your Moz Pro account, click on "Site Crawl" in the left-hand navigation menu, and select "Add Campaign" to set up a new campaign or select an existing one. From there, you can customize your crawl settings, including the crawl frequency and start time.
If you have a scheduled maintenance window and want to prevent Rogerbot from crawling your site during that time, you can adjust the crawl frequency to avoid overlapping with your maintenance schedule. You can also use a robots.txt file to block the crawler from accessing specific pages or sections of your site.
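Before relying on a robots.txt block, it can help to test the rules locally. The sketch below uses Python's standard `urllib.robotparser` to check whether a given URL would be off limits to a crawler that identifies as `rogerbot`. The `/staging/` path, the example.com URLs, and the helper name `rogerbot_allowed` are assumptions for illustration only; substitute your own rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep rogerbot out of /staging/ and leave
# every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: rogerbot
Disallow: /staging/

User-agent: *
Disallow:
"""


def rogerbot_allowed(robots_txt: str, url: str) -> bool:
    """Return True if these robots.txt rules let rogerbot fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("rogerbot", url)


if __name__ == "__main__":
    for url in ("https://example.com/staging/new-home",
                "https://example.com/blog/"):
        print(url, "->", rogerbot_allowed(ROBOTS_TXT, url))
```

This only checks how the rules parse. It does not confirm when the crawler will next re-read the file, so a block added minutes before a maintenance window may not take effect in time.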
-
@awilliams_kingston The 520 server error you're seeing in your Moz crawl reports is related to Cloudflare. It's a generic error, which means it could be caused by a variety of issues, including server overload or misconfigured settings.
To address this, you could check your Cloudflare firewall settings and see if there are any rules that are blocking the Moz Rogerbot crawler. If there are, try adding an exception for the Rogerbot user agent to allow it to crawl your site without being blocked.
If you know your site will be down for maintenance or undergoing significant changes, you could pause the Moz crawler during that time to prevent it from generating false 520 errors in your reports.
Finally, you could check out the troubleshooting guide in the Cloudflare documentation for more information on identifying and addressing crawl errors. Remember to work with both Moz and Cloudflare support teams to find a solution that works for your specific setup.
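One rough way to see whether a user-agent rule is involved is to request the same URL twice, once with a browser-like User-Agent and once with a Rogerbot-like one, and compare the status codes. This is only a sketch: the `ROGERBOT_UA` string is a simplified stand-in rather than the crawler's exact header, and the function names are made up for this example. Cloudflare can also act on IP address or bot score, so a clean result from your own machine does not prove the real crawler gets through.

```python
import urllib.error
import urllib.request

BROWSER_UA = "Mozilla/5.0"
ROGERBOT_UA = "rogerbot"  # assumption: simplified stand-in, not the exact header


def fetch_status(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Return the HTTP status for url when sent with the given User-Agent.

    Connection-level failures (urllib.error.URLError) are not caught here.
    """
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses arrive as exceptions; the code is what matters.
        return err.code


def looks_user_agent_blocked(url: str, fetch=fetch_status) -> bool:
    """True if the browser-like request succeeds but the bot-like one fails."""
    browser_status = fetch(url, BROWSER_UA)
    bot_status = fetch(url, ROGERBOT_UA)
    return browser_status < 400 <= bot_status
```

Calling `looks_user_agent_blocked("https://www.example.com/")` against your own site performs two live requests. The `fetch` parameter exists so the comparison logic can be exercised without touching the network.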
-
@Kateparish Thank you.
How do you pause Rogerbot? I can't find anything on that in my admin panel, but maybe that is because there is no crawl happening at the moment and my next crawl is scheduled to happen in a few days. Also, is there a way to schedule a pause if a crawl is happening? If I know I have site maintenance on a certain day of the week at a specific time, for example, can I have Rogerbot take a break?
-
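A 520 error typically indicates a problem in the exchange between Cloudflare and the origin server. This error occurs when the server returns an empty or invalid response to Cloudflare, or when the connection is reset before a valid response arrives. An origin that simply takes too long to respond is normally reported as a 522 or 524 instead.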
To troubleshoot a 520 error from a crawl report with Cloudflare, you can take the following steps:
Check the server logs: The first step in troubleshooting a 520 error is to check the server logs for any error messages. Look for any errors related to the server's network or connectivity, such as DNS resolution issues, network timeouts, or firewall restrictions.
Check Cloudflare logs: Cloudflare logs can provide additional insights into the cause of the error. Check the Cloudflare logs for any error messages or connection issues between Cloudflare and the origin server.
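Temporarily disable Cloudflare: Temporarily pausing Cloudflare can help you determine whether the error involves Cloudflare or comes from the origin server alone. If the error disappears when Cloudflare is bypassed, the problem is likely in a Cloudflare setting or in how the origin handles requests arriving through Cloudflare, for example a firewall that blocks Cloudflare's IP addresses. If the error persists, the origin server itself is at fault.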
Contact Cloudflare support: If you are unable to resolve the issue on your own, you can contact Cloudflare support for assistance. Provide them with the server logs and Cloudflare logs, as well as any other relevant information, to help them diagnose the issue.
By following these steps, you should be able to identify and resolve the 520 error from the crawl report with Cloudflare.
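For the server-log step, a small script can tally what the origin actually returned to the crawler. The sketch below assumes access logs in the common "combined" format used by Apache and nginx and matches on the token `rogerbot` in the User-Agent field; the sample lines and the function name `statuses_for_agent` are invented for illustration. The 520 itself is generated by Cloudflare, so it will not usually appear in origin logs. The things to look for are origin errors, or missing entries, at the times the crawl report shows 520s.

```python
import re
from collections import Counter

# Matches the request, status, size, and (optionally) referer and user agent
# of a combined-format access log line.
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def statuses_for_agent(lines, agent_token="rogerbot"):
    """Count origin status codes for requests whose User-Agent contains agent_token."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that are not in the expected format
        agent = match.group("agent") or ""
        if agent_token.lower() in agent.lower():
            counts[int(match.group("status"))] += 1
    return counts


if __name__ == "__main__":
    # Invented sample lines; replace with lines read from your own log file.
    sample = [
        '203.0.113.7 - - [10/Mar/2023:04:12:01 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" "rogerbot/1.0"',
        '203.0.113.7 - - [10/Mar/2023:04:12:02 +0000] "GET /blog HTTP/1.1" 500 0 "-" "rogerbot/1.0"',
        '198.51.100.4 - - [10/Mar/2023:04:12:03 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
    ]
    print(statuses_for_agent(sample))
```

If the crawler's requests are absent from the origin log at the times the report shows errors, that suggests they were stopped or dropped before the web server logged them, which fits a firewall or connection-reset cause.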