Different Errors Running 2 Crawls on Effectively the Same Setup
-
Our developers are moving away from utilising robots.txt files due to security risks, so we have been in the process of removing them from sites. However, we and our clients still want to run Moz crawl reports, as they can highlight useful information.
The two sites in question sit on the same server with the same settings (in fact, they run on the same Magento install). We do not have robots.txt files present (they 404), and as per Chiaryn's response here https://moz.com/community/q/without-robots-txt-no-crawling this should work fine?
However for www.iconiclights.co.uk we got: 902 : Network errors prevented crawler from contacting server for page.
While for www.valuelights.co.uk we got: 612 : Page banned by error response for robots.txt.
These crawls were both run recently, and there was no robots.txt present. Not to mention, they are on the same setup/server etc as mentioned. Now, we have just tested this, by uploading a blank robots.txt file to see if it changed anything - but we get exactly the same errors.
I have had a look, but can't find anything that really matches this on here - help would really be appreciated!
Thanks!
-
Hey there! Tawny from the Customer Support team here!
This sounds like a juicy issue, and one I'd love to dive in and help you with! Unfortunately, without being able to take a look at your campaigns and account directly, it's tough to provide specific support for these issues.
That said, if you write in to help@moz.com and give us the details of what you're seeing - basically exactly what's in this question - we should be able to help investigate for you.
-
Having no Robots.txt, or a blank one, is perfectly fine (though honestly it's no more a security risk than your Sitemap.xml). But your current issue is that both of your sites are returning 403 status codes to crawlers while people are still able to land on your pages. This has nothing to do with the Robots.txt file being changed or removed; that is just an odd coincidence. It is most likely an issue in your .htaccess file.
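If you want to confirm this yourself, one rough way is to request the same page twice, once with a browser-like user agent and once with a crawler-like one, and compare the status codes. The snippet below is only a sketch: the user-agent strings are illustrative placeholders (the real Moz crawler string may differ), the `http://` scheme is assumed, and the helper names `fetch_status` and `blocked_for_crawler` are made up for this example.

```python
import urllib.error
import urllib.request

# Illustrative user-agent strings only; the real crawler string may differ.
USER_AGENTS = {
    "browser": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "crawler": "rogerbot",
}


def fetch_status(url, user_agent, timeout=10):
    """Return the final HTTP status code for url, or None on a network failure.

    urlopen follows redirects, so this is the status of the final response.
    """
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses are raised as HTTPError; the code is still useful.
        return error.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout and similar.
        return None


def blocked_for_crawler(browser_status, crawler_status):
    """True when the browser request succeeds but the crawler one is refused."""
    return browser_status == 200 and crawler_status in (401, 403)


if __name__ == "__main__":
    # Hosts taken from the question; the scheme and paths are assumptions.
    for host in ("www.iconiclights.co.uk", "www.valuelights.co.uk"):
        for path in ("/", "/robots.txt"):
            url = "http://" + host + path
            statuses = {
                name: fetch_status(url, agent)
                for name, agent in USER_AGENTS.items()
            }
            flagged = blocked_for_crawler(statuses["browser"], statuses["crawler"])
            print(url, statuses, "blocked for crawler" if flagged else "")
```

If the crawler-like request comes back 403 while the browser-like one returns 200, that points at a user-agent rule on the server rather than anything to do with Robots.txt. A firewall or host-level filter could produce the same result, so treat this as a first check and not a full diagnosis.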