Moz "Crawl Diagnostics" doesn't respect robots.txt
-
Hello, I've just had a new website crawled by the Moz bot. It's come back with thousands of errors saying things like:
- Duplicate content
- Overly dynamic URLs
- Duplicate Page Titles
The duplicate content & URLs it's found are all blocked in the robots.txt so why am I seeing these errors?
Here's an example of some of the robots.txt that blocks things like dynamic URLs and directories (which Moz bot ignored):
Disallow: /?mode=
Disallow: /?limit=
Disallow: /?dir=
Disallow: /?p=*&
Disallow: /?SID=
Disallow: /reviews/
Disallow: /home/
Many thanks for any info on this issue.
-
Hi Si, has this issue been resolved?
-
Hey Si,
Thanks for writing in. We don't appear to have an overarching issue with our crawler ignoring robots.txt files, so I did some research in Google Webmaster Tools. It looks like most crawlers require an asterisk in the disallow directive to recognize that all pages of a dynamic URL are being disallowed. The "Pattern Matching" section of this resource should give you more information about setting up the robots.txt with the correct disallow directives to block those pages: http://support.google.com/webmasters/bin/answer.py?hl=en&answer=156449
If you add the asterisk to the disallow directive and are still seeing these pages crawled, it would help if you emailed your campaign information to our support desk at help@moz.com so our engineers can look into this more directly.
I hope this helps.
Chiaryn
-
If those pages carry an "index,(no)follow" meta tag, I think they will still be crawled even though you have them blocked in robots.txt. Adding "noindex" to those pages might make it work as you want it to.
-
Is the / actually in the URL at that spot? Or is your link like http://www.example.com/abcd?p=147?
If you give an example full URL that includes one of your blocked dynamic URLs, we can take a better look. If your robots.txt is set up correctly, it shouldn't find that stuff, but give us more info if you're able.
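To make the point about the position of the / and the asterisk concrete, here is a minimal Python sketch of how a rule is commonly matched against a URL path: as a prefix match from the start of the path, with `*` standing for any run of characters and a trailing `$` anchoring the end, which is the behaviour Google documents. It is not Moz's actual crawler code. The helper name `rule_matches`, the `/*?mode=` and `/*?p=` rules, and the sample paths other than `/abcd?p=147` are assumptions made up for illustration.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path plus query.

    Hypothetical helper: prefix matching with '*' as a wildcard and an
    optional trailing '$' as an end anchor.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape every literal piece, then join the pieces with '.*' for each '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match only matches at the start of the string, giving prefix semantics.
    return re.match(regex, path) is not None


# A rule from the question only covers query strings on the site root.
print(rule_matches("/?mode=", "/?mode=list"))       # True
print(rule_matches("/?mode=", "/abcd?mode=list"))   # False

# With a wildcard before the '?', the same parameter is covered on any path.
print(rule_matches("/*?mode=", "/abcd?mode=list"))  # True
print(rule_matches("/*?p=", "/abcd?p=147"))         # True
```

If the blocked URLs really look like http://www.example.com/abcd?p=147, a rule written as `/?p=` would never apply to them under this matching model, which would explain why the crawler still reports those pages.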