"Issue: Duplicate Page Content" in Crawl Diagnostics - but these pages are noindex
-
I saw a question about this from back in 2011, and I'm experiencing the same issue: http://moz.com/community/q/issue-duplicate-page-content-in-crawl-diagnostics-but-these-pages-are-noindex
We have pages that are meta-tagged as no-everything for bots but are being reported as duplicate. Any suggestions on how to exclude them from the Moz bot?
-
Technically, that could be done in your robots.txt file, but I wouldn't recommend that if you want Google to crawl them too. I'm not sure if Rogerbot can do that. Sorry I couldn't be more help.
If you don't get one of the staffers on here in the next few days, I would send a ticket to them for clarification.
If you decide to go with robots.txt, here is a resource from Google on implementing and testing it: https://support.google.com/webmasters/answer/156449?hl=en
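For anyone who wants to try the robots.txt route, here is a minimal sketch of checking a rule set locally before deploying it. It assumes Rogerbot honors a robots.txt group addressed to `User-agent: rogerbot` (the reply above does not confirm this, so check with Moz), and the `/slides/` path, the `example.com` URLs, and the `can_crawl` helper are all hypothetical. It uses Python's standard `urllib.robotparser`, which only approximates how any real crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block only the Moz crawler from the slide pages,
# leave every other bot (including Googlebot) unrestricted.
ROBOTS_TXT = """\
User-agent: rogerbot
Disallow: /slides/

User-agent: *
Disallow:
"""


def can_crawl(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    slide = "https://www.example.com/slides/slide-1"
    for agent in ("rogerbot", "Googlebot"):
        print(agent, can_crawl(agent, slide))
```

Scoping the `Disallow` to one user agent is what keeps Google free to crawl the same pages, which was the concern raised above.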
-
Thanks for the information on Rogerbot. I understand the difference between the bots from Google and Moz.
Some errors reported in Moz are not real. For example we use a responsive slider on the home page that generates the slides from specific pages. These pages are tagged to no-everything so as to be invisible to bots, yet they are generating errors in the reports.
Is there any way to exclude some pages from the reports?
-
Don't forget that Rogerbot (Moz's crawler) is a robot and not an index like Google. Google uses robots to gather the data, but the results we see come from an index. Rogerbot will crawl the pages regardless of noindex or nofollow.
Here is more info on Rogerbot: http://moz.com/help/pro/rogerbot-crawler
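To make the crawl-versus-index distinction concrete: a noindex directive lives inside the page's own HTML, so a crawler has to fetch the page before it can even see it. The sketch below reads the meta robots directives out of HTML that has already been fetched, which is one way you could check your own list of reported URLs and set aside the ones you deliberately marked noindex. The class and function names are hypothetical, and this says nothing about how Rogerbot itself is implemented.

```python
from html.parser import HTMLParser


class MetaRobotsParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None when an attribute has no value.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for directive in attr.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html: str) -> set:
    """Return the set of meta robots directives found in an HTML string."""
    parser = MetaRobotsParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
    print("noindex" in robots_directives(page))
```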