Why does Crawl Diagnostics report this as duplicate content?
-
Hi guys,
we've been addressing a duplicate content problem on our site over the past few weeks. Lately, we've implemented rel canonical tags in various parts of our ecommerce store, over time, and observing the effects by both tracking changes in SEOMoz and Websmater tools.
Although our duplicate content errors are definitely decreasing, I can't help but wonder why some URLs are still being flagged with duplicate content by our SEOmoz crawler.
Here's an example, taken directly from our Crawl Diagnostics Report:
URL with 4 Duplicate Content errors:
/safety-lights.htmlDuplicate content URLs:
/safety-lights.html ?cat=78&price=-100
/safety-lights.html?cat=78&dir=desc&order=position /safety-lights.html?cat=78 /safety-lights.html?manufacturer=514What I don't understand, is all of the URLS with URL parameters have a rel canonical tag pointing to the 'real' URL
/safety-lights.htmlSo why is SEOMoz crawler still flagging this as duplicate content?
-
So glad I could help get this figured out! Sometimes it just takes another set of eyes.
-Chiaryn
-
Good catch Chiaryn! Totally didn't see this.
Essentially two URLs end up displaying the same content: 1 is the URL that's picked up by google from our XML sitemap, and the other is a dynamic URL with filtering parameters based on a one level higher category URL.
The canonical tags were set up in such a way that they point to the base category, which in this case, are different, even though the content is the same.
We will address this.
Thanks!
-
Hi there,
I looked into your campaign and it seems that this is happening because of where your canonical tags are pointing. These pages are considered duplicates because their canonical tags point to different URLs. For example, accessories/lights.html?cat=78&price=-100 is considered a duplicate of accessories/lights/safety-lights.html?manufacturer=514 because the canonical tag for the first page is accessories/lights.html while the canonical for the second URL is accessories/lights/safety-lights.html.
Since the canonical tags point to different pages it is assumed that accessories/lights.html and accessories/lights/safety-lights.html are likely to be duplicates themselves.
Here is how our system interprets duplicate content vs. rel canonical:
Assuming A, B, C, and D are all duplicates,
- If A references B as the canonical, then they are not considered duplicates
- If A and B both reference C as canonical, A and B are not considered duplicates of each other
- If A references C as a canonical, A and B are considered duplicated
- If A references C as canonical, B references D, then A and B are considered duplicates
The examples you've provided actually fall into the fourth example I've listed above.
I hope this clears things up. Please let me know if you have any other questions.
-Chiaryn
-
Does seem a little odd. Could you post the domain so we can have a more detailed look?
Thanks
Iain - Reload Media
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
The pages that add robots as noindex will Crawl and marked as duplicate page content on seo moz ?
When we marked a page as noindex with robots like {<meta name="<a class="attribute-value">robots</a>" content="<a class="attribute-value">noindex</a>" />} will crawl and marked as duplicate page content(Its already a duplicate page content within the site. ie, Two links pointing to the same page).So we are mentioning both the links no need to index on SE.But after we made this and crawl reports have no change like it tooks the duplicate with noindex marked pages too. Please help to solve this problem.
Moz Pro | | trixmediainc0 -
Duplicate content
Hi Since adding blog to a site semoz is reporting increased duplicate content warning on seomoz crawl error tool such as: /blog/category/easter being a duplicate of blog/2013/03 Does this type of dupe content matter ? If so how do you stop this ? Also pages and pages of dupe content reported from internal/site search results, such as: /catalogsearch/result/index/?q=mens+fashion being a duplicate of /catalogsearch/result/?q=mens+fashion Does this matter need to be fixed or since internal site search not an issue and can just ignore, if it is an issue what do you need do to fix this type of dupe content ? Cheers Dan
Moz Pro | | Dan-Lawrence0 -
Our Duplicate Content Crawled by SEOMoz Roger, but Not in Google Webmaster Tools
Hi Guys, We're new here and I couldn't find the answer to my question. Here it goes: We had SEOMoz's Roger Crawl all of our pages and he came up with quite a few erros (Duplicate Content, Duplicate Page Titles, Long URL's). Per our CTO and using our Google Webmaster Tools, we informed Google not to index those Duplicate Content Pages. For our Long URL Errors, they are redirected to SEF URL's. What we would like to know is if Roger is able to know that we have instructed Google to not index these pages. My concern is Should we still be concerned if Roger is still crawling those pages and the errors are not showing up in our Webmaster Tools Is there a way we can let Roger know so they don't come up as errors in our SEOMoz Tools? Thanks so much, e
Moz Pro | | RichSteel0 -
I have a Rel Canonical "notice" in my Crawl Diagnostics report. I'm presuming that means that the spider has detected a rel canonical tag and it is working as opposed to warning about an issue, is this correct?
I know this seems like a really dumb question but the site I'm working on is a BigCommerce one and I've been concerned about canonicalisation issues prior to receiving this report (I'm a SEOmoz pro newbie also!) and I just want to be clear I am reading this notice correctly. I presume this means that the site crawl has detected the rel canonical tag on these pages and it is working correctly. Is this correct?? Any input is much appreciated. Thanks
Moz Pro | | seanpearse0 -
Moz crawling
Hi Everyone! I'm new to the SEOMoz and wanted to find out if there is a way to decrease the waiting time for the campaign crawl. I have made a lot of changes based on the first crawl and would like to see how these are reflected on the reports, but can't until the next crawl is performed. Any help would be greatly appreciated.
Moz Pro | | coremediadesign0 -
Crawling One Page
I set up a profile for a site with many pages, opting for setting up as a root directory. When SEOMoz crawled, they only found one page. Any ideas for why this would be? Thanks!
Moz Pro | | Group160 -
Pro Report Card Question
I am getting this warning in the Pro Report Card, my site is http://www.myfairytalebooks.com which seems to have Children's and not Childrens in the title. Is it my site that needs to html encode in the title or is Report Card slightly broken? Exact Keyword Usage in Page Title Easy fix <dl> <dt>Page title</dt> <dd>"Personalized Childrens Books Kids Music CDs Baby Books & Gifts."</dd> <dt>Explanation</dt> <dd>Search engines consider the title element to be the most important place to identify keywords and associate the page with a topic and/or set of terms. SEOmoz's correlation research has also shown that rankings are heavily influenced by keyword usage in the title tag.</dd> <dt>Recommendation</dt> <dd>Employ the keyword in the page title, preferrably as the first words in the element.</dd> </dl>
Moz Pro | | DineshMistry0