Update in Moz spider/tools?? Flagging duplicate content / ignoring canonical
-
Hi all,
Has there been an update in the SEOmoz crawling software?
We now have thousands of dupe content/page title warnings for paginated product page URLs that have correctly formatted canonicals.
e.g.
http://www.woolovers.com/british-wool/mens/tweed-green/wool-countryman-suede-patch-sweater.aspx
... has following pages with identical content that have been flagged:
http://www.woolovers.com/british-wool/mens/olive-green/wool-countryman-suede-patch-sweater.aspx?p=true&rspage=4
...plus 4 more URLs.
But they all have a canonical set. There's even a notice at the bottom of the report that tells us there's a canonical set to http://www.woolovers.com/british-wool/mens/tweed-green/wool-countryman-suede-patch-sweater.aspx
What gives, SEOmoz ??
Thanks
Michael
-
Hey Lawrence,
Campaigns have a 95% tolerance for duplicate content. This includes all the source code on the page and not just the viewable text. So if a URL is at least 95% similar in code and content to another URL, this warning will appear.
You can run your own tests using this tool: http://www.webconfs.com/similar-page-checker.php
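For a rough local check, here is a minimal Python sketch of the same idea. It is not the Moz crawler's algorithm (which isn't published in this thread); it assumes that comparing the raw page source with the standard library's `difflib.SequenceMatcher` is a fair stand-in, and the function names and the 0.95 constant are only illustrative.

```python
from difflib import SequenceMatcher

# Assumption: mirrors the 95% tolerance described above.
DUPLICATE_THRESHOLD = 0.95


def similarity(source_a: str, source_b: str) -> float:
    """Return a 0..1 similarity ratio between two page sources, markup included."""
    return SequenceMatcher(None, source_a, source_b, autojunk=False).ratio()


def looks_duplicate(source_a: str, source_b: str) -> bool:
    """True when the two sources are at least 95% similar by this measure."""
    return similarity(source_a, source_b) >= DUPLICATE_THRESHOLD
```

Feed it the full HTML of two URLs (not just the visible text), since the template, navigation and review markup all count towards the score.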
We don't know what standard Google uses, but it's safe to say they are a bit more sophisticated than us - so you might be okay in this regard as long as you have a couple hundred words of unique text and some unique coding per page. Google won't say how much duplicate content is too much, so we like to be better safe than sorry.
I hope this helps. Let me know if you need further assistance.
-Chiaryn
-
Hi Chiaryn,
Thanks for the reply and explanation. The different colour-specific pages, e.g. Tweed Green and Olive Green, have some different content, but it's nothing like enough in cases of two greens, two blues etc., as we simplify colour names for search, so when there is an Olive and a Tweed Green they both end up having 'Green' as the variable in the page title, H1 etc. Will fix this.
Do you think the reviews at the bottom of the pages will also trigger the dupe content warning, i.e. even if we make all other on-page elements unique for each colour URL (page title, H1, H2, product description etc.)? The reviews are quite extensive and are the same on all the separate colour-specific product page versions of each style, and I was thinking today about whether we should remove them from these colour product pages (OR perhaps let the colour product pages have their OWN reviews).
http://www.woolovers.com/british-wool/mens/tweed-green/wool-countryman-suede-patch-sweater.aspx
Thanks again
-
Oh, brilliant (re: "See more" aspect). Thanks for the info. Will let you know how we tackle this and the repercussions (!) and look forward to hearing how you get on also!
-
Hi Michael,
Thanks for writing in. I already emailed you in response to the ticket you sent in to the Help Desk, but I will copy my answer here for your review.
--
I looked into your campaign and it seems that this is happening because of where your canonical tags are pointing. These pages are considered duplicates because their canonical tags point to different URLs. For example, http://www.woolovers.com/british-wool/mens/tweed-green/wool-countryman-suede-patch-sweater.aspx is considered a duplicate of http://www.woolovers.com/british-wool/mens/olive-green/wool-countryman-suede-patch-sweater.aspx?p=true&rspage=4 because the canonical tag for the first page is http://www.woolovers.com/british-wool/mens/tweed-green/wool-countryman-suede-patch-sweater.aspx while the canonical for the second URL is http://www.woolovers.com/british-wool/mens/olive-green/wool-countryman-suede-patch-sweater.aspx, with one URL showing tweed-green and the other showing olive-green.
Since the canonical tags point to different URLs it is assumed that http://www.woolovers.com/british-wool/mens/tweed-green/wool-countryman-suede-patch-sweater.aspx and http://www.woolovers.com/british-wool/mens/olive-green/wool-countryman-suede-patch-sweater.aspx are likely to be duplicates themselves.
Here is how our system interprets duplicate content vs. rel canonical:
Assuming A, B, C, and D are all duplicates,
If A references B as the canonical, then they are not considered duplicates
If A and B both reference C as canonical, A and B are not considered duplicates of each other
If A references C as a canonical, A and B are considered duplicates
If A references C as canonical, B references D, then A and B are considered duplicates
The examples you've provided actually fall into the fourth example I've listed above. I hope this clears things up. Please let me know if you have any other questions.
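One way to read the four cases above is that two near-identical pages avoid the warning only when they resolve to the same canonical URL, with a page that declares no canonical resolving to itself. The Python sketch below encodes that reading; it is an interpretation of the list, not the crawler's actual code, and the function names and the `canonicals` mapping are hypothetical.

```python
from typing import Dict, Optional


def canonical_target(url: str, canonicals: Dict[str, Optional[str]]) -> str:
    """Return the canonical a page declares, or the page itself if it declares none."""
    return canonicals.get(url) or url


def flagged_as_duplicates(a: str, b: str, canonicals: Dict[str, Optional[str]]) -> bool:
    """Assumes a and b already have near-identical content.

    They escape the warning only when both resolve to the same canonical URL.
    """
    return canonical_target(a, canonicals) != canonical_target(b, canonicals)


# The four cases listed above, in order.
print(flagged_as_duplicates("A", "B", {"A": "B"}))            # False
print(flagged_as_duplicates("A", "B", {"A": "C", "B": "C"}))  # False
print(flagged_as_duplicates("A", "B", {"A": "C"}))            # True
print(flagged_as_duplicates("A", "B", {"A": "C", "B": "D"}))  # True
```

Under this reading, the tweed-green page resolves to the tweed-green URL and the olive-green `rspage=4` page resolves to the olive-green URL, so the pair is flagged.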
--
-Chiaryn
-
We use the "See more" script on our sites, and from what I understand, at least from other Mozzers, this is an okay practice. http://www.seomoz.org/q/using-more-info-javascript-toggledisplay-tag-for-more-info-text
We also use the rel="prev" and rel="next" to some success, but I can't comment on how that's functioning canonical-wise, because IT WAS DROPPED from our latest redesign and is going to be added to our client's website in the latest release. Oye.
I'd love to hear how this works out for you. There are some really great Mozzers on here with loads of experience about canonical tags and duplicate page issues. Can't wait to see what they have to contribute.
-
Hi there,
Thanks for your response.
It's not product page A being seen as a duplicate of product page B etc, but several versions of product A seen as duplicate due to pagination, stemming from reviews for the products that span several pages, so making the rest of the content, titles etc different other than the (crawlable) reviews isn't really an option.
Will look more into "noindex, follow" tags in pagination.
We could have a View All page for indexing showing all reviews (with lots of scrolling!), with the paginated versions canonicalized to that version (could still serve the paginated version of the product page from site navigation, perhaps with a "noindex, follow" meta tag). Text doesn't take long to load, and this approach would consolidate the review content.
http://googlewebmastercentral.blogspot.co.uk/2011/09/view-all-in-search-results.html
The other option is to use a rel="prev" and rel="next" implementation, which shows Google the relationship between the pages (not sure if it will still be flagged as dupe content in SEOmoz though! Depends if they follow the tag). This way individual pages might get indexed (not sure if that's a good thing?!), perhaps if there's something in a review from (say) page 5 of the product reviews.
http://googlewebmastercentral.blogspot.co.uk/2011/09/pagination-with-relnext-and-relprev.html
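To make the rel="prev"/rel="next" option concrete, here is a small Python sketch that builds the `<link>` tags for one page of a paginated review series. It assumes the page number travels in an `rspage` query parameter (as in the URLs earlier in the thread) and that page 1 is the bare product URL; the function name is made up for illustration, and the real site may need other parameters such as `p=true`.

```python
from html import escape
from typing import List
from urllib.parse import urlencode


def pagination_links(base_url: str, page: int, last_page: int, param: str = "rspage") -> List[str]:
    """Build rel="prev"/rel="next" link tags for one page of a paginated series."""

    def page_url(n: int) -> str:
        # Assumption: page 1 is the bare product URL; later pages carry the page parameter.
        return base_url if n == 1 else f"{base_url}?{urlencode({param: n})}"

    links = []
    if page > 1:
        links.append(f'<link rel="prev" href="{escape(page_url(page - 1))}">')
    if page < last_page:
        links.append(f'<link rel="next" href="{escape(page_url(page + 1))}">')
    return links
```

The first page gets only a "next" tag and the last page only a "prev" tag, so the chain has a clear start and end.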
Ideally I'd like to implement all reviews on one page and hide them with a Facebook-style 'See more' function. Not sure if that counts as hiding content? Will look into this.
-
Hi Michael,
Not sure if this helps you out at all, but I found this about the canonicals and SEOMoz crawl report in a previous Q http://mz.cm/11erRj6:
As far as the SEOmoz crawl reports go, note that setting a canonical won't stop these pages being reported as duplicate content.
From the help:
"Keep in mind that canonicals will stop the pages from ranking against each other, but they will still show up as duplicate content from a UI perspective, so we will still count them as duplicate."
I have the same issues on my accounts. I'm focusing on making the pages content as unique as possible, or using the "noindex, follow" meta tags to see if that makes a difference.
I know you may have a lot of pages on your website, but perhaps writing short descriptions for your products would help. It might be worthwhile, though I completely understand that it may be a huge undertaking if you have hundreds or thousands of pages.