Masses (5,168 issues found) of Duplicate content.
-
Hi Mozzers,
I have a site that has returned 5,168 issues with duplicate content.
Where would you start?
I started by sorting on Page Authority, highest first, from 28 all the way down to 1. I did want to use the rel=canonical tag as the site has many redirects already.
The duplicates are caused by various category and cross-category pages and search results such as ....page/1?show=2&sort=rand.
I was thinking of going down the line of a URL rewrite and changing the search anyway. Is it worth redirecting everything, in terms of results versus the effort of changing all 5,168 issues?
Thanks
sm
-
Hi Guys,
Thanks for the responses. I'm going to have a look at the issue again with your suggestions in mind, and I'll keep you posted. Thanks again.
-
Don't look at individual URLs - at the scale of 5K plus, look at your site architecture and what kind of variants you're creating. For example, if you know that the show= and sort= parameters are a possible issue, you could go to Google and enter something like:
site:example.com inurl:show=
(warning: it will return pages with the word "show" in the URL, like "example.com/show-times" - not usually an issue, but it can be on rare occasion).
That'll give you a sense of how many cases that one parameter is creating. Odds are, you'll find a couple that are causing 500+ of the 5K duplicates, so start with those.
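If you have the list of flagged URLs as an export from your crawl, the same per-parameter count can be done offline. This is only a rough sketch: the sample URLs are made up, and `count_query_params` is a hypothetical helper name, not part of any tool mentioned in this thread.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qsl


def count_query_params(urls):
    """Count how many URLs carry each query-string parameter name."""
    counts = Counter()
    for url in urls:
        query = urlsplit(url).query
        # keep_blank_values=True so "?show=" still counts as using "show".
        # A set is used so a parameter repeated in one URL is counted once.
        names = {name for name, _ in parse_qsl(query, keep_blank_values=True)}
        counts.update(names)
    return counts


# Hypothetical sample; in practice, load the URL column of your crawl export.
sample_urls = [
    "http://example.com/category/page/1?show=2&sort=rand",
    "http://example.com/category/page/1?show=4&sort=price",
    "http://example.com/category/page/2?sort=rand",
    "http://example.com/show-times",
]

if __name__ == "__main__":
    # Parameters are listed from most to least common.
    for name, n in count_query_params(sample_urls).most_common():
        print(f"{name}: {n}")
```

Unlike the inurl: search, this only looks at the query string, so a path like /show-times is not counted as a "show" hit.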
Search pagination is very tricky - you could canonicalize to "View All" as Chris Hill said, you could NOINDEX pages 2+, or you could try Google's new (but very complicated) way:
http://googlewebmastercentral.blogspot.com/2011/09/pagination-with-relnext-and-relprev.html
Problem is, that doesn't work on Bing and it's pretty easy to mess up.
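To make the three pagination options concrete, here is a minimal sketch of the head tags each one would put on a given results page. The `?page=N` URL scheme, the function name and the strategy labels are all assumptions for illustration; your templates will look different.

```python
from html import escape


def pagination_head_tags(base_url, page, last_page, strategy, view_all_url=None):
    """Return the <head> tags for one page of a paginated listing.

    Assumes (hypothetically) that page 1 lives at base_url and page N at
    base_url?page=N.
    """
    def page_url(n):
        return base_url if n == 1 else f"{base_url}?page={n}"

    def link(rel, href):
        return f'<link rel="{rel}" href="{escape(href, quote=True)}">'

    if strategy == "view_all":
        # Every page in the series points at the single "View All" page.
        if view_all_url is None:
            raise ValueError("view_all strategy needs view_all_url")
        return [link("canonical", view_all_url)]

    if strategy == "noindex":
        # Page 1 stays indexable; pages 2+ get a robots noindex tag.
        return [] if page == 1 else ['<meta name="robots" content="noindex">']

    if strategy == "prev_next":
        # rel=prev / rel=next links describing the page's neighbours.
        tags = []
        if page > 1:
            tags.append(link("prev", page_url(page - 1)))
        if page < last_page:
            tags.append(link("next", page_url(page + 1)))
        return tags

    raise ValueError(f"unknown strategy: {strategy}")


if __name__ == "__main__":
    for tag in pagination_head_tags("http://example.com/widgets", 2, 3, "prev_next"):
        print(tag)
```

The prev/next branch shows why that option is easy to mess up: the first and last pages each need a different set of tags from the pages in the middle.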
The rel-canonical tag can scoop up sorts pretty well. You can also tell Google in Google Webmaster Tools what those parameters do, and whether to index them, but I've had mixed luck with that. If you're not having any serious problems, GWT is easy and worth a shot.
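For the sort and show variants, the canonical URL is just the requested URL with those display-only parameters dropped. A small sketch, assuming `show` and `sort` are the only parameters that do not change the content (adjust the set for your site; the helper names are hypothetical):

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these parameters only change ordering or page size.
DISPLAY_ONLY_PARAMS = {"show", "sort"}


def canonical_url(url, ignored=DISPLAY_ONLY_PARAMS):
    """Return url with the display-only query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in ignored
    ]
    # The fragment is dropped; it is never sent to the server anyway.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_tag(url):
    """Build the rel=canonical tag to place in the page's <head>."""
    return f'<link rel="canonical" href="{escape(canonical_url(url), quote=True)}">'


if __name__ == "__main__":
    print(canonical_tag("http://example.com/category/page/1?show=2&sort=rand"))
```

Parameters that do change the content, such as a search term, are left in place, so each distinct result set keeps its own canonical URL.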
-
Have a look at your pagination too. If you've not got a 'show all' link it might be worth putting one in and making that the canonical. Should eliminate some of your duplicate content issues.
-
Last time I came across such an issue, I mostly started by making the 'easy' changes that reduced the number the most.
In the last case, it was implementing a 301 to the www version of the site (cutting the errors in half) and putting a canonical on one search page.
This got the number down to the point where it was easier to make decisions on 'Is it worth making friendlier URLs?' and to discover more interesting places where duplicate content was being generated.
It's one of those things where I would always aim for 0 where I can. It usually means that the URL or site structure can be improved significantly, or it's such an easy fix that it's hard to justify not doing.
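The www redirect mentioned above is normally a one-line rule in the web server config, but the logic is simple enough to sketch. This assumes www is the preferred host and that the site is only served on the bare domain and on www (no other subdomains); `www_redirect_target` is a hypothetical helper name.

```python
from urllib.parse import urlsplit, urlunsplit


def www_redirect_target(url):
    """Return the www URL to 301 to, or None if url is already on www."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if not host or host.startswith("www."):
        return None
    netloc = "www." + host
    if parts.port is not None:
        # Keep an explicit port if the request had one.
        netloc += f":{parts.port}"
    # Path and query are preserved so deep links land on the same page.
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, ""))


if __name__ == "__main__":
    print(www_redirect_target("http://example.com/page/1?show=2"))
```

Because the path and query string are carried over, every non-www duplicate maps to exactly one www URL, which is why this single change can remove such a large share of the reported issues.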
-
If it really is a URL issue, then you should be able to canonicalize the root pages easily and the rest should sort itself out. Start there and let the next spidering tell you where you stand.