Masses (5,168 issues found) of Duplicate content.
-
Hi Mozzers,
I have a site that has returned 5,168 issues with duplicate content.
Where would you start?
I started by sorting on Page Authority, highest first, from 28 all the way down to 1. I did want to use the rel=canonical tag, as the site has many redirects already.
The duplicates are caused by various category and cross category pages and search results such as ....page/1?show=2&sort=rand.
I was thinking of going down the route of a URL rewrite and changing the search anyway. Is it worth redirecting everything, in terms of results versus the effort of fixing all 5,168 issues?
Thanks
sm
-
Hi Guys,
Thanks for the responses. I'm going to have a look at the issue again with your suggestions in mind, and I'll keep you posted. Thanks again.
-
Don't look at individual URLs - at the scale of 5K plus, look at your site architecture and what kind of variants you're creating. For example, if you know that the show= and sort= parameters are a possible issue, you could go to Google and enter something like:
site:example.com inurl:show=
(Warning: this will also return pages with the word "show" in the URL, like "example.com/show-times". That's not usually an issue, but it can be on rare occasions.)
That'll give you a sense of how many cases that one parameter is creating. Odds are, you'll find a couple that are causing 500+ of the 5K duplicates, so start with those.
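If you have a crawl export to hand, the same tally can be done offline. Here's a minimal sketch in Python, assuming you have the crawled URLs as a plain list; the `count_parameters` helper and the sample URLs are hypothetical, not something from the original site:

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qsl


def count_parameters(urls):
    """Count how many URLs carry each query parameter name."""
    counts = Counter()
    for url in urls:
        query = urlsplit(url).query
        # keep_blank_values=True so a bare "?show=" still counts as using "show".
        names = {name for name, _ in parse_qsl(query, keep_blank_values=True)}
        counts.update(names)
    return counts


# Hypothetical sample standing in for a real crawl export.
sample_urls = [
    "http://example.com/category/page/1?show=2&sort=rand",
    "http://example.com/category/page/1?show=4&sort=rand",
    "http://example.com/category/page/2?sort=price",
    "http://example.com/show-times",
]

# Print parameters from most to least common.
for name, total in count_parameters(sample_urls).most_common():
    print(name, total)
```

Unlike the inurl: search, this only looks at the query string, so a path like /show-times isn't counted as a "show" parameter.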
Search pagination is very tricky - you could canonicalize to "View All" as Chris Hill said, you could NOINDEX pages 2+, or you could try Google's new (but very complicated) way:
http://googlewebmastercentral.blogspot.com/2011/09/pagination-with-relnext-and-relprev.html
Problem is, that doesn't work on Bing and it's pretty easy to mess up.
The rel-canonical tag can scoop up sorts pretty well. You can also tell Google in Google Webmaster Tools what those parameters do, and whether to index them, but I've had mixed luck with that. If you're not having any serious problems, GWT is easy and worth a shot.
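To make the rel-canonical idea concrete, here's a rough sketch in Python of how a template might work out the canonical URL for a sorted or filtered page. It assumes that show and sort only change presentation and never the content; `IGNORED_PARAMS`, `canonical_url`, and `canonical_tag` are hypothetical names for illustration, so adapt them to whatever your platform actually uses:

```python
import html
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: these parameters only reorder or resize the listing.
IGNORED_PARAMS = {"show", "sort"}


def canonical_url(url, ignored=IGNORED_PARAMS):
    """Return the URL with presentation-only parameters removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in ignored
    ]
    # Drop the fragment as well; it is never sent to the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_tag(url):
    """Build the <link> element to place in the page's <head>."""
    href = html.escape(canonical_url(url), quote=True)
    return '<link rel="canonical" href="%s" />' % href


print(canonical_tag("http://example.com/category/page/1?show=2&sort=rand"))
```

Parameters that do change the content (a search term, for example) are left in place, so only the sort and display variants collapse onto one URL.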
-
Have a look at your pagination too. If you've not got a 'show all' link it might be worth putting one in and making that the canonical. Should eliminate some of your duplicate content issues.
-
The last time I came across an issue like this, I started with the 'easy' changes that reduced the number the most.
In that case, it was implementing a 301 to the www version of the site (cutting the errors in half) and putting a canonical on one search page.
This got the number down to the point where it was easier to decide whether friendlier URLs were worth making, and to discover more interesting places where duplicate content was being generated.
It's one of those things where I would always aim for 0 where I can. A high count usually means that the URL or site structure can be improved significantly, or that the fix is so easy it's hard to justify not doing it.
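For anyone wondering what the 301 to the www version amounts to, here's a small Python sketch of the decision logic only. In practice this normally lives in the web server or CMS configuration rather than application code; `PREFERRED_HOST`, `ALTERNATE_HOSTS`, and `redirect_target` are made-up names for illustration:

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions: the preferred hostname and the bare-domain variant to fold into it.
PREFERRED_HOST = "www.example.com"
ALTERNATE_HOSTS = {"example.com"}


def redirect_target(url):
    """Return the URL to 301 to, or None if the host is already the preferred one."""
    parts = urlsplit(url)
    if parts.netloc.lower() in ALTERNATE_HOSTS:
        # Keep scheme, path, and query so every page maps to its www twin.
        return urlunsplit(
            (parts.scheme, PREFERRED_HOST, parts.path, parts.query, parts.fragment)
        )
    return None


print(redirect_target("http://example.com/category/page/1?sort=rand"))
print(redirect_target("http://www.example.com/category/page/1"))
```

The point is that each non-www URL redirects to its exact www equivalent rather than to the home page, which is what lets the crawler fold the two copies together.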
-
If it really is a URL issue, then you should be able to simply canonicalize the root pages, and the rest should sort itself out. Start there and let the next spidering tell you where you stand.