150+ Pages of URL Parameters - Mass Duplicate Content Issue?
-
Hi we run a large e-commerce site and while doing some checking through GWT we came across these URL parameters and are now wondering if we have a duplicate content issue.
If so, we are wodnering what is the best way to fix them, is this a task with GWT or a Rel:Canonical task?
Many of the urls are driven from the filters in our category pages and are coming up like this: page04%3Fpage04%3Fpage04%3Fpage04%3F (See the image for more).
Does anyone know if these links are duplicate content and if so how should we handle them?
Richard
-
Hi Richard
Honestly, I really don't know. A lot of me wants to say that: "Surely Google will know this isn't deliberate and manipulative duplicate content". You could take a couple of those URLs and do a Google search with them. Do:
site:www.example.com/page?query1
info:www.example.com/page?query1With the first result, if your URL hasn't been indexed, that's a good thing. For the second result, if the info search returns the original URL (without the parameters), that's also good, as it means Google will be counting the one with parameters as just a variation and to be ignored. However, if it's returning the result with the parameters, that would indicate that the web crawler is indexing the version with parameters and treating it as a separate URL - raising the duplicate content risk. Silly Google!
Regardless of those results, I would look to implement the canonical tag anyway as it takes any guesswork out of the equation. And ultimately, a lot of this work with Google is guesswork as we can't see the algorithm - although it's an informed guess due to experience etc.
-
Thanks for this Tom, great answer!
So am I right in thinking that each of these URL Parameters are very likely being classed as duplicate content?
-
Along with this great answer from Tom, I just wanted to add that Google does offer a resource on duplicate content as well with tips.
Hope this helps as well - good luck!
-
Hi Richard
It is something you should address ASAP. While I believe that Google is a lot better at recognising 'accidental' duplicate content - IE URLs with URL parameters - and distinguishing it from 'deliberate' duplicate content - just outright stealing someone's work or trying to rank several pages for multiple terms - that is only my assumption. To be completely sure, let's stop any chance of Google penalising these pages.
I think, in this instance, a rel canonical tag should do the trick. You can read more on the tag here in Moz's guide. Basically, on the page(s) where you're having this problem add a "self-referring" canonical tag. For example, if the page was http://www.example.com/blue-widgets/, the tag would be:
Make sure that, when you implement this, the pages that are generated with the URL parameters aren't also creating canonical tags like:
They should all have the original canonical tag.
What this will do is tell Google that "If you see any pages with this tag, we're aware that they might be duplicate, but please only count and index the http://www.example.com/blue-widgets/". It works just like a 301 redirect in that sense.
I think this would be the simplest solution for you to implement. If you're having problems, there would be a way of blocking access to pages with certain query/URL parameters by using the robots.txt file, but that could get quite messy.
Hope this helps
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Do mobile and desktop sites that pull content from the same source count as duplicate content?
We are about to launch a mobile site that pulls content from the same CMS, including metadata. They both have different top-level domains, however (www.abcd.com and www.m.abcd.com). How will this affect us in terms of search engine ranking?
Technical SEO | | ovenbird0 -
Added 301 redirects, pages still earning duplicate content warning
We recently added a number of 301 redirects for duplicate content pages, but even with this addition they are still showing up as duplicate content. Am I missing something here? Or is this a duplicate content warning I should ignore?
Technical SEO | | cglife0 -
Duplicate Content
Hi, I'm working on a site and I'm having some issues with its structure causing duplicate content. The first issue is that the search pages will show up as duplicates.
Technical SEO | | OOMDODigital
A search for new inventory may be new.aspx
The duplicate may be something like new.aspx=page1, or something like that and so on. The second issue is with inventory. When new inventory gets put into the stock of the store, a new page for that item will be populated with duplicate content. There appears to be no canonical source for that page. How can I fix both of these? Thanks!0 -
Tips and duplicate content
Hello, we have a search site that offers tips to help with search/find. These tips are organized on the site in xml format with commas... of course the search parameters are duplicated in the xml so that we have a number of tips for each search parameter. For example if the parameter is "dining room" we might have 35 pieces of advice - all less than a tweet long. My question - will I be penalized for keyword stuffing - how can I avoid this?
Technical SEO | | acraigi0 -
Duplicate content in Magento
Hi all We got some serious issues with duplicate content on a Magento site that we are marketing. For example: http://www.citcop.se/varmepumpar-luft-luft/panasonic/panasonic-nordic-ce9nke-5-0kw http://www.citcop.se/panasonic/panasonic-nordic-ce9nke-5-0kw http://www.citcop.se/panasonic-nordic-ce9nke-5-0kw All of the above seem to work just fine as it is now but since they are excatly the same product they should ofcourse do a 301 redirect to the main page. Any ideas on how to sort this out in Magnto without having to resort to manual work in .htaccess? Have a great day Fredrik
Technical SEO | | Resultify0 -
Avoiding duplicate content on product pages?
Hi, I'm creating a bunch of product pages for courses for a university and I'm concerned about duplicate content penalties. While the page names are different and some of the test is different, much of the text is the same between pairs of pages. I.e. a BA and an MA in a particular subject (say 'hairdressing' will have the same subject descriptions, school introduction paragraph, industry overview paragraph etc. 1. Is this a problem? In a site with 100 pages, if sets of 2 pages have about 50% identical content... 2. If it is a problem, is there anything I can do, other than rewrite the text? 3. From a search perspective, would both pages show up in search results in searches related to 'hairdressing courses' 'study hairdressing' etc? Thanks!
Technical SEO | | AISFM0 -
Duplicate Content Issue
Very strange issue I noticed today. In my SEOMoz Campaigns I noticed thousands of Warnings and Errors! I noticed that any page on my website ending in .php can be duplicated by adding anything you want to the end of the url, which seems to be causing these issues. Ex: Normal URL - www.example.com/testing.php Duplicate URL - www.example.com/testing.php/helloworld The duplicate URL displays the page without the images, but all the text and information is present, duplicating the Normal page. I Also found that many of my PDFs seemed to be getting duplicated burried in directories after directories, which I never ever put in place. Ex: www.example.com/catalog/pdfs/testing.pdf/pdfs/another.pdf/pdfs/more.pdfs/pdfs/ ... when the pdfs are only located in a pdfs directory! I am very confused on how to fix this problem. Maybe with some sort of redirect?
Technical SEO | | hfranz0 -
Duplicate Page Titles and Content
I have a site that has a lot of contact modules. So basically each section/page has a contact person and when you click the contact button it brings up a new window with form to submit and then ends with a thank you page. All of the contact and thank you pages are showing up as duplicate page titles and content. Is this something that needs to be fixed even if I am not using them to target keywords?
Technical SEO | | AlightAnalytics0