150+ Pages of URL Parameters - Mass Duplicate Content Issue?
-
Hi, we run a large e-commerce site, and while checking through GWT (Google Webmaster Tools) we came across these URL parameters and are now wondering if we have a duplicate content issue.
If so, we are wondering what the best way to fix them is: is this a task for GWT or a rel=canonical task?
Many of the URLs are driven by the filters on our category pages and are coming up like this: page04%3Fpage04%3Fpage04%3Fpage04%3F (see the image for more).
Does anyone know if these links are duplicate content and, if so, how we should handle them?
Richard
-
Hi Richard
Honestly, I really don't know. Part of me wants to say, "Surely Google will know this isn't deliberate, manipulative duplicate content". You could take a couple of those URLs and run two Google searches with them:
site:www.example.com/page?query1
info:www.example.com/page?query1
With the first search, if the parameterised URL isn't returned, it hasn't been indexed, and that's a good thing. With the second, if the info search returns the original URL (without the parameters), that's also good: it means Google is treating the URL with parameters as just a variation to be ignored. However, if it returns the URL with the parameters, that would indicate the crawler is indexing that version and treating it as a separate URL, which raises the duplicate content risk. Silly Google!
Regardless of those results, I would look to implement the canonical tag anyway as it takes any guesswork out of the equation. And ultimately, a lot of this work with Google is guesswork as we can't see the algorithm - although it's an informed guess due to experience etc.
-
Thanks for this Tom, great answer!
So am I right in thinking that each of these URL parameters is very likely being classed as duplicate content?
-
Along with this great answer from Tom, I just wanted to add that Google does offer a resource on duplicate content as well with tips.
Hope this helps as well - good luck!
-
Hi Richard
It is something you should address ASAP. While I believe that Google is a lot better at recognising 'accidental' duplicate content (i.e. URLs with URL parameters) and distinguishing it from 'deliberate' duplicate content (outright stealing someone's work, or trying to rank several pages for multiple terms), that is only my assumption. To be completely sure, let's remove any chance of Google penalising these pages.
I think, in this instance, a rel canonical tag should do the trick. You can read more on the tag in Moz's guide. Basically, on the page(s) where you're having this problem, add a "self-referring" canonical tag. For example, if the page was http://www.example.com/blue-widgets/, the tag would be:
<link rel="canonical" href="http://www.example.com/blue-widgets/" />
Make sure that, when you implement this, the pages generated with URL parameters aren't also creating canonical tags that point at themselves, like:
<link rel="canonical" href="http://www.example.com/blue-widgets/?query1" />
They should all have the original canonical tag.
What this will do is tell Google: "If you see any pages with this tag, we're aware that they might be duplicates, but please only count and index http://www.example.com/blue-widgets/". It works just like a 301 redirect in that sense.
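As a rough sketch of the idea (not how any particular e-commerce platform does it), the snippet below strips the query string from a URL and builds the canonical tag from what is left, so the plain page and its filtered variants all emit the same tag. The function names canonical_url and canonical_tag are made up for illustration, and ?query1 is just the placeholder parameter used earlier in this thread.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Return the URL without its query string or fragment.

    Hypothetical helper: assumes the parameter-free URL is the version
    you want indexed.
    """
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def canonical_tag(url: str) -> str:
    """Build the <link rel="canonical"> tag for the page at `url`."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}" />'


# The original page and a filtered variant should produce the same tag.
for page in (
    "http://www.example.com/blue-widgets/",
    "http://www.example.com/blue-widgets/?query1",
):
    print(canonical_tag(page))
```

Note that this only removes a real query string. URLs like the page04%3Fpage04%3F ones in the question carry an encoded "?" inside the path, so they would pass through unchanged and need separate handling.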
I think this would be the simplest solution for you to implement. If you're having problems, there would be a way of blocking access to pages with certain query/URL parameters by using the robots.txt file, but that could get quite messy.
Hope this helps