150+ Pages of URL Parameters - Mass Duplicate Content Issue?
-
Hi, we run a large e-commerce site. While checking Google Webmaster Tools (GWT) we came across these URL parameters, and we are now wondering whether we have a duplicate content issue.
If so, what is the best way to fix it: is this a task for GWT or for rel=canonical?
Many of the URLs are generated by the filters on our category pages and come up like this: page04%3Fpage04%3Fpage04%3Fpage04%3F (See the image for more).
Does anyone know whether these links count as duplicate content and, if so, how we should handle them?
Richard
-
Hi Richard
Honestly, I really don't know. Part of me wants to say, "Surely Google will know this isn't deliberate, manipulative duplicate content." You could take a couple of those URLs and run two Google searches with them:
site:www.example.com/page?query1
info:www.example.com/page?query1
With the first search, if your URL hasn't been indexed, that's a good thing. With the second, if the info search returns the original URL (without the parameters), that's also good, as it means Google is treating the parameterized URL as just a variation to be ignored. However, if it returns the URL with the parameters, that would indicate that the crawler is indexing the parameterized version and treating it as a separate URL, which raises the duplicate content risk. Silly Google!
Regardless of those results, I would look to implement the canonical tag anyway as it takes any guesswork out of the equation. And ultimately, a lot of this work with Google is guesswork as we can't see the algorithm - although it's an informed guess due to experience etc.
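As a side note on the URLs from the question: %3F is simply a percent-encoded question mark, so the strings are easier to read once decoded. A minimal sketch using Python's standard library, with the sample string copied from the question:

```python
from urllib.parse import unquote

# Sample string copied from the question; %3F is the percent-encoded "?".
raw = "page04%3Fpage04%3Fpage04%3Fpage04%3F"

decoded = unquote(raw)
print(decoded)
```

Seeing the same fragment repeated after every "?" suggests the filter links are appending to the current URL rather than replacing it, although that is only a guess from the one sample given.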
-
Thanks for this Tom, great answer!
So am I right in thinking that each of these URL parameters is very likely being classed as duplicate content?
-
Along with this great answer from Tom, I just wanted to add that Google does offer a resource on duplicate content as well with tips.
Hope this helps as well - good luck!
-
Hi Richard
It is something you should address ASAP. I believe Google is a lot better these days at recognising 'accidental' duplicate content (i.e. URLs with URL parameters) and distinguishing it from 'deliberate' duplicate content (outright stealing someone's work, or trying to rank several pages for multiple terms), but that is only my assumption. To be completely sure, let's remove any chance of Google penalising these pages.
I think, in this instance, a rel canonical tag should do the trick. You can read more on the tag in Moz's guide. Basically, on the page(s) where you're having this problem, add a "self-referring" canonical tag. For example, if the page was http://www.example.com/blue-widgets/, the tag would be:
<link rel="canonical" href="http://www.example.com/blue-widgets/" />
Make sure that, when you implement this, the pages generated with the URL parameters aren't also creating canonical tags that point at their own parameterized URLs.
They should all have the original canonical tag.
This tells Google: "If you see any pages with this tag, we're aware that they might be duplicates, but please only count and index http://www.example.com/blue-widgets/". In that sense it works much like a 301 redirect, except that visitors stay on the URL they requested and Google treats the tag as a strong hint rather than a directive.
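To make the idea concrete, here is a rough sketch of how a template could work out the canonical tag for any filtered variant of a page. It assumes that none of the parameters change the content enough to deserve their own indexed page; the parameter names (colour, page) and the helper names are made up for illustration and do not come from Richard's site.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Drop the query string and fragment so every variant maps to one URL."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def canonical_tag(url: str) -> str:
    """Build the <link rel="canonical"> element for the page behind `url`."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}" />'


# Hypothetical filtered variants of the same category page.
variants = [
    "http://www.example.com/blue-widgets/",
    "http://www.example.com/blue-widgets/?colour=blue",
    "http://www.example.com/blue-widgets/?colour=blue&page=2",
]

# Every variant should produce the same tag.
tags = {canonical_tag(u) for u in variants}
print(tags)
```

If some parameters do produce genuinely different pages, you would keep those in the canonical URL and strip only the rest.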
I think this would be the simplest solution for you to implement. If you're having problems, there would be a way of blocking access to pages with certain query/URL parameters by using the robots.txt file, but that could get quite messy.
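If you do go down the robots.txt route, it is worth checking which URLs a rule would catch before you publish it. Google's robots.txt matching supports * as a wildcard and $ as an end-of-URL anchor; the sketch below is my own simplified approximation of that matching, not Google's code, and the rule /*? is just a hypothetical example that blocks anything with a query string.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Approximate Google-style robots.txt matching for one path pattern.

    '*' matches any run of characters and a trailing '$' anchors the end;
    otherwise the pattern only has to match from the start of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.match(regex, path) is not None


rule = "/*?"  # hypothetical Disallow rule: any URL containing a query string

print(rule_matches(rule, "/blue-widgets/"))
print(rule_matches(rule, "/blue-widgets/?colour=blue"))
```

Bear in mind that a page blocked in robots.txt is not crawled, so Google would not see a canonical tag on it; that is part of why mixing the two approaches can get messy.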
Hope this helps