Why is Google Reporting big increase in duplicate content after Canonicalization update?
-
Our web hosting company recently applied a update to our site that should have rectified Canonicalized URLs. Webmaster tools had been reporting duplicate content on pages that had a query string on the end.
After the update there has been a massive jump in Webmaster tools reporting now over 800 pages of duplicate content, Up from about 100 prior to the update plus it reporting some very odd pages (see attached image)
They claim they have implement Canonicalization in line with Google Panda & Penguin, but surely something is not right here and it's going to cause us a big problem with traffic.
Can anyone shed any light on the situation???
-
Hi All,
I finally got to the bottom of the problem and it is that they have not applied canonicalization across the site, only to certain pages which is not my understanding when they implemented the update a few weeks back.
So they are preparing a hot fix as part of a service pack to our site which will rectify this issue and apply canonicalization to all pages that contain query strings. This should clear that problem up once and for all.
Thank you both for your input, a great help.
-
Hi Deb... I have nice blogpost from seomoz blog for you written by Lindsey in which she has explained it very nicely about it.
http://www.seomoz.org/blog/serious-robotstxt-misuse-high-impact-solutions
In this post check the example of digg.com. Digg.com has blocked "submit" in robots.txt but still Google has indexed URLs. Check screenshot in the Blog post. Hope this help.
-
_Those URLs will be crawled by Google, but will not be Indexed. And that being said, there will be no more duplicate content issue. I hope I have made myself clear over here. _
-
Deb, even if you block those URLs in Robots.txt, Google will going to index those URLs because those URLs are interlink with website. The best way is to put canonical tag so that you will get inter linking benefits as well.
-
Fraser,
Till now they have not implemented Canonicalization in your website. After Canonicalization implementation also you will duplication errors in your webmaster account but it will not harm your ranking. Because Canonicalization helps Google in selecting the page from multiple version of similar page that has to displayed in SERP. In above example, First URL is the original URL but the second URL has some parameters in URLs so your preferred version of URL should be first one. After proper Canonicalization implementation you will only see URLs that you have submitted in your sitemap via Google Webmaster Tool.
And about two webmaster codes, I don't think we have setup two separate accounts, you can provide view or admin access from your webmaster account to them.
-
Either you will have to block these pages via Google Webmaster Tools by Using URL parameter or else you need to block them via robots.txt file like this –
To block this URL: http://www.towelsrus.co.uk/towels/baby-towels/prodlist_ct493.htm?dir=1&size=100
You need to use this tag in robots.txt file – Disallow: /.htm?dir=
-
Hi,
Here are a couple of examples for you.
Duplication issue is showing because of below type of URLs:
http://www.towelsrus.co.uk/towels/baby-towels/prodlist_ct493.htm
http://www.towelsrus.co.uk/towels/baby-towels/prodlist_ct493.htm?dir=1&size=100 ```
-
The Canonical URL updates were supposed to have been implement some weeks back.
I have asked why there are 2 webmaster tools codes, I expect this is my account plus they have one to monitor things there end.
Query string parameters have been setup, but I am unsure if they are configured correctly as this is all a bit new to me and i am in there hands to deal with this really.
The URLs without query strings are submitted to Webmaster tools via site maps and they are the URLs we want indexed.
-
Can you please share the URL and some example pages where the problem of duplicate content is appearing?
-
Hi Fraser,
Are you talking about towelsrus.co.uk ? I didn't find any canonical tag in any source page of your website. Are they sure about implementation ? or they will implement it in future. And one more interesting point, why there are two webmaster code in your website's source page. Below are those to webmaster codes:
<meta name="<a class="attribute-value">google-site-verification</a>" content="<a class="attribute-value">BJ6cDrRRB2iS4fMx2zkZTouKTPTpECs2tw-3OAvIgh4</a>" />
<meta name="<a class="attribute-value">google-site-verification</a>" content="<a class="attribute-value">SjaHRLJh00aeQY9xJ81lorL_07UXcCDFgDFgG8lBqCk</a>" />
Have you blocked querystring parameters in "URL parameters" in Google webmaster
Tools ?
Duplication issue is showing because of below type of URLs:
http://www.towelsrus.co.uk/towels/baby-towels/prodlist_ct493.htm
http://www.towelsrus.co.uk/towels/baby-towels/prodlist_ct493.htm?dir=1&size=100
No canonical tag found on above URLs as well.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How does google treat dynamically generated content on a page?
I'm trying to find information on how google treats dynamically generated content within a webpage? (not dynamic urls) For example I have a list of our top 10 products with short product descriptions and links on our homepage to flow some of the pagerank to those individual product pages. My developer wants to make these top products dynamic to where they switch around daily. Won't this negatively affect my seo and ability to rank for those keywords if they keep switching around or would this help since the content would be updated so frequently?
Intermediate & Advanced SEO | | ntsupply0 -
How to update campaign product on google merchant xml feed?
Hello Guys, I am using google merchant xml which updates on every morning 12 am, now many times we run campaign which starts at 8 am and for that product price differ due to that google reject such products. So my query is, is here anyway to set campaign start and end time in xml? Note - In API we can update product price anytime but in xml it can be done at one time only that is 12 am only. right?
Intermediate & Advanced SEO | | varo0 -
Copying my Facebook content to website considered duplicate content?
I write career advice on Facebook on a daily basis. On my homepage users can see the most recent 4-5 feeds (using FB social media plugin). I am thinking to create a page on my website where visitors can see all my previous FB feeds. Would this be considered duplicate content if I copy paste the info, but if I use a Facebook social media plugin then it is not considered duplicate content? I am working on increasing content on my website and feel incorporating FB feeds would make sense. thank you
Intermediate & Advanced SEO | | knielsen0 -
How to Remove Joomla Canonical and Duplicate Page Content
I've attempted to follow advice from the Q&A section. Currently on the site www.cherrycreekspine.com, I've edited the .htaccess file to help with 301s - all pages redirect to www.cherrycreekspine.com. Secondly, I'd added the canonical statement in the header of the web pages. I have cut the Duplicate Page Content in half ... now I have a remaining 40 pages to fix up. This is my practice site to try and understand what SEOmoz can do for me. I've looked at some of your videos on Youtube ... I feel like I'm scrambling around to the Q&A and the internet to understand this product. I'm reading the beginners guide.... any other resources would be helpful.
Intermediate & Advanced SEO | | deskstudio0 -
Category Content Duplication
Does indexing category archive page for a blog cause duplications? http://www.seomoz.org/blog/setup-wordpress-for-seo-success After reading this article I am unsure.
Intermediate & Advanced SEO | | SEODinosaur0 -
Virtual Domains and Duplicate Content
So I work for an organization that uses virtual domains. Basically, we have all our sites on one domain and then these sites can also be shown at a different URL. Example: sub.agencysite.com/store sub.brandsite.com/store Now the problem comes up often when we move the site to a brand's URL versus hosting the site on our URL, we end up with duplicate content. Now for god knows what damn reason, I currently cannot get my dev team to implement 301's but they will implement 302's. (Dont ask) I also am left with not being able to change the robots.txt file for our site. They say if we allowed people to go in a change this stuff it would be too messy and somebody would accidentally block a site that was not supposed to be blocked on our domain. (We are apparently incapable toddlers) Now I have an old site, sub.agencysite.com/store ranking for my terms while the new site is not showing up. So I am left with this question: If I want to get the new site ranking what is the best methodology? I am thinking of doing a 1:1 mapping of all pages and set up 302 redirects from the old to the new and then making the canonical tags on the old to reflect the new. My only thing here is how will Google actually view this setup? I mean on one hand I am saying
Intermediate & Advanced SEO | | DRSearchEngOpt
"Hey, Googs, this is just a temp thing." and on the other I am saying "Hey, Googs, give all the weight to this page, got it? Graci!" So with my limited abilities, can anybody provide me a best case scenario?0 -
Duplicate Content Issue
Why do URL with .html or index.php at the end are annoying to the search engine? I heard it can create some duplicate content but I have no idea why? Could someone explain me why is that so? Thank you
Intermediate & Advanced SEO | | Ideas-Money-Art0 -
Does a mobile site count as duplicate content?
Are there any specific guidelines that should be followed for setting up a mobile site to ensure it isn't counted as duplicate content?
Intermediate & Advanced SEO | | nicole.healthline0