Why is Google reporting a big increase in duplicate content after a canonicalization update?
-
Our web hosting company recently applied an update to our site that should have canonicalized our URLs. Webmaster Tools had been reporting duplicate content on pages that had a query string on the end.
Since the update there has been a massive jump: Webmaster Tools now reports over 800 pages of duplicate content, up from about 100 before the update, and it is reporting some very odd pages (see attached image).
They claim they have implemented canonicalization in line with Google Panda and Penguin, but surely something is not right here, and it's going to cause us a big problem with traffic.
Can anyone shed any light on the situation?
-
Hi All,
I finally got to the bottom of the problem: they have not applied canonicalization across the whole site, only to certain pages, which is not what I understood they had done when they implemented the update a few weeks back.
They are now preparing a hotfix, as part of a service pack for our site, that will rectify this and apply canonicalization to all pages that contain query strings. This should clear the problem up once and for all.
Thank you both for your input, a great help.
-
Hi Deb, there is a good post on the SEOmoz blog, written by Lindsey, that explains this very well:
http://www.seomoz.org/blog/serious-robotstxt-misuse-high-impact-solutions
In the post, check the digg.com example. Digg.com has blocked "submit" in robots.txt, but Google has still indexed the URLs (see the screenshot in the blog post). Hope this helps.
-
Those URLs will be crawled by Google, but will not be indexed. That being said, there will be no more duplicate content issue. I hope I have made myself clear.
-
Deb, even if you block those URLs in robots.txt, Google is still going to index them, because they are linked from within the website. The best way is to add a canonical tag, so that you also keep the internal linking benefit.
-
Fraser,
They have not implemented canonicalization on your website so far. Even after canonicalization is implemented you will still see duplication errors in your Webmaster Tools account, but they will not harm your ranking, because canonicalization helps Google choose which of several versions of a similar page should be displayed in the SERP. In the example above, the first URL is the original and the second has parameters added, so your preferred version should be the first one. After canonicalization is implemented properly, you should only see the URLs that you have submitted in your sitemap via Google Webmaster Tools.
As for the two webmaster codes, I don't think you need to set up two separate accounts; you can give them view or admin access from your own Webmaster Tools account.
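To make the canonicalization idea above concrete, here is a minimal Python sketch of how a page template might derive the preferred URL and emit the canonical tag. It assumes that every query string parameter on these product list pages (such as `dir` and `size`) only changes sorting or page size and can safely be dropped. The function names `canonical_url` and `canonical_link_tag` are hypothetical and are not taken from the hosting company's platform.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Return the URL with its query string and fragment removed.

    Assumption: no query parameter changes the page's main content.
    """
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def canonical_link_tag(url: str) -> str:
    """Build the <link rel="canonical"> element to place in the page <head>."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}" />'


if __name__ == "__main__":
    variant = (
        "http://www.towelsrus.co.uk/towels/baby-towels/"
        "prodlist_ct493.htm?dir=1&size=100"
    )
    print(canonical_link_tag(variant))
```

Both the clean URL and the parameterised variant would then carry the same tag, which is what lets Google consolidate them onto the first URL.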
-
Either you will have to block these pages in Google Webmaster Tools using the URL Parameters setting, or you need to block them via the robots.txt file like this:
To block this URL: http://www.towelsrus.co.uk/towels/baby-towels/prodlist_ct493.htm?dir=1&size=100
You need to use this rule in the robots.txt file: Disallow: /*.htm?dir=
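A rule of this kind only matches if it includes Google's `*` wildcard (`Disallow: /*.htm?dir=`), because a robots.txt pattern without a wildcard is compared from the start of the URL path. The Python sketch below is a simplified illustration of that matching, not Google's actual parser: it handles only `*` and a trailing `$`, and `rule_matches` is a hypothetical helper name.

```python
import re
from urllib.parse import urlsplit


def rule_matches(pattern: str, url: str) -> bool:
    """Check a URL against one Disallow pattern with Google-style wildcards.

    '*' matches any run of characters and a trailing '$' anchors the end.
    Without '$' the pattern is a prefix match on the path plus query string.
    """
    parts = urlsplit(url)
    target = parts.path + ("?" + parts.query if parts.query else "")
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape the literal pieces and join them with '.*' for each wildcard
    regex = ".*".join(re.escape(piece) for piece in body.split("*"))
    if anchored:
        regex += "$"
    return re.match(regex, target) is not None


if __name__ == "__main__":
    base = "http://www.towelsrus.co.uk/towels/baby-towels/prodlist_ct493.htm"
    variant = base + "?dir=1&size=100"
    print(rule_matches("/*.htm?dir=", variant))  # wildcard rule, parameterised URL
    print(rule_matches("/*.htm?dir=", base))     # wildcard rule, clean URL
    print(rule_matches("/.htm?dir=", variant))   # same rule without the wildcard
```

Only the first call matches: the clean URL has no `?dir=` part, and the rule without the wildcard would need the path itself to begin with `/.htm?dir=`.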
-
Hi,
Here are a couple of examples for you.
The duplication issue is showing because of URLs like these:
http://www.towelsrus.co.uk/towels/baby-towels/prodlist_ct493.htm
http://www.towelsrus.co.uk/towels/baby-towels/prodlist_ct493.htm?dir=1&size=100
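One rough way to see how many reported URLs are just query string variants of the same page is to group them by their query-less form. The Python sketch below assumes you have the reported URLs as a plain list (for example, copied out of the Webmaster Tools report); `group_duplicates` is a hypothetical helper name.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def group_duplicates(urls):
    """Map each query-less URL to the reported variants that share it."""
    groups = defaultdict(list)
    for url in urls:
        parts = urlsplit(url)
        base = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
        groups[base].append(url)
    # Keep only the pages that were reported under more than one URL
    return {base: found for base, found in groups.items() if len(found) > 1}


if __name__ == "__main__":
    reported = [
        "http://www.towelsrus.co.uk/towels/baby-towels/prodlist_ct493.htm",
        "http://www.towelsrus.co.uk/towels/baby-towels/prodlist_ct493.htm?dir=1&size=100",
    ]
    for base, found in group_duplicates(reported).items():
        print(base, len(found))
```

Any reported URL that does not fall into one of these groups is a different kind of duplicate and would need a separate look.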
-
The canonical URL updates were supposed to have been implemented some weeks back.
I have asked why there are two Webmaster Tools codes. I expect one is for my account and the other is theirs, so they can monitor things at their end.
Query string parameters have been set up, but I am unsure whether they are configured correctly, as this is all a bit new to me and I am in their hands to deal with this really.
The URLs without query strings are submitted to Webmaster Tools via sitemaps, and they are the URLs we want indexed.
-
Can you please share the URL and some example pages where the problem of duplicate content is appearing?
-
Hi Fraser,
Are you talking about towelsrus.co.uk? I didn't find a canonical tag in the source of any page on your website. Are they sure about the implementation, or will they implement it in the future? One more interesting point: why are there two webmaster verification codes in your website's page source? Here are the two codes:
<meta name="google-site-verification" content="BJ6cDrRRB2iS4fMx2zkZTouKTPTpECs2tw-3OAvIgh4" />
<meta name="google-site-verification" content="SjaHRLJh00aeQY9xJ81lorL_07UXcCDFgDFgG8lBqCk" />
Have you blocked query string parameters under "URL Parameters" in Google Webmaster Tools?
The duplication issue is showing because of URLs like these:
http://www.towelsrus.co.uk/towels/baby-towels/prodlist_ct493.htm
http://www.towelsrus.co.uk/towels/baby-towels/prodlist_ct493.htm?dir=1&size=100
No canonical tag was found on the above URLs either.
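A check like the one described above can be sketched in Python with the standard library's HTML parser. The sample markup below is made up for illustration and is not the site's real source; `CanonicalFinder` and `find_canonicals` are hypothetical names. In practice you would pass in the downloaded source of each page you want to verify.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def find_canonicals(html_text: str) -> list:
    """Return all canonical hrefs found in the given HTML source."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonicals


if __name__ == "__main__":
    # Hypothetical page source, for illustration only
    with_tag = (
        '<html><head><link rel="canonical" '
        'href="http://www.towelsrus.co.uk/towels/baby-towels/prodlist_ct493.htm" />'
        "</head><body></body></html>"
    )
    without_tag = "<html><head><title>No tag</title></head><body></body></html>"
    print(find_canonicals(with_tag))
    print(find_canonicals(without_tag))
```

An empty list for a page means no canonical tag is present, which is the situation reported for both example URLs.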