URL Capitalization Inconsistencies Registering Duplicate Content Crawl Errors
-
Hello,
I have a very large website that has a good amount of "Duplicate Content" issues according to MOZ. In reality though, it is not a problem with duplicate content, but rather a problem with URLs. For example: http://acme.com/product/features and http://acme.com/Product/Features both land on the same page, but MOZ is seeing them as separate pages, therefor assuming they are duplicates.
We have recently implemented a solution to automatically de-captialize all characters in the URL, so when you type acme.com/Products, the URL will automatically change to acme.com/products – but MOZ continues to flag multiple "Duplicate Content" issues. I noticed that many of the links on the website still have the uppercase letters in the URL even though when clicked, the URL changes to all lower case. Could this be causing the issue?
What is the best way to remove the "Duplicate Content" issues that are not actually duplicate content?
-
http://moz.com/learn/seo/canonicalization
"Another option for dealing with duplicate content is to utilize the rel=canonical tag. The rel=canonical tag passes the same amount of link juice (ranking power) as a 301 redirect, and often takes much less development time to implement."
-
If you check Google Analytics, GA is probably seeing it too. We had a similar problem. Canonicalization will help with duplicate content, but it won't help with rankings. Internally, you are sending link juice to multiple versions of the same page. In addition, you could have backlinks pointing at multiple duplicate pages, and splitting the link love.
Canonicalization does not transfer link juice the way a 301 Redirect does. All the canonical tag does is tell Google "Rank This Page". If you don't care about rankings the canonical is fine. If you do care, you need to 301 all of your pages to the lower case version.
If you decide to 301, first, build an HTML sitemap with all of the uppercase URLs. After you do the 301, have Google fetch the sitemap and submit it, This will help Googlebot wind all of the pages that were 301ed.
-
Hey man. If your store is a Magento store, there are settings for adding canonicalization tags to categories and products under:
System => Configuration => Catalog => Search Engine Optimization
h/t to Yoast for reminding me of the string to get there.
I am optimizing a Magento store that had a similar issue after a relaunch. Found that to be a very easy way to fix it.
Hope this helps.
-
I have a complex CMS for my main website; I just asked my developer to do it. On my Wordpress sites, I use an SEO plugin for this (Yoast).
-
Thank you so much Linda.
Do you know of a fast way to add a rel=canonical tag to all the pages, the website is quite large, and it would likely take months over months to do it manually.
-
Hi,
There is the same question here on moz: Duplicate Content and URL Capitalization
-
The best way to fix this is with a rel=canonical URL. Tag each page with the lower-case version. (I had this same problem.)
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Duplicate content created by website Calendar - A Penalty?
A colleague of mine asked me a question about duplicate content coming from their event calendar. I don't think this will affect them negatively, but I would love some feedback and thoughts. ThanksOne of my clients, LifeTech Academy, is using my RavenTools software. Raventools has reported a HUGE amount of duplicate content (4.4K instances).The duplicate content all revolves around their calendar and repeating events (http://lifetechacademy.org/events/)The question is this - will this impact their SEO efforts in a negative way?
Intermediate & Advanced SEO | | Bill_K0 -
Duplicate Content: Is a product feed/page rolled out across subdomains deemed duplicate content?
A company has a TLD (top-level-domain) which every single product: company.com/product/name.html The company also has subdomains (tailored to a range of products) which lists a choosen selection of the products from the TLD - sort of like a feed: subdomain.company.com/product/name.html The content on the TLD & subdomain product page are exactly the same and cannot be changed - CSS and HTML is slightly differant but the content (text and images) is exactly the same! My concern (and rightly so) is that Google will deem this to be duplicate content, therfore I'm going to have to add a rel cannonical tag into the header of all subdomain pages, pointing to the original product page on the TLD. Does this sound like the correct thing to do? Or is there a better solution? Moving on, not only are products fed onto subdomain, there are a handfull of other domains which list the products - again, the content (text and images) is exactly the same: other.com/product/name.html Would I be best placed to add a rel cannonical tag into the header of the product pages on other domains, pointing to the original product page on the actual TLD? Does rel cannonical work across domains? Would the product pages with a rel cannonical tag in the header still rank? Let me know if there is a better solution all-round!
Intermediate & Advanced SEO | | iam-sold0 -
Duplicate Content for Deep Pages
Hey guys, For deep, deep pages on a website, does duplicate content matter? The pages I'm talk about are image pages associated with products and will never rank in Google which doesn't concern me. What I'm interested to know though is whether the duplicate content would have an overall effect on the site as a whole? Thanks in advance Paul
Intermediate & Advanced SEO | | kevinliao1 -
Duplicate page content query
Hi forum, For some reason I have recently received a large increase in my Duplicate Page Content issues. Currently it says I have over 7,000 duplicate page content errors! For example it says: Sample URLs with this Duplicate Page Content http://dikelli.com.au/accessories/gowns/news.html http://dikelli.com.au/accessories/news.html
Intermediate & Advanced SEO | | sterls
http://dikelli.com.au/gallery/dikelli/gowns/gowns/sale_gowns.html However there are no physical links to any of these page on my site and even when I look at my FTP files (I am using Dreamweaver) these directories and files do not exist. Can anyone please tell me why the SEOMOZ crawl is coming up with these errors and how to solve them?0 -
Why is Google Reporting big increase in duplicate content after Canonicalization update?
Our web hosting company recently applied a update to our site that should have rectified Canonicalized URLs. Webmaster tools had been reporting duplicate content on pages that had a query string on the end. After the update there has been a massive jump in Webmaster tools reporting now over 800 pages of duplicate content, Up from about 100 prior to the update plus it reporting some very odd pages (see attached image) They claim they have implement Canonicalization in line with Google Panda & Penguin, but surely something is not right here and it's going to cause us a big problem with traffic. Can anyone shed any light on the situation??? Duplicate%20Content.jpg
Intermediate & Advanced SEO | | Towelsrus0 -
Virtual Domains and Duplicate Content
So I work for an organization that uses virtual domains. Basically, we have all our sites on one domain and then these sites can also be shown at a different URL. Example: sub.agencysite.com/store sub.brandsite.com/store Now the problem comes up often when we move the site to a brand's URL versus hosting the site on our URL, we end up with duplicate content. Now for god knows what damn reason, I currently cannot get my dev team to implement 301's but they will implement 302's. (Dont ask) I also am left with not being able to change the robots.txt file for our site. They say if we allowed people to go in a change this stuff it would be too messy and somebody would accidentally block a site that was not supposed to be blocked on our domain. (We are apparently incapable toddlers) Now I have an old site, sub.agencysite.com/store ranking for my terms while the new site is not showing up. So I am left with this question: If I want to get the new site ranking what is the best methodology? I am thinking of doing a 1:1 mapping of all pages and set up 302 redirects from the old to the new and then making the canonical tags on the old to reflect the new. My only thing here is how will Google actually view this setup? I mean on one hand I am saying
Intermediate & Advanced SEO | | DRSearchEngOpt
"Hey, Googs, this is just a temp thing." and on the other I am saying "Hey, Googs, give all the weight to this page, got it? Graci!" So with my limited abilities, can anybody provide me a best case scenario?0 -
Duplicate content via dynamic URLs where difference is only parameter order?
I have a question about the order of parameters in an URL versus duplicate content issues. The URLs would be identical if the parameter order was the same. E.g.
Intermediate & Advanced SEO | | anthematic
www.example.com/page.php?color=red&size=large&gender=male versus
www.example.com/page.php?gender=male&size=large&color=red How smart is Google at consolidating these, and do these consolidated pages incur any penalty (is their combined “weight” equal to their individual selves)? Does Google really see these two pages as DISTINCT, or does it recognize that they are the same because they have the exact same parameters? Is this worth fixing in or does it have a trivial impact? If we have to fix it and can't change our CMS, should we set a preferred, canonical order for these URLs or 301 redirect from one version to the other? Thanks a million!0 -
SEOMoz mistaking image pages as duplicate content
I'm getting duplicate content errors, but it's for pages with high-res images on them. Each page has a different, high-res image on it. But SEOMoz keeps telling me it's duplicate content, even though the images are different (and named different). Is this something I can ignore or will Google see it the same way too?
Intermediate & Advanced SEO | | JHT0