URL Capitalization Inconsistencies Registering Duplicate Content Crawl Errors
-
Hello,
I have a very large website that has a good amount of "Duplicate Content" issues according to MOZ. In reality though, it is not a problem with duplicate content, but rather a problem with URLs. For example: http://acme.com/product/features and http://acme.com/Product/Features both land on the same page, but MOZ is seeing them as separate pages, therefor assuming they are duplicates.
We have recently implemented a solution to automatically de-captialize all characters in the URL, so when you type acme.com/Products, the URL will automatically change to acme.com/products – but MOZ continues to flag multiple "Duplicate Content" issues. I noticed that many of the links on the website still have the uppercase letters in the URL even though when clicked, the URL changes to all lower case. Could this be causing the issue?
What is the best way to remove the "Duplicate Content" issues that are not actually duplicate content?
-
http://moz.com/learn/seo/canonicalization
"Another option for dealing with duplicate content is to utilize the rel=canonical tag. The rel=canonical tag passes the same amount of link juice (ranking power) as a 301 redirect, and often takes much less development time to implement."
-
If you check Google Analytics, GA is probably seeing it too. We had a similar problem. Canonicalization will help with duplicate content, but it won't help with rankings. Internally, you are sending link juice to multiple versions of the same page. In addition, you could have backlinks pointing at multiple duplicate pages, and splitting the link love.
Canonicalization does not transfer link juice the way a 301 Redirect does. All the canonical tag does is tell Google "Rank This Page". If you don't care about rankings the canonical is fine. If you do care, you need to 301 all of your pages to the lower case version.
If you decide to 301, first, build an HTML sitemap with all of the uppercase URLs. After you do the 301, have Google fetch the sitemap and submit it, This will help Googlebot wind all of the pages that were 301ed.
-
Hey man. If your store is a Magento store, there are settings for adding canonicalization tags to categories and products under:
System => Configuration => Catalog => Search Engine Optimization
h/t to Yoast for reminding me of the string to get there.
I am optimizing a Magento store that had a similar issue after a relaunch. Found that to be a very easy way to fix it.
Hope this helps.
-
I have a complex CMS for my main website; I just asked my developer to do it. On my Wordpress sites, I use an SEO plugin for this (Yoast).
-
Thank you so much Linda.
Do you know of a fast way to add a rel=canonical tag to all the pages, the website is quite large, and it would likely take months over months to do it manually.
-
Hi,
There is the same question here on moz: Duplicate Content and URL Capitalization
-
The best way to fix this is with a rel=canonical URL. Tag each page with the lower-case version. (I had this same problem.)
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How to solve this issue and avoid duplicated content?
My marketing team would like to serve up 3 pages of similar content; www.example.com/one, www.example.com/two and www.example.com/three; however the challenge here is, they'd like to have only one page whith three different titles and images based on the user's entry point (one, two, or three). To avoid duplicated pages, how would suggest this best be handled?
Intermediate & Advanced SEO | | JoelHer0 -
Duplicate page url crawl report
Details: Hello. Looking at the duplicate page url report that comes out of Moz, is the best tactic to a) use 301 redirects, and b) should the url that's flagged for duplicate page content be pointed to the referring url? Not sure where the 301 redirect should be applied... should this url, for example: <colgroup><col width="452"></colgroup>
Intermediate & Advanced SEO | | compassseo
| http://newgreenair.com/website/blog/ | which is listed in the first column of the Duplicate Page Content crawl, be pointed to referring url in the same spreadsheet? Or, what's the best way to apply the 301 redirect? thanks!0 -
Duplicate URL Parameters for Blog Articles
Hi there, I'm working on a site which is using parameter URLs for category pages that list blog articles. The content on these pages constantly change as new posts are frequently added, the category maybe for 'Heath Articles' and list 10 blog posts (snippets from the blog). The URL could appear like so with filtering: www.domain.com/blog/articles/?taxonomy=health-articles&taxon=general www.domain.com/blog/articles/?taxonomy=health-articles&taxon=general&year=2016 www.domain.com/blog/articles/?taxonomy=health-articles&taxon=general&year=2016&page=1 All pages currently have the same Meta title and descriptions due to limitations with the CMS, they are also not in our xml sitemap I don't believe we should be focusing on ranking for these pages as the content on here are from blog posts (which we do want to rank for on the individual post) but there are 3000 duplicates and they need to be fixed. Below are the options we have so far: Canonical URLs Have all parameter pages within the category canonicalize to www.domain.com/blog/articles/?taxonomy=health-articles&taxon=general and generate dynamic page titles (I know its a good idea to use parameter pages in canonical URLs). WMT Parameter tool Tell Google all extra parameter tags belong to the main pages (e.g. www.domain.com/blog/articles/?taxonomy=health-articles&taxon=general&year=2016&page=3 belongs to www.domain.com/blog/articles/?taxonomy=health-articles&taxon=general). Noindex Remove all the blog category pages, I don't know how Google would react if we were to remove 3000 pages from our index (we have roughly 1700 unique pages) We are very limited with what we can do to these pages, if anyone has any feedback suggestions it would be much appreciated. Thanks!
Intermediate & Advanced SEO | | Xtend-Life0 -
Thin Content to Quality Content
How should i modify content from thin to high quality content. Somehow i realized that my pages where targetted keywords didn't had the keyword density lost a massive ranking after the last update whereas all pages which had the keyword density are ranking good. But my concern is all pages which are ranking good had all the keyword in a single statement like. Get ABC pens, ABC pencils, ABC colors, etc. at the end of a 300 word content describing ABC. Whereas the pages which dropped the rankings had a single keyword repeated just twice in a 500 word article. Can this be the reason for a massive drop. Should i add the single statement like the one which is there on pages ranking good? Is it good to add just a single line once the page is indexed or do i need to get a fresh content once again along with a sentence of keyword i mentioned above?
Intermediate & Advanced SEO | | welcomecure1 -
Removing duplicate content
Due to URL changes and parameters on our ecommerce sites, we have a massive amount of duplicate pages indexed by google, sometimes up to 5 duplicate pages with different URLs. 1. We've instituted canonical tags site wide. 2. We are using the parameters function in Webmaster Tools. 3. We are using 301 redirects on all of the obsolete URLs 4. I have had many of the pages fetched so that Google can see and index the 301s and canonicals. 5. I created HTML sitemaps with the duplicate URLs, and had Google fetch and index the sitemap so that the dupes would get crawled and deindexed. None of these seems to be terribly effective. Google is indexing pages with parameters in spite of the parameter (clicksource) being called out in GWT. Pages with obsolete URLs are indexed in spite of them having 301 redirects. Google also appears to be ignoring many of our canonical tags as well, despite the pages being identical. Any ideas on how to clean up the mess?
Intermediate & Advanced SEO | | AMHC0 -
Potential Pagination Issue/ Duplicate content issue
Hi All, We upgraded our framework , relaunched our site with new url structures etc and re did our site map to Google last week. However, it's now come to light that the rel=next, rel=Prev tags we had in place on many of our pages are missing. We are putting them back in now but my worry is , as they were previously missing when we submitted the , will I have duplicate content issues or will it resolve itself , as Google re-crawls the site over time ?.. Any advice would be greatly appreciated? thanks Pete
Intermediate & Advanced SEO | | PeteC120 -
Multiply domains and duplicate content confusion
I've just found out that a client has multiple domains which are being indexed by google and so leading me to worry that they will be penalised for duplicate content. Wondered if anyone could confirm a) are we likely to be penalised? and b) what should we do about it? (i'm thinking just 301 redirect each domain to the main www.clientdomain.com...?). Actual domain = www.clientdomain.com But these also exist: www.hostmastr.clientdomain.com www.pop.clientdomain.com www.subscribers.clientdomain.com www.www2.clientdomain.com www.wwwww.clientdomain.com ps I have NO idea how/why all these domains exist I really appreciate any expertise on this issue, many thanks!
Intermediate & Advanced SEO | | bisibee10 -
Canonical Not Fixing Duplicate Content
I added a canonical tag to the home page last month, but I am still showing duplicate content for the home page. Here is the tag I added: What am I missing? Duplicate-Content.jpg
Intermediate & Advanced SEO | | InnoInsulation0