Caps in URL creating duplicate content
-
I'm getting a bunch of duplicate content errors where the crawl says
www.url.com/abc has a duplicate at www.url.com/ABC
The content is in Magento and the URL settings are lowercase, and I can't figure out why it thinks there is duplicate content. These are pages with a decent number of inbound links.
-
I checked, and it is a Magento feature to rewrite caps to lower case.
I added this to htaccess anyway:
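<code>
# RewriteMap is only valid in the server or virtual host config, not in .htaccess
RewriteMap lc int:tolower
# Redirect any path containing an uppercase letter to its lowercase form
RewriteCond %{REQUEST_URI} [A-Z]
RewriteRule (.*) ${lc:$1} [R=301,L]
</code>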
One last question before I take this to a Magento forum: how can I look at a page with a caps URL and a lowercase URL and see whether they are really different pages or resolve to the same address?
When you change random letters to caps on our site it sends you to the right page, but my browser still shows the mixed-caps URL instead of replacing it with an all-lowercase URL. Is that really a different page, or is the browser just not updating the displayed URL when it is really getting the lowercase page?
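A browser's address bar shows the URL it ended up on after following any redirects, so a mixed-caps URL that stays in the bar means no redirect happened for that request. One way to make the check explicit is to fetch both URLs without following redirects and compare what comes back. The sketch below only covers the comparison step; the `Response` class and `compare_case_variants` function are hypothetical names introduced for illustration, and fetching the responses is left to whatever tool you prefer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Response:
    """The parts of an HTTP response that matter for this check."""
    status: int
    location: Optional[str]  # value of the Location header, if any
    body: bytes


def compare_case_variants(mixed: Response, lower: Response) -> str:
    """Classify how a mixed-case URL relates to its lowercase twin."""
    if mixed.status in (301, 308):
        return "permanent redirect to " + str(mixed.location)
    if mixed.status in (302, 303, 307):
        return "temporary redirect to " + str(mixed.location)
    if mixed.status == 404:
        return "not found"
    if mixed.status == 200 and lower.status == 200:
        # Both URLs answer directly; identical bodies mean duplicate content.
        if mixed.body == lower.body:
            return "duplicate: both URLs serve the same content"
        return "different pages"
    return "unclear: status %d vs %d" % (mixed.status, lower.status)
```

If the mixed-case URL answers with status 200 and the same body as the lowercase URL, a crawler has two addresses for one page, which is what the duplicate content report describes. Running `curl -I` against each URL prints the status line and any `Location` header, which is enough to fill in the first two fields by hand.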
-
Hi John,
I checked the URL you sent me. You do have duplicate pages:
http://www.madebysurvivors.com/destiny
http://www.madebysurvivors.com/DESTINY
Both work and return the same page.
I also tried clicking on other links on your site and then changing a few letters to upper case, something like this:
http://www.madebysurvivors.com/LEArn-human-trafficking-slavery
and it returns the same page.
From what I can tell, it's one of the features in Magento that is making this possible. I would go into the settings and disable the setting that forces Magento to use lower case.
Then test it to make sure that you DO get a 404 page if you change the letter case on any of your links. Once you have confirmed that you get a 404 page, set the page URLs themselves to lower case.
I'm not familiar with Magento, so I'm not sure whether it has that option, but many CMS and ecommerce platforms have a field where you can specify the URL for a page. I would change that field to all lower case.
Test it again. If it works, there is one more step if you want to keep the link juice from the pages that had the uppercase URL.
You need to duplicate your pages, making sure that each duplicate keeps the same URL as before (in all CAPS), and then 301 redirect it to the new page, which is in lower case.
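The redirect step above boils down to one decision per request: if the requested path contains capitals, answer with a 301 to the lowercase path; otherwise serve the page. Here is a minimal sketch of that decision. The function name `lowercase_redirect` is hypothetical and this is not Magento code. It assumes only the path is passed in, since query strings can be case-sensitive and are better left untouched.

```python
from typing import Optional, Tuple


def lowercase_redirect(path: str) -> Optional[Tuple[int, str]]:
    """Return (301, lowercase_path) if the path contains capitals, else None."""
    lowered = path.lower()
    if lowered == path:
        return None  # already lowercase: serve the page, no redirect
    return (301, lowered)
```

For example, `lowercase_redirect("/DESTINY")` returns `(301, "/destiny")`, while `lowercase_redirect("/destiny")` returns `None`, so the lowercase page is never redirected to itself.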
Hope this helps and makes sense.
-
This is intended functionality in Magento. It's supposed to help the user experience, as a user can navigate to a page even if they aren't sure of the casing of the words.
Of course, that's bad for SEO. You'll need to implement canonicalization. Here's a free extension by Yoast:
http://www.magentocommerce.com/magento-connect/canonical-url-for-magento.html
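For context, canonicalization here means that every case variant of a page emits the same `<link rel="canonical">` tag pointing at one preferred URL. The sketch below shows the idea with a hypothetical `canonical_tag` helper; it is not how the extension is implemented, and it assumes the lowercase URL is the preferred one.

```python
from html import escape


def canonical_tag(scheme: str, host: str, path: str) -> str:
    """Build a canonical link tag that points at the lowercase URL."""
    url = f"{scheme}://{host.lower()}{path.lower()}"
    # Escape the URL so it is safe inside an HTML attribute.
    return f'<link rel="canonical" href="{escape(url, quote=True)}" />'
```

Whether the page is requested as `/DESTINY` or `/destiny`, the tag comes out the same, which tells search engines to treat both as one page.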
Cheers.
Update: seeing your response, your solution of putting in individual redirects wouldn't be practical. You'd have to cover all combinations of caps/non-caps, and well, that's more work than you should want :). As for why this happens, the uppercase characters are being lowercased when checking whether something in the database matches the URL. Again, this is intended functionality.
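To put a number on "all combinations": each letter in a path can be upper or lower case, so a path with n letters has 2^n spellings. A small sketch, with hypothetical helper names, that counts and lists them:

```python
from itertools import product


def variant_count(path: str) -> int:
    """Number of upper/lower spellings: 2 to the power of the letter count."""
    return 2 ** sum(1 for c in path if c.isalpha())


def case_variants(path: str) -> list:
    """List every upper/lower spelling of a path (only sensible for short paths)."""
    choices = [(c.lower(), c.upper()) if c.isalpha() else (c,) for c in path]
    return ["".join(combo) for combo in product(*choices)]
```

`/destiny` has 7 letters and therefore 2**7 = 128 spellings; `/learn-human-trafficking-slavery` has 28 letters, so 2**28 spellings. That is why one rule that lowercases any path, or a canonical tag, is the realistic option rather than one redirect per spelling.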
-
Looks like I do need some more help.
I get a redirect loop if I enter a redirect from
http://www.madebysurvivors.com/DESTINY
to
http://www.madebysurvivors.com/destiny
but I checked and there is no redirect the other way in our database or htaccess.
If I leave the redirect off I get duplicate content, but in the CMS part of Magento there is only one table for this page.
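One possible explanation for the loop, offered as an assumption and not verified against Magento: if the redirect source is matched case-insensitively, in the same way the URL lookup described earlier in this thread lowercases the request, then a rule from /DESTINY to /destiny also matches /destiny itself and redirects it to itself forever. The sketch below simulates both behaviours with a hypothetical `follow` function.

```python
def follow(path, redirects, case_insensitive, max_hops=10):
    """Follow redirect rules from path; return the final path or 'loop'."""
    if case_insensitive:
        # Assumed behaviour: rule sources and requests are lowercased before matching.
        table = {src.lower(): dst for src, dst in redirects.items()}
    else:
        table = dict(redirects)
    for _ in range(max_hops):
        key = path.lower() if case_insensitive else path
        target = table.get(key)
        if target is None:
            return path  # no rule matches: this is where the request ends up
        path = target
    return "loop"


rules = {"/DESTINY": "/destiny"}
```

With `rules` as above, `follow("/DESTINY", rules, case_insensitive=True)` gives `"loop"`, while the case-sensitive call ends at `"/destiny"`. If this is what is happening, the redirect has to be issued somewhere that compares the raw request path case-sensitively.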
-
I actually moved all the content from a Drupal install, so I don't have that many URLs with the problem. It looks like the faster way to do this is just to redirect the caps to lower case, as that's what we use elsewhere.
I dug into the underlying database and can't find any duplicate entries for these pages or odd redirects, so I have no idea of the cause.
For some of the pages I think you are right that Magento is moving caps down to lower, but there are a few others where it is lower to caps, though it was caps in the Drupal site.
Anyway, good to know Google sees them differently, so I'll put in redirects. It's only about 20 pages.
-
Hello John,
If you can provide us with a URL, we might be able to dig in and see what is going on. Without it, it's almost impossible to tell. Also, it doesn't matter whether you have a decent number of inbound links; duplicate content only refers to pages with similar content. I'm not familiar with the Magento platform, so this is just a guess: when you originally created (or imported) pages or categories in Magento, were they lowercased? If not, it's possible that Magento added them in all CAPS and is now forcing them to lower case, which would give you duplicates. Once again, this is just a guess, and without a URL to your site I doubt that anyone will be able to help you further.
-
www.url.com/abc and www.url.com/ABC are two completely different pages according to Google.
I would redirect any and all pages with capitals to the corresponding lowercase URLs.
Don't worry about the link juice, as it will pass over via the redirect. It will also be much better than having 2 identical pages competing with each other (according to Google).
Greg