Caps in URL creating duplicate content
-
I'm getting a bunch of duplicate content errors where the crawl is saying
www.url.com/abc has a duplicate at www.url.com/ABC
The content is in Magento and the URL settings are lowercase, and I can't figure out why it thinks there is duplicate content. These are pages with a decent number of inbound links.
-
I checked and it is a Magento feature to rewrite caps to lower case.
I added this to .htaccess anyway
<code>
# Note: RewriteMap is only allowed in server or virtual-host config, not in .htaccess
RewriteMap lc int:tolower
RewriteCond %{REQUEST_URI} [A-Z]
RewriteRule (.*) ${lc:$1} [R=301,L]
</code>
One last question before I take this to a Magento forum: how can I look at a page with a caps URL and a lowercase URL and see whether they are really different pages or point to the same address?
When you change random letters to caps on our site, it sends you to the right page, but my browser still shows the mixed-caps URL instead of replacing it with an all-lowercase URL. Is that really a different page, or is the browser just not changing the displayed URL when it is really getting the lowercase page?
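One way to check this without relying on the browser's address bar is to request each URL without following redirects and look at the status code. A minimal Python sketch using only the standard library; `fetch_status` and `describe` are hypothetical helper names made up for this example, not part of Magento or any crawl tool:

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url, timeout=10):
    """Request url WITHOUT following redirects.

    Returns (status_code, Location header or None).
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    try:
        conn.request("GET", target)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def describe(status, location):
    """Turn a status code and Location header into a readable verdict."""
    if status in (301, 308):
        return f"permanent redirect to {location}"
    if status in (302, 303, 307):
        return f"temporary redirect to {location}"
    if status == 200:
        return "served directly (no redirect)"
    return f"status {status}"
```

Calling `describe(*fetch_status(url))` for both spellings shows what the server actually does. If the caps version reports a 200 rather than a 301 pointing at the lowercase address, the server is answering at both addresses, and a crawler will count them as two URLs.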
-
Hi John,
I checked the URL you sent me. You do have duplicate pages:
http://www.madebysurvivors.com/destiny
http://www.madebysurvivors.com/DESTINY
Both work and return the same page.
I also tried clicking on other links on your site and then changing a few letters to upper case, like this:
http://www.madebysurvivors.com/LEArn-human-trafficking-slavery
and it returns the same page.
From what I can tell, it's one of the features in Magento that is making this possible. I would go into the settings and disable the setting that forces Magento to use lower case.
Then test it to make sure that you DO get a 404 page if you change the letter case on any of your links.
Once you have tested it and do get a 404 page, set the URLs themselves to lower case. I'm not familiar with Magento, so I'm not sure whether it has that option, but many CMS and ecommerce platforms have a field where you can specify the URL for a page; I would change that field to all lower case.
Test it again. If it works, there is one more step you have to do if you want to keep the same juice from the pages that had the uppercase URL.
You need to duplicate your pages, making sure that the URL is the same as it was before (in all CAPS), and then do a 301 redirect from each one to the new page, which is in lower case.
Hope this helps and makes sense.
-
This is intended functionality in Magento. It's supposed to help the user experience, as a user can navigate to a page even if they aren't sure of the casing of the words.
Of course that's bad for SEO. You'll need to put in the concept of canonicalization. Here's a free extension by Yoast:
http://www.magentocommerce.com/magento-connect/canonical-url-for-magento.html
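For reference, a canonical tag is a single `<link>` element in the page head that names the preferred URL. A rough Python sketch of what such a tag could look like, assuming the lowercase-path URL is the preferred one; the helper name `canonical_tag` is made up here, and this is not the extension's actual code:

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_tag(requested_url):
    """Build a rel=canonical tag pointing at the lowercase-path form of a URL."""
    parts = urlsplit(requested_url)
    # Assumption: only the path is case-normalised; query and fragment are kept as-is.
    preferred = urlunsplit(
        (parts.scheme, parts.netloc, parts.path.lower(), parts.query, parts.fragment)
    )
    return f'<link rel="canonical" href="{escape(preferred, quote=True)}" />'


tag = canonical_tag("http://www.madebysurvivors.com/DESTINY")
print(tag)
```

Whatever spelling the visitor typed, every variant of the page would then carry the same tag, which tells search engines which address to index.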
Cheers.
Update: seeing your response, your solution of putting in redirects wouldn't be possible. You'd have to cover all combinations of caps/non-caps, and well, that's more work than you should want :). As for why this happens, the uppercase character is being lowercased when checking if something in the database matches the URL. Again, this is intended functionality.
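To put a number on "all combinations": every letter in the path can be upper or lower case, so a slug with n letters has 2^n spellings. A quick Python sketch (the function name `case_variants` is just for illustration):

```python
from itertools import product


def case_variants(slug):
    """Return every upper/lower-case spelling of slug."""
    choices = [
        # Characters with distinct cases contribute two options, others one.
        (ch.lower(), ch.upper()) if ch.lower() != ch.upper() else (ch,)
        for ch in slug
    ]
    return ["".join(combo) for combo in product(*choices)]


variants = case_variants("destiny")
print(len(variants))  # 2 ** 7 = 128
```

So a seven-letter slug alone would need 127 individual redirects to cover every non-lowercase spelling, which is why one redirect per spelling does not scale.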
-
Looks like I do need some more help.
I get a redirect loop if I enter a redirect from
http://www.madebysurvivors.com/DESTINY
to
http://www.madebysurvivors.com/destiny
but I checked and there is no redirect the other way in our database or .htaccess.
If I leave the redirect off I get duplicate content, but in the CMS parts of Magento there is only one table for this page.
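One possible explanation for the loop, offered as a guess rather than a diagnosis: if the platform lowercases the requested URL before looking up redirects, as described in the earlier reply, then a rule for /DESTINY also matches /destiny, and the target redirects to itself. The Python sketch below only simulates that idea; `follow` and the `rules` table are hypothetical and do not reflect Magento's real redirect code:

```python
def follow(path, redirects, case_insensitive, max_hops=5):
    """Follow redirects starting at path; return the list of paths visited."""
    if case_insensitive:
        # Assumed behaviour: rule keys and requests are compared in lower case.
        table = {source.lower(): target for source, target in redirects.items()}
    else:
        table = dict(redirects)
    visited = [path]
    for _ in range(max_hops):
        key = path.lower() if case_insensitive else path
        if key not in table:
            break
        path = table[key]
        visited.append(path)
    return visited


rules = {"/DESTINY": "/destiny"}
exact = follow("/DESTINY", rules, case_insensitive=False)
loose = follow("/DESTINY", rules, case_insensitive=True)
print(exact)  # stops after one hop
print(loose)  # keeps redirecting until max_hops is reached
```

With exact matching the chain ends at /destiny. With case-insensitive matching the request for /destiny hits the same rule again, which is the pattern a browser reports as a redirect loop.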
-
I actually moved all the content from a Drupal install, so I don't have that many URLs with the problem. It looks like the faster way to do this is just to redirect the caps to lower case, as that's what we use elsewhere.
I dug into the underlying database and can't find any duplicate entries for these pages or odd redirects, so I have no idea of the cause.
For some of the pages I think you are right that Magento is moving caps down to lower, but there are a few others where it is lower to caps, though those were caps in the Drupal site.
Anyway, good to know Google sees them differently, so I'll put in redirects. It's only about 20 pages.
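For a short list like this, the old-to-new pairs can be generated rather than typed by hand. A minimal Python sketch, assuming the list of affected URLs has been exported from the crawl report; `lowercase_path` and `redirect_pairs` are illustrative names only:

```python
from urllib.parse import urlsplit, urlunsplit


def lowercase_path(url):
    """Return url with only its path component lowercased."""
    parts = urlsplit(url)
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path.lower(), parts.query, parts.fragment)
    )


def redirect_pairs(urls):
    """Return (old, new) pairs for every URL whose path contains capitals."""
    return [(url, lowercase_path(url)) for url in urls if lowercase_path(url) != url]


crawled = [
    "http://www.madebysurvivors.com/DESTINY",
    "http://www.madebysurvivors.com/destiny",
]
pairs = redirect_pairs(crawled)
for old, new in pairs:
    print(old, "->", new)
```

URLs that are already lowercase are skipped, so the output is exactly the set of 301 redirects to enter.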
-
Hello John,
If you can provide us with a URL, we might be able to dig in and see what is going on. Without it, it's almost impossible to tell. Also, it doesn't matter whether you have a decent number of inbound links; duplicate content only refers to pages with similar content. I'm not familiar with the Magento platform, so this is just a guess: when you originally created (or imported) pages or categories in Magento, were they lowercased? If not, it's possible that Magento added them in all caps and is now forcing them to lower case, which would give you duplicates. Once again, this is just a guess, and without a URL to your site I doubt that anyone will be able to help you further.
-
www.url.com/abc and www.url.com/ABC are two completely different pages according to Google.
I would redirect any and all pages with capitals to the corresponding lowercase URLs.
Don't worry about the link juice, as it will pass over via the redirect. It will also be much better than having 2 identical pages competing with each other (according to Google).
Greg