Capitals in url creates duplicate content?
-
Hey Guys,
I had a quick look around however I couldn't find a specific answer to this.
Currently, the SEOmoz tools come back and show a heap of duplicate content on my site. And there's a fair bit of it.
However, a heap of those errors are relating to random capitals in the urls.
for example.
"www.website.com.au/Home/information/Stuff" is being treated as duplicate content of "www.website.com.au/home/information/stuff" (Note the difference in capitals).
Anyone have any recommendations as to how to fix this server side(keeping in mind it's not practical or possible to fix all of these links) or to tell Google to ignore the capitalisation?
Any help is greatly appreciated.
LM.
-
The IIS url-rewrite addon works great!
-
From my memory Google does treat urls as case sensitive.
Best to keep al urls as lower case.
-
Thanks for your reply Alan!
Bing is irrelevant in Belgium
Maybe marketshare of 0,00005 or so
When I look at the SEOMoz crawling reports I panic, but when I look at GWT, I'm happy... The difference is huge.
So, no sure I will keep on using these reports..
-
I don't know that Google does ignore it. anyhow Bing does not http://perthseocompany.com.au/seo/reports/violation/the-page-contains-multiple-canonical-formats
-
If Google ignores the mixed usage of capitals in URL's, then why is the SEOMoz reporting it? If it is irrelevant, why not leaving it out?? It takes quite some work to filter out the irrelevant stuff!
-
Thanks Semil - The same duplicates are not showing in Google Webmaster Tools, for instance SEOMoz is showing 639 duplicate page content and 646 duplicate page titles. Webmaster tools is 88 and 37 respectively.
Looking into the numbers in SEOmoz again (and they've risen since the original post) there's a huge number which fall under the capitalisation discussed but also some which seem to register as HTTPS and HTTP.
-
Thanks Alan - I'll get on this...
-
Yes its seen as too different urls
http://perthseocompany.com.au/seo/reports/violation/the-page-contains-multiple-canonical-formats
If you are uisng a windows server (IIS), you can fix this easy by using the IIS url-rewrite addon. it had a rewite as lowercase preset
-
Google does count this as duplicate content. Semil is right. You want to have someone do url rewrites on the server side to 301 these to lowercase.
-
Hi LucasM,
Yes its possible by server side that you cant open a url with capital letters if you are using small letters.
But I dont think google will talke capitalisation in consideration.
Is it showing you in Google webmaster tool in duplicate titles and duplicate descriptions ?
If its showing then ask your coder to play with .htaccess to stop opening a url with different small - capital letter combination.
Thanks,
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Duplicate URLs on eCommerce site caused by parameters
Hi there, We have a client with a large eCommerce site with about 1500 duplicate URLs caused by the parameters in the URLs (such as the sort parameter where the list of products are then sorted by price, age etc.) Example: www.example.com/cars/toyota First duplicate URL: www.example.com/cars/toyota?sort=price-ascending Second duplicate URL: www.example.com/cars/toyota?sort=price-descending Third duplicate URL: www.example.com/cars/toyota?sort=age-descending Originally we had advised to add a robots.txt file to block search engines from crawling the URLs with parameters but this hasn't been done. My question: If we add the robots.txt now and exclude all URLs with filters - how long will it take for Google to disregard the duplicate URLs? We could ask the developers to add canonical tags to all the duplicates but these are about 1500... Thanks in advance for any advice!
Intermediate & Advanced SEO | | Gabriele_Layoutweb0 -
301 redirect to avoid duplicate content penalty
I have two websites with identical content. Haya and ethnic Both websites have similar products. I would like to get rid of ethniccode I have already started to de-index ethniccode. My question is, Will I get any SEO benefit or Will it be harmful if I 301 direct the below only URL’s https://www.ethniccode/salwar-kameez -> https://www.hayacreations/collections/salwar-kameez https://www.ethniccode/salwar-kameez/anarkali-suits - > https://www.hayacreations/collections/anarkali-suits
Intermediate & Advanced SEO | | riyaaaz0 -
Duplicate Content through 'Gclid'
Hello, We've had the known problem of duplicate content through the gclid parameter caused by Google Adwords. As per Google's recommendation - we added the canonical tag to every page on our site so when the bot came to each page they would go 'Ah-ha, this is the original page'. We also added the paramter to the URL parameters in Google Wemaster Tools. However, now it seems as though a canonical is automatically been given to these newly created gclid pages; below https://www.google.com.au/search?espv=2&q=site%3Awww.mypetwarehouse.com.au+inurl%3Agclid&oq=site%3A&gs_l=serp.3.0.35i39l2j0i67l4j0i10j0i67j0j0i131.58677.61871.0.63823.11.8.3.0.0.0.208.930.0j3j2.5.0....0...1c.1.64.serp..8.3.419.nUJod6dYZmI Therefore these new pages are now being indexed, causing duplicate content. Does anyone have any idea about what to do in this situation? Thanks, Stephen.
Intermediate & Advanced SEO | | MyPetWarehouse0 -
Magento products and eBay - duplicate content risk?
Hi, We are selling about 1000 sticker products in our online store and would like to expand a large part of our products lineup to eBay as well. There are pretty good modules for this as I've heard. I'm just wondering if there will be duplicate content problems if I sync the products between Magento and eBay and they get uploaded to eBay with identical titles, descriptions and images? What's the workaround in this case? Thanks!
Intermediate & Advanced SEO | | speedbird12290 -
Best to Fix Duplicate Content Issues on Blog If URLs are Set to "No-Index"
Greetings Moz Community: I purchased a SEMrush subscription recently and used it to run a site audit. The audit detected 168 duplicate content issues mostly relating to blog posts tags. I suspect these issues may be due to canonical tags not being set up correctly. My developer claims that since these blog URLs are set to "no-index" these issues do not need to be corrected. My instinct would be to avoid any risk with potential duplicate content. To set up canonicalization correctly. In addition, even if these pages are set to "no-index" they are passing page rank. Further more I don't know why a reputable company like SEMrush would consider these errors if in fact they are not errors. So my question is, do we need to do anything with the error pages if they are already set to "no-index"? Incidentally the site URL is www.nyc-officespace-leader.com. I am attaching a copy of the SEMrush audit. Thanks, Alan BarjWaO SqVXYMy
Intermediate & Advanced SEO | | Kingalan10 -
Ticket Industry E-commerce Duplicate Content Question
Hey everyone, How goes it? I've got a bunch of duplicate content issues flagged in my Moz report and I can't figure out why. We're a ticketing site and the pages that are causing the duplicate content are for events that we no longer offer tickets to, but that we will eventually offer tickets to again. Check these examples out: http://www.charged.fm/mlb-all-star-game-tickets http://www.charged.fm/fiba-world-championship-tickets I realize the content is thin and that these pages basically the same, but I understood that since the Title tags are different that they shouldn't appear to the Goog as duplicate content. Could anyone offer me some insight or solutions to this? Should they be noindexed while the events aren't active? Thanks
Intermediate & Advanced SEO | | keL.A.xT.o1 -
Proper Hosting Setup to Avoid Subfolders & Duplicate Content
I've noticed with hosting multiple websites on a single account you end up having your main site in the root public_html folder, but when you create subfolders for new website it actually creates a duplicate website: eg. http://kohnmeat.com/ is being hosted on laubeau.com's server. So you end up with a duplicate website: http://laubeau.com/kohn/ Anyone know the best way to prevent this from happening? (i.e. canonical? 301? robots.txt?) Also, maybe a specific 'how-to' if you're feeling generous 🙂
Intermediate & Advanced SEO | | ATMOSMarketing560 -
Duplicate Content across 4 domains
I am working on a new project where the client has 5 domains each with identical website content. There is no rel=canonical. There is a great variation in the number of pages in the index for each of the domains (from 1 to 1250). OSE shows a range of linking domains from 1 to 120 for each domain. I will be strongly recommending to the client to focus on one website and 301 everything from the other domains. I would recommend focusing on the domain that has the most pages indexed and the most referring domains but I've noticed the client has started using one of the other domains in their offline promotional activity and it is now their preferred domain. What are your thoughts on this situation? Would it be better to 301 to the client's preferred domain (and lose a level of ranking power throught the 301 reduction factor + wait for other pages to get indexed) or stick with the highest ranking/most linked domain even though it doesn't match the client's preferred domain used for email addresses etc. Or would it better to use cross-domain canoncial tags? Thanks
Intermediate & Advanced SEO | | bjalc20110