Do capitals in a URL create duplicate content?
-
Hey Guys,
I had a quick look around however I couldn't find a specific answer to this.
Currently, the SEOmoz tools show a heap of duplicate content on my site.
However, a lot of those errors relate to random capitals in the URLs.
For example:
"www.website.com.au/Home/information/Stuff" is being treated as duplicate content of "www.website.com.au/home/information/stuff" (note the difference in capitals).
Does anyone have any recommendations on how to fix this server side (keeping in mind it's not practical or possible to fix all of these links), or how to tell Google to ignore the capitalisation?
Any help is greatly appreciated.
LM.
-
The IIS URL Rewrite add-on works great!
-
From memory, Google does treat URLs as case sensitive.
Best to keep all URLs lowercase.
-
Thanks for your reply Alan!
Bing is irrelevant in Belgium, with a market share of maybe 0.00005 or so.
When I look at the SEOmoz crawl reports I panic, but when I look at GWT, I'm happy... The difference is huge.
So I'm not sure I will keep on using these reports.
-
I don't know that Google does ignore it. Anyhow, Bing does not: http://perthseocompany.com.au/seo/reports/violation/the-page-contains-multiple-canonical-formats
-
If Google ignores the mixed usage of capitals in URLs, then why is SEOmoz reporting it? If it is irrelevant, why not leave it out? It takes quite some work to filter out the irrelevant stuff!
-
Thanks Semil - The same duplicates are not showing in Google Webmaster Tools. For instance, SEOmoz is showing 639 cases of duplicate page content and 646 duplicate page titles; Webmaster Tools shows 88 and 37 respectively.
Looking into the numbers in SEOmoz again (and they've risen since the original post), a huge number fall under the capitalisation issue discussed, but some also seem to be the same page registered under both HTTPS and HTTP.
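To see how much of a crawl report comes down to capitalisation or HTTP/HTTPS variants, one rough approach is to group the reported URLs by a normalised key. This is only a sketch: `group_variants` is a made-up helper name, and it assumes you have exported the flagged URLs as a plain list of strings.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_variants(urls):
    """Group URLs that differ only by scheme or by letter case in host/path.

    Hypothetical helper for illustration; the query string is kept as-is
    because parameter values can be case sensitive.
    """
    groups = defaultdict(set)
    for url in urls:
        parts = urlsplit(url)
        # Ignore the scheme and compare host and path case-insensitively.
        key = (parts.netloc.lower(), parts.path.lower(), parts.query)
        groups[key].add(url)
    # Keep only keys that were reached through more than one distinct URL.
    return {key: sorted(variants) for key, variants in groups.items() if len(variants) > 1}


if __name__ == "__main__":
    reported = [
        "http://www.website.com.au/Home/information/Stuff",
        "http://www.website.com.au/home/information/stuff",
        "https://www.website.com.au/home/information/stuff",
        "http://www.website.com.au/about",
    ]
    for key, variants in group_variants(reported).items():
        print(key, variants)
```

Any group with more than one entry is a page that is reachable under several URLs, which is the kind of thing a crawler would flag as duplicate content.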
-
Thanks Alan - I'll get on this...
-
Yes, it's seen as two different URLs.
http://perthseocompany.com.au/seo/reports/violation/the-page-contains-multiple-canonical-formats
If you are using a Windows server (IIS), you can fix this easily with the IIS URL Rewrite add-on. It has a preset rule that rewrites URLs as lowercase.
-
Google does count this as duplicate content. Semil is right. You want to have someone set up URL rewrites on the server side that 301 these to lowercase.
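For anyone wondering what such a rule boils down to, here is a minimal sketch of the logic in Python. The function name `lowercase_redirect_target` is made up for illustration, and it assumes only the path should be lowercased (the query string is left alone, since parameter values may be case sensitive). A real setup would express the same thing as a server rewrite rule rather than application code.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def lowercase_redirect_target(url: str) -> Optional[str]:
    """Return the URL to 301-redirect to, or None if no redirect is needed.

    Hypothetical helper: lowercases only the path component.
    """
    parts = urlsplit(url)
    lowered_path = parts.path.lower()
    if lowered_path == parts.path:
        # Already all lowercase, so serve the page normally.
        return None
    return urlunsplit(
        (parts.scheme, parts.netloc, lowered_path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    target = lowercase_redirect_target(
        "http://www.website.com.au/Home/information/Stuff"
    )
    if target is not None:
        # A server would answer with "301 Moved Permanently" and this Location.
        print("301 ->", target)
```

Returning None for URLs that are already lowercase matters, because redirecting a URL to itself would cause a redirect loop.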
-
Hi LucasM,
Yes, it's possible to configure the server so that a URL with capital letters can't be opened if your URLs use small letters.
But I don't think Google will take capitalisation into consideration.
Is it showing up in Google Webmaster Tools under duplicate titles and duplicate descriptions?
If it is, ask your coder to use .htaccess to stop URLs from opening with different combinations of small and capital letters.
Thanks,