Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Case Sensitive URLs, Duplicate Content & Link Rel Canonical
-
I have a site where URLs are case sensitive. In some cases the lowercase URL is being indexed and in others the mixed case URL is being indexed. This is leading to duplicate content issues on the site.
The site is using link rel canonical to specify a preferred URL in some cases however there is no consistency whether the URLs are lowercase or mixed case. On some pages the link rel canonical tag points to the lowercase URL, on others it points to the mixed case URL.
Ideally I'd like to update all link rel canonical tags and internal links throughout the site to use the lowercase URL however I'm apprehensive!
My question is as follows:
If I where to specify the lowercase URL across the site in addition to updating internal links to use lowercase URLs, could this have a negative impact where the mixed case URL is the one currently indexed?
Hope this makes sense!
Dave
-
Thanks Hutch and Ryan, great answers and exactly what I need. Appreciate it!
-
In addition to using rel=canonical for the lowercase version of your URLs you should also consider implementing redirection from uppercase to lowercase. A regex expert should be able to write the redirect script you'll need to add to your .htaccess file in order to change upper to lowercase. Cheers!
-
While you might see a small down tick in the beginning as Google indexes and updates, overall you should not suffer any long term negative effects from standardizing your URI structure.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Duplicate Title tags even with rel=canonical
Hello, We were having duplicate content in our blog (a replica of each post automatically was done by the CMS), until we recently implemented a rel=canonical tag to all the duplicate posts (some 5 weeks ago). So far, no duplicate content were been found, but we are still getting duplicate title tags, though the rel=canonical is present. Any idea why is this the case and what can we do to solve it? Thanks in advance for your help. Tej Luchmun
Intermediate & Advanced SEO | | luxresorts0 -
How to deal with URLs and tabbed content
Hi All, We're currently redesigning a website for a new home developer and we're trying to figure out the best way to deal with tabbed content in the URL structure. The design of the site at the moment will have a page for a development and within that you can select your house type, then when on the house type page there will be tabs displayed for the user to see things like the plot map, availability and pricing, specifications, etc. The way our development team are looking at handling this is for the URL to use a hashtag or a query string at the end of it so we can still land users on these specific tabs for PPC for example. My question is really, has anyone had any experience with this? Any recommendations on how to best display the urls for SEO? Thanks
Intermediate & Advanced SEO | | J_Sinclair0 -
Avoiding Duplicate Content with Used Car Listings Database: Robots.txt vs Noindex vs Hash URLs (Help!)
Hi Guys, We have developed a plugin that allows us to display used vehicle listings from a centralized, third-party database. The functionality works similar to autotrader.com or cargurus.com, and there are two primary components: 1. Vehicle Listings Pages: this is the page where the user can use various filters to narrow the vehicle listings to find the vehicle they want.
Intermediate & Advanced SEO | | browndoginteractive
2. Vehicle Details Pages: this is the page where the user actually views the details about said vehicle. It is served up via Ajax, in a dialog box on the Vehicle Listings Pages. Example functionality: http://screencast.com/t/kArKm4tBo The Vehicle Listings pages (#1), we do want indexed and to rank. These pages have additional content besides the vehicle listings themselves, and those results are randomized or sliced/diced in different and unique ways. They're also updated twice per day. We do not want to index #2, the Vehicle Details pages, as these pages appear and disappear all of the time, based on dealer inventory, and don't have much value in the SERPs. Additionally, other sites such as autotrader.com, Yahoo Autos, and others draw from this same database, so we're worried about duplicate content. For instance, entering a snippet of dealer-provided content for one specific listing that Google indexed yielded 8,200+ results: Example Google query. We did not originally think that Google would even be able to index these pages, as they are served up via Ajax. However, it seems we were wrong, as Google has already begun indexing them. Not only is duplicate content an issue, but these pages are not meant for visitors to navigate to directly! If a user were to navigate to the url directly, from the SERPs, they would see a page that isn't styled right. Now we have to determine the right solution to keep these pages out of the index: robots.txt, noindex meta tags, or hash (#) internal links. Robots.txt Advantages: Super easy to implement Conserves crawl budget for large sites Ensures crawler doesn't get stuck. After all, if our website only has 500 pages that we really want indexed and ranked, and vehicle details pages constitute another 1,000,000,000 pages, it doesn't seem to make sense to make Googlebot crawl all of those pages. Robots.txt Disadvantages: Doesn't prevent pages from being indexed, as we've seen, probably because there are internal links to these pages. We could nofollow these internal links, thereby minimizing indexation, but this would lead to each 10-25 noindex internal links on each Vehicle Listings page (will Google think we're pagerank sculpting?) Noindex Advantages: Does prevent vehicle details pages from being indexed Allows ALL pages to be crawled (advantage?) Noindex Disadvantages: Difficult to implement (vehicle details pages are served using ajax, so they have no tag. Solution would have to involve X-Robots-Tag HTTP header and Apache, sending a noindex tag based on querystring variables, similar to this stackoverflow solution. This means the plugin functionality is no longer self-contained, and some hosts may not allow these types of Apache rewrites (as I understand it) Forces (or rather allows) Googlebot to crawl hundreds of thousands of noindex pages. I say "force" because of the crawl budget required. Crawler could get stuck/lost in so many pages, and my not like crawling a site with 1,000,000,000 pages, 99.9% of which are noindexed. Cannot be used in conjunction with robots.txt. After all, crawler never reads noindex meta tag if blocked by robots.txt Hash (#) URL Advantages: By using for links on Vehicle Listing pages to Vehicle Details pages (such as "Contact Seller" buttons), coupled with Javascript, crawler won't be able to follow/crawl these links. Best of both worlds: crawl budget isn't overtaxed by thousands of noindex pages, and internal links used to index robots.txt-disallowed pages are gone. Accomplishes same thing as "nofollowing" these links, but without looking like pagerank sculpting (?) Does not require complex Apache stuff Hash (#) URL Disdvantages: Is Google suspicious of sites with (some) internal links structured like this, since they can't crawl/follow them? Initially, we implemented robots.txt--the "sledgehammer solution." We figured that we'd have a happier crawler this way, as it wouldn't have to crawl zillions of partially duplicate vehicle details pages, and we wanted it to be like these pages didn't even exist. However, Google seems to be indexing many of these pages anyway, probably based on internal links pointing to them. We could nofollow the links pointing to these pages, but we don't want it to look like we're pagerank sculpting or something like that. If we implement noindex on these pages (and doing so is a difficult task itself), then we will be certain these pages aren't indexed. However, to do so we will have to remove the robots.txt disallowal, in order to let the crawler read the noindex tag on these pages. Intuitively, it doesn't make sense to me to make googlebot crawl zillions of vehicle details pages, all of which are noindexed, and it could easily get stuck/lost/etc. It seems like a waste of resources, and in some shadowy way bad for SEO. My developers are pushing for the third solution: using the hash URLs. This works on all hosts and keeps all functionality in the plugin self-contained (unlike noindex), and conserves crawl budget while keeping vehicle details page out of the index (unlike robots.txt). But I don't want Google to slap us 6-12 months from now because it doesn't like links like these (). Any thoughts or advice you guys have would be hugely appreciated, as I've been going in circles, circles, circles on this for a couple of days now. Also, I can provide a test site URL if you'd like to see the functionality in action.0 -
Yoast & rel canonical for paginated Wordpress URLs
Hello, our Wordpress blog at http://www.jobs.ca/career-resources has a rel canonical issue since we added pagination to the front page and category-pages. We're using Yoast and it's incorrectly applying a rel-canonical meta tag referencing page 1 on page 2, 3, etc. This is a known misuse of the rel-canonical tag (per Google's Webmaster Blog - http://googlewebmastercentral.blogspot.ca/2013/04/5-common-mistakes-with-relcanonical.html, which says rel-canonical should be replaced with rel-prev and rel-next for page 2, 3, etc.). We don't see a way to specify anywhere in Yoast's options to correct this behaviour for page 2, 3, etc. Yoast allows you to override a page's canonical URL, otherwise it automatically uses the Wordpress permalink. My question is, does anyone know how to configure Yoast to properly replace rel-canonical tags with rel-prev and rel-next for paginated URLs, or do I need to look at another plugin or customize the behavior directly in my child theme code? This issue was brought up here as well: http://moz.com/community/q/canonical-help, but the only response did not relate to Yoast. (We're using Wordpress 3.6.1 and Yoast "Wordpress SEO" 1.4.18)
Intermediate & Advanced SEO | | aactive0 -
Meta tags - are they case sensitive?
I just ran the wordtracker tool and noticed something interesting. The tool didn't pick up our meta description. It's strange as our meta descriptions appear in organic search results and Moz never reported missing meta descriptions.After reviewing other pages, I noticed our meta description tag is written as the following: name="Description" content=" I never thought about this, but are meta tags case sensitive? Should it be written as: name="description" content=" Thoughts?
Intermediate & Advanced SEO | | Bio-RadAbs0 -
Should canonical links be included or excluded in a sitemap?
Our company is in the process of updating our sitemap. Should we include or exclude canonical links.
Intermediate & Advanced SEO | | WebRiverGroup0 -
Rel=canonical tag on original page?
Afternoon All,
Intermediate & Advanced SEO | | Jellyfish-Agency
We are using Concrete5 as our CMS system, we are due to change but for the moment we have to play with what we have got. Part of the C5 system allows us to attribute our main page into other categories, via a page alaiser add-on. But what it also does is create several url paths and duplicate pages depending on how many times we take the original page and reference it in other categories. We have tried C5 canonical/SEO add-on's but they all seem to fall short. We have tried to address this issue in the most efficient way possible by using the rel=canonical tag. The only issue is the limitations of our cms system. We add the canonical tag to the original page header and this will automatically place this tag on all the duplicate pages and in turn fix the problem of duplicate content. The only problem is the canonical tag is on the original page as well, but it is referencing itself, effectively creating a tagging circle. Does anyone foresee a problem with the canonical tag being on the original page but in turn referencing itself? What we have done is try to simplify our duplicate content issues. We have over 2500 duplicate page issues because of this aliasing add-on and want to automate the canonical tag addition, rather than go to each individual page and manually add this tag, so the original reference page can remain the original. We have implemented this tag on one page at the moment with 9 duplicate pages/url's and are monitoring, but was curious if people had experienced this before or had any thoughts?0 -
How Rel=Prev & Rel=Next work for me?
I have implemented Rel=Prev & Rel=Next tag on my website. I would like to give example URL to know more about it. http://www.vistapatioumbrellas.com/market-umbrellas?limit=40&p=3 http://www.vistapatioumbrellas.com/market-umbrellas?limit=40&p=4 http://www.vistapatioumbrellas.com/market-umbrellas?limit=40&p=5 Right now, I have blocked paginated pages by Robots.txt by following query. Disallow: /*?p= I have removed disallow syntax from Robots.txt for paginated pages. But, I have confusion with duplicate page title. If you will check all 3 pages so you will find out duplicate page title across all pages. I know that, duplicate page title is harmful for SEO. Will Google crawl + index all paginated pages? If yes so which page will get maximum benefits in organic ranking? Is there any specific way which may help me to solve this issue?
Intermediate & Advanced SEO | | CommercePundit0