If Google's index contains multiple URLs for my homepage, does that mean the canonical tag is not working?
-
I have a site which is using canonical tags on all pages; however, not all duplicate versions of the homepage are 301'd, due to a limitation in the hosting platform. So some site visitors get www.example.com/default.aspx while others just get www.example.com. I can see the correct canonical tag in the source code of both versions of this homepage, but when I search Google for the specific URL "www.example.com/default.aspx" I see that they've indexed that specific URL as well as the "clean" one. Is this a concern... shouldn't Google only show me the clean URL?
-
In most cases, Google does seem to "de-index" the non-canonical URL if they process the tag. I put that in quotes because, technically, the page is still in Google's index. Once it stops showing up at all (including with "site:"), though, I essentially consider it de-indexed. If we can't see it, it might as well not be there.
If 301-ing isn't an option, I'd double-check a few things:
(1) Is the non-canonical page ranking for anything (including very long-tail terms)?
(2) Are there any internal links to the non-canonical URL? These can send a strongly mixed signal.
(3) Are there any other mixed signals that might be throwing off the canonical? Examples include canonicals on other pages that contradict this one, 301s/302s that override the canonical, etc.
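Checks (2) and (3) can be partly automated. The following is a minimal sketch in Python, not a definitive audit tool: it parses a saved copy of the page, reports the rel=canonical target, and lists any links that point at the non-canonical URL. The class name `CanonicalAndLinks`, the function `audit`, and the sample HTML are all hypothetical; only the example.com URLs come from the question.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class CanonicalAndLinks(HTMLParser):
    """Collect the rel=canonical target and every <a href> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.canonical = None
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        href = attrs.get("href")
        if not href:
            return
        # Resolve relative hrefs against the URL the page was served from.
        absolute = urljoin(self.base_url, href)
        rel = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel:
            self.canonical = absolute
        elif tag == "a":
            self.links.append(absolute)


def audit(html, page_url, non_canonical_url):
    """Return (canonical target, links that point at the non-canonical URL)."""
    parser = CanonicalAndLinks(page_url)
    parser.feed(html)
    stray = [url for url in parser.links if url == non_canonical_url]
    return parser.canonical, stray


# Hypothetical page source, standing in for the real homepage HTML.
SAMPLE_HTML = """
<html><head>
<link rel="canonical" href="http://www.example.com/">
</head><body>
<a href="/default.aspx">Home</a>
<a href="/about.aspx">About</a>
</body></html>
"""

canonical, stray = audit(
    SAMPLE_HTML,
    "http://www.example.com/default.aspx",
    "http://www.example.com/default.aspx",
)
print("canonical:", canonical)
print("links to non-canonical URL:", stray)
```

In this sample the canonical tag points at the clean URL, but the "Home" link still points at /default.aspx, which is the kind of mixed signal described in (2). Running the same check over every template on the site would show whether the stray link is a one-off or site-wide.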
-
As Digital-Diameter said, the best choice for fixing this problem is a 301. A canonical tag can eventually lead to the incorrect URL being replaced by the correct one in the SERPs, but it is also important to note that the rel=canonical tag is a suggestion, not a directive. This means the search engines will take it into consideration but may choose not to follow it.
-
Technically, rel=canonical tags can still leave a page indexed; they simply tell Google which URL should receive the authority. From your question I can tell you know this, but I do have to say that 301s are the best way to address this. Blocking a page with robots.txt can help as well, but this only stops Google from crawling the page; the page can still be indexed.
If you have pages or versions of pages that you do not want indexed, you may want to use the noindex meta tag. Google's notes here. Be careful, though: this will stop these pages from being indexed, but they will still be crawled (though your rel=canonical solution should make this a non-issue).
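For reference, the noindex meta tag is a `<meta name="robots">` element in the page head. The sketch below, in Python, checks a page's source for it; the helper names `RobotsMeta` and `has_noindex` and the sample markup are hypothetical, and it only inspects the generic `robots` meta tag, not crawler-specific variants or HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMeta(HTMLParser):
    """Collect the directives found in <meta name="robots" content="...">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, follow".
            for directive in content.split(","):
                directive = directive.strip().lower()
                if directive:
                    self.directives.add(directive)


def has_noindex(html):
    """Return True if the page carries a robots meta tag with noindex."""
    parser = RobotsMeta()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical page head showing what the tag looks like.
SAMPLE_HEAD = '<head><meta name="robots" content="noindex, follow"></head>'

print(has_noindex(SAMPLE_HEAD))
print(has_noindex("<head><title>Home</title></head>"))
```

A check like this is mainly useful for confirming that the tag appears only on the versions you want kept out of the index, and not on the clean URL by accident.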
A few other notes:
In all cases, be sure your internal links point consistently to the URL version you have determined for your home page.
Webmaster Tools (WMT) also creates a list of inbound links that are missing or broken. You can use this to help determine any additional 301s that you need.