Rel=canonical - Identical .com and .us Versions of Site
-
We have a .us and a .com version of our site that we direct customers to based on location to servers. This is not changing for the foreseeable future.
We had restricted Google from crawling the .us version of the site and all was fine until I started to see the https version of the .us appearing in the SERPs for certain keywords we keep an eye on.
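For context, a crawl block like the sketch below (assuming the restriction was done via robots.txt on the .us host; the actual file may differ) only stops crawling; it does not stop indexing. URLs that Google discovers through links can still appear in the SERPs as URL-only listings even when they are blocked from crawling:

```
# robots.txt served from www.example.us (hypothetical host)
# Blocks crawling of the whole site. Note: blocked URLs can still be
# indexed (URL-only) if other pages link to them.
User-agent: *
Disallow: /
```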
The .com still ranks and sometimes sits directly above or below the .us result. Occasionally a different page with similar content to the query ranks, and sometimes the exact same page appears for both the .com and the .us. This has me worried about duplicate content issues.
The question(s): Should I simply stop the https version of the .us from being crawled/indexed and leave it at that, or should I set up rel=canonical across the entire .us pointing to the .com (making the .com the canonical version)? Are there any major pitfalls I should be aware of with a sitewide rel=canonical? Both sites are identical, and some of these newly crawled/indexed .us pages rank pretty nicely.
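For reference, the two options look roughly like this in the `<head>` of a .us page (example.com and example.us are placeholder hostnames):

```html
<!-- Option A: keep the .us page out of the index entirely.
     Requires the page to be crawlable, or Google never sees the tag. -->
<meta name="robots" content="noindex">

<!-- Option B: cross-domain canonical pointing at the equivalent .com URL -->
<link rel="canonical" href="http://www.example.com/category/page/">
```

Note that Option A conflicts with a robots.txt crawl block: if the page cannot be crawled, the noindex directive is never read.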
Side question: Have any ecommerce folks noticed that Googlebot has started to crawl, index, and serve the https versions of your URLs in the SERPs, even when the only ways into those versions are to append https:// to the URL yourself or to go through a sign-in or checkout page? In the wake of its "HTTPS everywhere" push, and potentially making https a ranking signal, is Google now checking for the https version of any given URL and choosing to index it?
I just can't figure out how it is even finding those URLs to index, unless it is seeing http://www.example.com and then swapping in https:// itself and checking...
Help/insight on either point would be appreciated.
-
Hreflang, rather than rel=canonical, is what helps search engines serve the correct language or regional URL to searchers, and even then I'm not sure how it would work for two sites both aimed at the US (.us and .com). Rel=canonical is for consolidating duplicates into a single preferred URL.
What's the thought behind having two sites - is the .us site intended for Google US searches and .com the default for anything outside of the US? Are there language variations? What are the different "locations" you're referring to?
-
I would set sitewide canonicals from both versions to the .com site. I wouldn't block any pages, since people might still stumble onto the .us version and link back to it.
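A minimal sketch of what "sitewide canonicals to the .com" means in practice, using Python's standard library and placeholder hostnames: whatever host and scheme a page happens to be requested under, the emitted canonical always points at the one .com URL.

```python
from urllib.parse import urlsplit, urlunsplit

# Placeholder canonical scheme and host: substitute your real .com host,
# and https if/when the .com is served over https.
CANONICAL_SCHEME = "http"
CANONICAL_HOST = "www.example.com"

def canonical_url(requested_url: str) -> str:
    """Collapse any mirror of a page (.us host, https scheme, etc.)
    into the single canonical .com URL for <link rel="canonical">."""
    parts = urlsplit(requested_url)
    # Keep the path and query, drop any fragment, force scheme and host.
    return urlunsplit((CANONICAL_SCHEME, CANONICAL_HOST, parts.path, parts.query, ""))

# Every variant of the same page collapses to one canonical:
print(canonical_url("http://www.example.us/widgets/?page=2"))   # http://www.example.com/widgets/?page=2
print(canonical_url("https://www.example.us/widgets/?page=2"))  # http://www.example.com/widgets/?page=2
```

Whether the tag is emitted by a template variable or a server-side helper like this, the key property is that a .us page never canonicalizes to itself.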
I'm not positive that Google auto-checks the https versions of websites without any prompting, but it's plausible. One common way Google finds https URLs is by crawling an https page such as "My Account" or "My Cart"; from there, every relative URL on the page resolves to https instead of http, so Googlebot re-crawls all of them under https. Maybe that's what is happening on your end?
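The relative-URL mechanism described above is easy to demonstrate: once a visitor (or Googlebot) is on an https page, every relative link on it resolves against that https base. A small Python sketch with a placeholder domain:

```python
from urllib.parse import urljoin

# Hypothetical https entry point (cart and sign-in pages are typically https):
cart_page = "https://www.example.com/cart/"

# Relative links on that page inherit the page's scheme when resolved:
product_link = urljoin(cart_page, "/products/blue-widget/")
print(product_link)  # https://www.example.com/products/blue-widget/
```

If the https versions should not be indexed, one common fix is to serve them with a noindex X-Robots-Tag header, or to 301-redirect non-secure pages back to their http URLs, so the leak stops compounding.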