Include Cross Domain Canonical URL's in Sitemap - Yes or No?
-
I have several sites that have cross domain canonical tags setup on similar pages. I am unsure if these pages that are canonicalized to a different domain should be included in the sitemap. My first thought is no, because I should only include pages in the sitemap that I want indexed.
On the other hand, if I include ALL pages on my site in the sitemap, once Google gets to a page that has a cross domain canonical tag, I'm assuming it will just note that and determine if the canonicalized page is the better version. I have yet to see any errors in GWT about this. I have seen errors where I included a 301 redirect in my sitemap file. I suspect its ok, but to me, it seems that Google would rather not find these URL's in a sitemap, have to crawl them time and time again to determine if they are the best page, even though I'm indicating that this page has a similar page that I'd rather have indexed.
-
I looked at the sitemap, and they are including the http://www.seomoz.org/blog/the-story-of-seomoz but not the canonical page - http://www.masternewmedia.org/entrepreneurship-the-full-story-of-seomoz-told-by-rand-fishkin/
So based on this example, the page on SEOMoz is still included in the sitemap, regardless if it has a canonical or not.
This seems to make sense, since canonical links are used only as a hint and not an absolute directive.
I also noticed that Google is choosing to index and rank both pages, on Page 1.
SEOMoz is ranking higher on my browser for "the full story of seomoz". A few things going on here.
-
Why is google choosing to rank SEOMoz higher than Mastermedia.org for this page? There's a canonical setup, but google is choosing not to follow it. (again its a hint not an absolute) this doesn't always work.
-
I would think Google would be able to filter out the duplicate content easy. In this example, they are clearly not. SEOMoz is ranking #4 and Masternewmedia.org is ranking #5 for query "the full story of seomoz"
-
-
Right - as far as I know, you're supposed to put end URLs into a sitemap, not urls which 301 redirect. Cross domain canonical is still kind of new, but I would treat them as a 301 redirect and not include them in a sitemap.
Now, if you're curious, SEO Moz did a whiteboard Friday where they talked about this same exact issue (cross domain canonical), and as an experiment, re-posted a blog article from another blogger on SEO Moz.
http://www.seomoz.org/blog/cross-domain-canonical-the-new-301-whiteboard-friday
http://www.seomoz.org/blog-sitemap.xml
http://www.seomoz.org/blog/the-story-of-seomoz
The blog is still included in the blog sitemap. I think it probably won't 'hurt' to keep those pages in the sitemap, since a lot of sitemaps automatically generated CMS tools won't have been updated to deal with this yet.
-
There is no BIG problem if you add the pages that contain cross domain canonical tag on them. Why?
The reason why I can say this is because Google is not only indexing the pages from sitemap.xml file, Google have their own crawler and they have the ability to crawl and index the website no matter if you do not have an xml sitemap.
Google is very good at (in my opinion) picking the instructions that are available on the page so if you add the page in the xml sitemap, the crawler will read the instructions on the page and will only index the page that contain original content.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Magento 1.9 SEO. I have product pages with identical On Page SEO score in the 90's. Some pull up Google page 1 some won't pull up at all. I am searching for the exact title on that page.
I have a website built on Magento 1.9. There are approximately 290,000 part numbers on the site. I am sampling Google SERP results. About 20% of the keywords show up on page 1 position 5 thru 10. 80% don't show up at all. When I do a MOZ page score I get high 80's to 90's. A page score of 89 on one part # may show up on page one, An identical page score on a different part # can't be found on Google. I am searching for the exact part # in the page title. Any thoughts on what may be going on? This seems to me like a Magento SEO issue.
Intermediate & Advanced SEO | | CTOPDS0 -
Domain Change for a well positioned website... I'm a-scared
Hello, A few years ago I have "inherited" a website about a particular touristic area in Italy (the Langhe region) called langhe.net. The website is very well positioned, the domain has been registered in '97 and the overall SEO performance is pretty good (it ranks in the top #3 positions for all the main search queries in our niche). We are currently redesigning the whole thing, and one of the idea was to change the domain (and the name) of the website from langhe.net to lovelanghe.com (which we already registered). The reasons behind this decision are the following (most important first): Google prefer brands over keywords and "Langhe" is just a keyword LoveLanghe looks more memorable and "marketable" than just Langhe.net All our social presence is branded already as LoveLanghe (they were created years back under this name - I don't know why) We will do our due diligence work (301 everything, domain change in Search Console etc. etc.) but I'm still kind of worried that we will lose some ranking. So my question(s) are: do you think it's a good idea to change the domain when ranking is good and original domain is so old? how much ranking (approximately) are we going to lose? Thanks in advance 🙂 Best
Intermediate & Advanced SEO | | Enrico_Cassinelli0 -
Interesting Cross Domain Canonical Quirk...
We recently ran cross domain canonicals for 2 of our websites. What's interesting is that when I do a search for ""site:domain1.com "product name"" the Title in the SERPs uses the Domain Name from the site the page has been canonicaled to. So the title for Domain1 (for the search term above) looks like this: Product Name | Keywords | Domain 2 Interesting quirk. Ha anyone else seen this?
Intermediate & Advanced SEO | | AMHC0 -
Does Google Read URL's if they include a # tag? Re: SEO Value of Clean Url's
An ECWID rep stated in regards to an inquiry about how the ECWID url's are not customizable, that "an important thing is that it doesn't matter what these URLs look like, because search engines don't read anything after that # in URLs. " Example http://www.runningboards4less.com/general-motors#!/Classic-Pro-Series-Extruded-2/p/28043025/category=6593891 Basically all of this: #!/Classic-Pro-Series-Extruded-2/p/28043025/category=6593891 That is a snippet out of a conversation where ECWID said that dirty urls don't matter beyond a hashtag... Is that true? I haven't found any rule that Google or other search engines (Google is really the most important) don't index, read, or place value on the part of the url after a # tag.
Intermediate & Advanced SEO | | Atlanta-SMO0 -
Transferring link juice from a canonical URL to an SEO landing page.
I have URLs that I use for SEM ads in Google. The content on those pages is duplicate (affiliate). Those pages also have dynamic parameters which caused lots of duplicate content pages to be indexed. I have put a canonical tag on the Parameter pages to consolidate everything to the canonical URL. Both the canonical URL and the Parameter URLs have links pointing to them. So as it stands now, my canonical URL is still indexed, but the parameter URLs are not. The canonical page is still made up of affiliate (duplicate) content though. I want to create an equivalent SEO landing page with unique content. But I'd like to do two things 1) remove the canonical URL from the index - due to duplicate affiliate content, and 2) transfer the link juice from the canonical URL over to the SEO URL. I'm thinking of adding a meta NoIndex, follow tag to the canonical tag - and internally linking to the new SEO landing page. Does this strategy work? I don't want to lose the link juice on the canonical URL by adding a meta noindex tag to it. Thanks in advance for your advice. Rob
Intermediate & Advanced SEO | | partnerf0 -
Cross Domain Rel Canonical for Affiliates?
Hi We use the Cross Domain Rel Canonical for duplicate content between our own websites, but what about affiliates sites who want our XML feed, (descriptions of our products). We don´t mind being credited but would this present a danger for us? Who is controlling the use of that cross domain rel canonical, us in our feed or them? Is there another way around it?
Intermediate & Advanced SEO | | xoffie0 -
Hundreds of thousands of 404's on expired listings - issue.
Hey guys, We have a conundrum, with a large E-Commerce site we operate. Classified listings older than 45 days are throwing up 404's - hundreds of thousands, maybe millions. Note that Webmaster Tools peaks at 100,000. Many of these listings receive links. Classified listings that are less than 45 days show other possible products to buy based on an algorithm. It is not possible for Google to crawl expired listings pages from within our site. They are indexed because they were crawled before they expired, which means that many of them show in search results. -> My thought at this stage, for usability reasons, is to replace the 404's with content - other product suggestions, and add a meta noindex in order to help our crawl equity, and get the pages we really want to be indexed prioritised. -> Another consideration is to 301 from each expired listing to the category heirarchy to pass possible link juice. But we feel that as many of these listings are findable in Google, it is not a great user experience. -> Or, shall we just leave them as 404's? : google sort of says it's ok Very curious on your opinions, and how you would handle this. Cheers, Croozie. P.S I have read other Q & A's regarding this, but given our large volumes and situation, thought it was worth asking as I'm not satisfied that solutions offered would match our needs.
Intermediate & Advanced SEO | | sichristie0 -
Limiting URLS in the HTML Sitemap?
So I started making a sitemap for our new golf site, which has quite a few "low level" pages (about 100 for the golf courses that exist in the area, and then about 50 for course architects), etc etc. My question/open discussion is simple. In a sitemap that already has about 50 links, should we include these other low level 150 links? Of course, the link to the "Golf Courses" is there, along with a link to the "Course Architects" MAIN pages (which, subdivides on THOSE pages.) I have read the limit is around 150 links on the sitemap.html page and while it would be nice to rank long tail for the Golf Courses. All in all, our site architecture itself is easily crawlable as well. So the main question is just to include ALL the links or just the main ones? Thoughts?
Intermediate & Advanced SEO | | JamesO0