Canonical and Sitemap issue
-
Hi all,
I was told that I could change my homepage canonical tag to match my XML sitemap. The sitemap is generated for me automatically and lists the homepage as e.g. https://www.mysite.com/index.html, yet my canonical tag has been set to https://www.mysite.com.
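To illustrate the mismatch I mean (using the example URLs above):

```html
<!-- Canonical tag in the homepage's <head> -->
<link rel="canonical" href="https://www.mysite.com/" />

<!-- Corresponding entry in the auto-generated sitemap.xml -->
<url>
  <loc>https://www.mysite.com/index.html</loc>
</url>
```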
Google currently shows https://www.mysite.com/ as the indexed URL, while https://www.mysite.com/index.html does not appear in search results.
Can someone please tell me whether I should change the canonical to the index.html version, leave it alone, or remove the canonical tag altogether?
Thank you for looking.
-
I agree with the others. Given that "https://www.mysite.com/index.html is not currently displayed in search results", in all likelihood it is being redirected to https://www.mysite.com (and should be). So you don't want to change the canonical to the index.html version of the page only to have it redirected back to https://www.mysite.com. That would unnecessarily slow the site and might even create a redirect loop.
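For anyone wondering what that redirect typically looks like, here is a rough .htaccess sketch (this assumes Apache with mod_rewrite; adjust for your server):

```apache
# Sketch: 301-redirect requests for /index.html to the root URL.
# The RewriteCond matches the original client request line, so an
# internal rewrite of / to /index.html does not trigger a loop.
RewriteEngine On
RewriteCond %{THE_REQUEST} \s/index\.html[\s?]
RewriteRule ^index\.html$ https://www.mysite.com/ [R=301,L]
```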
-
Thank you both. I'll leave it as it is; sadly, I'm not able to edit the XML on my side.
-
Yes, that's a good point. Canonicals are suggestions for Google, not commands.
-
I see your point, but don't worry about it. Sitemaps help Google find all of your pages and can provide certain other information, but they are not required, so there is no need to overthink them. In general, Google is pretty good at finding what it needs to find, and it will certainly find your homepage.
-
I agree with Linda here; I would leave the canonical tag as is. It is a cleaner, better-looking URL for the SERPs. If anything, manually update the XML file to reflect the canonical version of the homepage. The main purpose of the XML sitemap is to help search engines crawl and index a website, and the homepage will be the most frequently crawled page, so Google will have no trouble finding it.
Also, do not worry about Google disliking the canonical pointing to .com instead of /index.html. If Google determines that is not the ideal URL for its index, it will ignore the canonical tag.
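If you can get the sitemap edited, the homepage entry would just become something like this (a minimal sketch following the sitemaps.org protocol; your generator may add fields like lastmod):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.mysite.com/</loc>
  </url>
  <!-- ...entries for other pages... -->
</urlset>
```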
-
Hi,
Thanks. Basically, I was concerned that Google might not like that https://www.mysite.com/ was not in the sitemap while index.html was, even though the canonical points to https://www.mysite.com.
If that makes any sense....
-
What are you trying to achieve? Do you particularly want the index.html version to be the canonical? The https://www.mysite.com/ version is more straightforward and what most people would expect your homepage URL to be.
Unless there is some pressing reason to do otherwise, I'd leave it the way it is.