Is there a way to keep sitemap.xml files from getting indexed?
-
Wow, I should know the answer to this question.
Sitemap.xml files have to be accessible to the bots for indexing they can't be disallowed in robots.txt and can't block the folder at the server level.
So how can you allow the bots to crawl these xml pages but have them not show up in google's index when doing a site: command search, or is that even possible? Hmmm
-
ahhh no index in .htaccess file. brilliant - thanks!
-
Usually you would need to add noindex to the meta robots tag in the of your web page. Because your sitemap is an XML file and not HTML you will need to do things differently.
You can add the code below to your .htaccess file, which you can find in the ROOT folder of your server. Open the file in a plain text editor and insert the following:
Header set X-Robots-Tag "noindex"
This will stop search engines indexing your sitemap without restricting them from crawling it.
Note: Replace 'sitemap.xml' with your file name - if different.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Indexing is live what about rankings ?
I noticed that when I request indexing in the webmaster tool my new content is live within minutes. Does it take longer to update the ranking or is the ranking updated as soon as the new page has been indexed. Thank you,
Intermediate & Advanced SEO | | seoanalytics0 -
Any Good XML Sitemaps Generator?
I was wondering if everyone could recommend what XML Sitemap generators they use. I've been using XML-Sitemap, and it's been a little hit and miss for me. Some sites it works great, other it really has serious problems indexing pages. I've also uses Google's, but unfortunately it's not very flexible to use. Any recommendation would be much appreciated.
Intermediate & Advanced SEO | | alrockn0 -
XML sitemaps questions
Hi All, My developer has asked me some questions that I do not know the answer to. We have both searched for an answer but can't find one.... So, I was hoping that the clever folk on Moz can help!!! Here is couple questions that would be nice to clarify on. What is the actual address/name of file for news xml. Can xml site maps be generated on request? Consider following scenario: spider requests http://mypage.com/sitemap.xml which permanently redirects to extensionless MVC 4 page http://mypage.com/sitemapxml/ . This page generates xml. Thank you, Amelia
Intermediate & Advanced SEO | | CommT0 -
Google Indexing Feedburner Links???
I just noticed that for lots of the articles on my website, there are two results in Google's index. For instance: http://www.thewebhostinghero.com/articles/tools-for-creating-wordpress-plugins.html and http://www.thewebhostinghero.com/articles/tools-for-creating-wordpress-plugins.html?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+thewebhostinghero+(TheWebHostingHero.com) Now my Feedburner feed is set to "noindex" and it's always been that way. The canonical tag on the webpage is set to: rel='canonical' href='http://www.thewebhostinghero.com/articles/tools-for-creating-wordpress-plugins.html' /> The robots tag is set to: name="robots" content="index,follow,noodp" /> I found out that there are scrapper sites that are linking to my content using the Feedburner link. So should the robots tag be set to "noindex" when the requested URL is different from the canonical URL? If so, is there an easy way to do this in Wordpress?
Intermediate & Advanced SEO | | sbrault740 -
.htaccess files
I am working with a clients website which has multiple htaccess files (.htaccess , .htaccess.holiding, and .htaccess.live -all in the same directory) My question is how does a server process these files? All 3 files? Currently the domain has 301 redirect showing for the home page to the mobile site (which is a problem) in one of the files (.htaccess but not others) Has anyone come across this before with regard to SEO problems?
Intermediate & Advanced SEO | | OnlineAssetPartners0 -
Sitemap - % of URL's in Google Index?
What is the average % of links from a sitemap that are included in the Google index? Obviously want to aim for 100% of the sitemap urls to be indexed, is this realistic?
Intermediate & Advanced SEO | | stats440 -
Most Painless way of getting Duff Pages out of SE's Index
Hi, I've had a few issues that have been caused by our developers on our website. Basically we have a pretty complex method of automatically generating URL's and web pages on our website, and they have stuffed up the URL's at some point and managed to get 10's of thousands of duff URL's and pages indexed by the search engines. I've now got to get these pages out of the SE's indexes as painlessly as possible as I think they are causing a Panda penalty. All these URL's have an addition directory level in them called "home" which should not be there, so I have: www.mysite.com/home/page123 instead of the correct URL www.mysite.com/page123 All these are totally duff URL's with no links going to them, so I'm gaining nothing by 301 redirects, so I was wondering if there was a more painless less risky way of getting them all out the indexes (IE after the stuff up by our developers in the first place I'm wary of letting them loose on 301 redirects incase they cause another issue!) Thanks
Intermediate & Advanced SEO | | James770 -
Has anyone found a way to get site links in the SERPs?
I am wanting to get some site links in the serps to increase the size of my "space", has anyone found a way of getting them? I know google says that its automatic and only generated if they feel it would benifit browsers but there must be a rule of thumb to follow. I was thinking down the line of a tight catagorical system that is implimented throughout the site that is clearly related to the content (how it should be I guess)... Any comments, suggestions welcome
Intermediate & Advanced SEO | | CraigAddyman0