Is there a way to keep sitemap.xml files from getting indexed?
-
Wow, I should know the answer to this question.
Sitemap.xml files have to be accessible to the bots for indexing they can't be disallowed in robots.txt and can't block the folder at the server level.
So how can you allow the bots to crawl these xml pages but have them not show up in google's index when doing a site: command search, or is that even possible? Hmmm
-
ahhh no index in .htaccess file. brilliant - thanks!
-
Usually you would need to add noindex to the meta robots tag in the of your web page. Because your sitemap is an XML file and not HTML you will need to do things differently.
You can add the code below to your .htaccess file, which you can find in the ROOT folder of your server. Open the file in a plain text editor and insert the following:
Header set X-Robots-Tag "noindex"
This will stop search engines indexing your sitemap without restricting them from crawling it.
Note: Replace 'sitemap.xml' with your file name - if different.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
What's the best way of crawling my entire site to get a list of NoFollow links?
Hi all, hope somebody can help. I want to crawl my site to export an audit showing: All nofollow links (what links, from which pages) All external links broken down by follow/nofollow. I had thought Moz would do it, but that's not in Crawl info. So I thought Screaming Frog would do it, but unless I'm not looking in the right place, that only seems to provide this information if you manually click down each link and view "Inlinks" details. Surely this must be easy?! Hope someone can nudge me in the right direction... Thanks....
Intermediate & Advanced SEO | | rl_uk0 -
Can you index a Google doc?
We have updated and added completely new content to our state pages. Our old state content is sitting in a our Google drive. Can I make these public to get them indexed and provide a link back to our state pages? In theory it sounds like a great link building strategy... TIA!
Intermediate & Advanced SEO | | LindsayE1 -
What should my main sitemap URL be?
Hi Mozzers - regarding the URL of a website's main website: http://example.com/sitemap.xml is the normal way of doing it but would it matter if I varied this to: http://example.com/mainsitemapxml.xml or similar? I can't imagine it would matter but I have never moved away from the former before - and one of my clients doesn't want to format the URL in that way. What the client is doing is actually quite interesting - they have the main sitemap: http://example.com/sitemap.xml - that redirects to the sitemap file which is http://example.com/sitemap (with no xml extension) - might that redirect and missing xml extension the redirected to sitemap cause an issue? Never come across such a setup before. Thanks in advance for your feedback - Luke
Intermediate & Advanced SEO | | McTaggart0 -
Images Sitemap GWT - not indexed?
So we went ahead and created an image sitemap of 2387 images, one for each product - I was hoping it would give us better exposure in image results. No joy, over 7 days and they only showing as "sent" but not "indexed". Any ideas?
Intermediate & Advanced SEO | | bjs20100 -
Moving from a static HTML CSS site with .html files to a Wordpress Site while keeping link structure
Mozzers, Hope this finds you well. I need some advice. We have a site built with a dreamweaver template, and it is lacking in responsiveness, ease of updates, and a lot of the coding is behind traditional web standards (which I know will start to hurt our rank - if not the user experience). For SEO purposes, we would like to move the existing static based site to Wordpress so we can update it easily and keep content fresh. Our current site, thriveboston.com, has a lot of page extensions ending in .html. For the transition, it is extremely important for us to keep the link structure. We rank well in the SERPs for Boston Counseling, etc... I found and tested a plugin (offline) that can add a .html extension to Wordpress pages, which allows us to keep our current structure, but has anyone had any luck with this live? Has anyone had any luck moving from a static site - to a Wordpress site - while keeping the current link structure - without hurting any rank? We hope to move soon because if the site continues to grow, it will become even harder to migrate the site over. Also, does anyone have any hesitations? It this a bad move? Should we just stay on the current DWT template (the HTML and CSS) and not migrate? Any suggestions and advice will be heeded. Thanks Mozzers!
Intermediate & Advanced SEO | | _Thriveworks0 -
How do you de-index and prevent indexation of a whole domain?
I have parts of an online portal displaying in SERPs which it definitely shouldn't be. It's due to thoughtless developers but I need to have the whole portal's domain de-indexed and prevented from future indexing. I'm not too tech savvy but how is this achieved? No index? Robots? thanks
Intermediate & Advanced SEO | | Martin_S0 -
.htaccess files
I am working with a clients website which has multiple htaccess files (.htaccess , .htaccess.holiding, and .htaccess.live -all in the same directory) My question is how does a server process these files? All 3 files? Currently the domain has 301 redirect showing for the home page to the mobile site (which is a problem) in one of the files (.htaccess but not others) Has anyone come across this before with regard to SEO problems?
Intermediate & Advanced SEO | | OnlineAssetPartners0 -
Removing a Page From Google index
We accidentally generated some pages on our site that ended up getting indexed by google. We have corrected the issue on the site and we 404 all of those pages. Should we manually delete the extra pages from Google's index or should we just let Google figure out that they are 404'd? What the best practice here?
Intermediate & Advanced SEO | | dbuckles0