Should robots.txt include the sitemap?
-
Hello,
I see that Google, Facebook, Moz and others list their sitemaps at the bottom of robots.txt.
E.g. http://www.google.com.vn/robots.txt ends with:
Sitemap: http://www.google.com/sitemaps_webmasters.xml
Sitemap: http://www.google.com/ventures/sitemap_ventures.xml
Should I include my sitemap file (sitemap.xml) at the bottom of robots.txt, and why should I do this?
Thanks,
-
For sure. The reason is that not all sitemap files are simply named sitemap.xml; it may be sitemap.gzip, sitemap.zip or, in my case, sitemap_index.gzip. Also, some people may not be able to place their sitemap at the root of the site.
Including the exact sitemap URL in robots.txt gives each search engine a clear directive on where to find your sitemap. Google and Bing/Yahoo will have no issue finding it, as you have probably submitted it to them already, but crawlers like Ask.com usually look at your robots.txt and may never find your sitemap if it is not listed there.
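If you want to check what a crawler would see, here is a minimal sketch in Python that reads the Sitemap lines out of a robots.txt file using the standard library's `urllib.robotparser`. The robots.txt content and the example.com URLs are made up for illustration, and `list_sitemaps` is a hypothetical helper name, not anything from Moz or the search engines.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the domain and file names are placeholders.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

Sitemap: https://www.example.com/sitemap_index.xml.gz
Sitemap: https://www.example.com/news/sitemap.xml
"""


def list_sitemaps(robots_txt: str) -> list[str]:
    """Return the sitemap URLs declared in the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    # site_maps() (Python 3.8+) returns None when no Sitemap line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    for url in list_sitemaps(ROBOTS_TXT):
        print(url)
```

Note that each Sitemap line carries a full absolute URL, which is why the file name and location do not have to be sitemap.xml at the root.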