Robots.txt file
-
Does it matter if we omit the robots.txt file? If a spider has to read all the pages anyway, why do we add a robots.txt file at all?
-
As Ryan said, the robots.txt file is very useful when you want to block (disallow) some pages. Indeed, if you don't want a spider to crawl your page, you must use robots.txt (a noindex tag still lets the bot crawl your page, it just won't index it). I have a small website, but I dropped a robots.txt in my root folder anyway. Writing just Allow: / may be useless, but at least you can say: "I respect protocols."
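To make the crawl vs. index distinction above concrete, here is a minimal sketch using only Python's standard library. The robots.txt rules, the /private/ path, the sample page and the helper names (MetaRobotsFinder, has_noindex) are all made up for illustration, and real search engine crawlers may interpret edge cases differently from urllib.robotparser.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: asks all crawlers to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# Hypothetical page that may be crawled but asks not to be indexed.
PAGE_HTML = (
    '<html><head><meta name="robots" content="noindex">'
    "</head><body>Thanks!</body></html>"
)


class MetaRobotsFinder(HTMLParser):
    """Collects the directives of any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(d.strip().lower() for d in content.split(","))


def has_noindex(html):
    """Return True if the page carries a robots meta tag with noindex."""
    finder = MetaRobotsFinder()
    finder.feed(html)
    return "noindex" in finder.directives


robots = RobotFileParser()
robots.parse(ROBOTS_TXT.splitlines())

# Crawl decision comes from robots.txt.
may_crawl_private = robots.can_fetch("*", "/private/report.html")
may_crawl_public = robots.can_fetch("*", "/thanks.html")
print(may_crawl_private)       # False
print(may_crawl_public)        # True

# Index decision comes from the page itself, so the bot has to fetch it first.
print(has_noindex(PAGE_HTML))  # True
```

Note the consequence: a bot that honors a Disallow rule never fetches the page, so it never sees a noindex tag placed on it.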
-
A good source to learn about the robots.txt file is here: http://www.robotstxt.org/
The robots.txt file is completely optional. I don't use the file at all on small sites.
The file offers a means to block crawlers from crawling all or part of a site, provided they choose to honor its instructions. It can also provide the location of a sitemap.
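As a rough illustration of both uses, the sketch below parses a made-up robots.txt with Python's urllib.robotparser. The bot names (ExampleBot, OtherBot), the paths and the sitemap URL are assumptions for the example, and site_maps() requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: one bot is blocked from the whole site, every other
# bot is only blocked from /cart/, and a sitemap location is advertised.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /cart/

Sitemap: https://example.com/sitemap.xml
"""

rules = RobotFileParser()
rules.parse(ROBOTS_TXT.splitlines())

print(rules.can_fetch("ExampleBot", "https://example.com/products/"))    # False
print(rules.can_fetch("OtherBot", "https://example.com/products/"))      # True
print(rules.can_fetch("OtherBot", "https://example.com/cart/checkout"))  # False
print(rules.site_maps())  # ['https://example.com/sitemap.xml']
```

Keep in mind this only models a well-behaved crawler. Nothing in the file stops a bot that decides to ignore it.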
That said, sitemaps are unnecessary for SEO, assuming your site has proper navigation. Even if you choose to use a sitemap, you can submit its location via Google Webmaster Tools (WMT) rather than the robots.txt file.
With respect to blocking areas of your site, the primary use would be for CMS, forum, ecommerce or other sites where the software is limited and does not allow the site owner to use noindex on all pages.
As a rule, robots.txt should simply never be used except as a means of last resort. In my experience the file is overused by site owners and SEOs. One exception where I use a robots.txt is during a site's development when I do not wish the site to be crawled at all.
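For the development case, the usual pattern is a two-line file that disallows everything. The sketch below is only an illustration: is_fully_blocked is a hypothetical helper and the sample paths are made up. It could serve as a pre-launch sanity check so the block-all file does not follow the site into production.

```python
from typing import Iterable
from urllib.robotparser import RobotFileParser

# Block-all rules as they might sit on a development site.
DEV_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_fully_blocked(robots_txt: str, sample_paths: Iterable[str]) -> bool:
    """Return True if none of the sample paths may be fetched by any bot."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not any(parser.can_fetch("*", path) for path in sample_paths)


# Hypothetical paths; use a handful of real URLs from your own site.
print(is_fully_blocked(DEV_ROBOTS_TXT, ["/", "/about/", "/blog/post-1"]))  # True
```

On a development server you would expect True. Once the site is live, the same check against the production file should return False.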
Related Questions
-
301 Redirects - Large .htaccess file question
We are moving about 5000 pages from root into different folders. We need to individually 301 each page because they are sitting at root level now: mysite.com/page.com We want to move them to: mysite.com/folder/page.html etc. I don't think RedirectMatch can work because of the different file names and the folders they are being moved into. Will 5000 entries in .htaccess slow site loading? Any other suggestions on how to handle this?
On-Page Optimization | | leadforms0 -
Can Robots.txt on Root Domain override a Robots.txt on a Sub Domain?
We currently have beta sites on sub-domains of our own domain. We have had issues where people forget to change the Robots.txt and these non-relevant beta sites get indexed by search engines (nightmare). We are going to move all of these beta sites to a new domain where we disallow everything in the root of the domain. If we put a fully configured Robots.txt on these sub-domains (that are ready to go live and open for crawling by the search engines), is there a way for the Robots.txt in the root domain to override the Robots.txt in these sub-domains? Apologies if this is unclear. I know we can handle this relatively easily by changing the Robots.txt in the sub-domain on going live, but due to a few instances where people have forgotten, I want to reduce the chance of human error! Cheers, Dave.
On-Page Optimization | | davelane.verve0 -
[HELP!] File Name and ALT Tags
Hi, please answer my questions: 1. Is it okay to use the same keyword in both the file name and the alt tag when inserting an image? Example: File Name: buy-lego-online.jpg ALT tag: buy-lego-online Will it trigger Google Panda? Will I be penalized for that? Or should the file name and alt tag be different from each other? I ask because when inserting an image in WordPress, the alt tag is always the same as the file name by default. 2. For example, I have 2 images on a page (same topic/niche) and I will put "cheap-lego-for-kids" and "best-lego-for-sale" as alt tags. Considering that I repeat the word "lego", is it considered keyword stuffing? Will I be penalized for that? Thanks in advance!
On-Page Optimization | | bubblymaiko0 -
"translation" of code in htaccess file
Hi everyone! I am a newbie to the whole SEO and HTML thing and I am trying to get a better understanding of the "behind the scenes" part of my website. I hope I can find someone here who can translate a piece of code for me that I have in my htaccess file:
Options -Multiviews
Options +FollowSymLinks
rewritecond $1 !^(index.php|public|tmp|robots.txt|template.html|favicon.ico|images|css|uploads)
rewritecond %{REQUEST_FILENAME} !-f
rewritecond %{REQUEST_FILENAME} !-d
rewriterule ^(.*)$ index.php?link=$1 [NC,L,QSA]
I know that something is getting redirected to the index file, but what (or when) exactly? Does the word "robots" mean that search engine crawlers are getting redirected here? And is this good or bad (in terms of SEO)? Or is this redirecting people who try to get to my robots/template or image files? Thanks in advance for any answers!
On-Page Optimization | | momof40 -
Disavow Tool Submitting 2nd File
Hi, about 2 months ago I submitted a disavow file using the Disavow Tool. I have collected more links and I am ready to upload a 2nd file. However, should I download the previous file in Webmaster Tools (Disavow Tool) and add these new links to that file, or will it be OK if I just upload and override the existing file with a file containing the new links only? What I don't want to do is something that removes all the previous findings from the list so that Google cannot see them anymore. I guess what I am trying to say is: does Google just refer to the live file I am updating / overriding, or, once I have submitted a file, will Google still have a record of it and keep referring to it whether I remove it or not? Thanks, Adam
On-Page Optimization | | AMG1000 -
Using meta robots 'noindex'
Alright, so I would consider myself a beginner at SEO. I've been doing merchandising and marketing for Ecommerce sites for about a year and a half now and am just now starting to attempt to apply some intermediate SEO techniques to the sites I work on so bear with me. We are currently redoing the homepage of our site and I am evaluating what links to have on it. I don't want to lose precious link juice to pages that don't need it, but there are certain pages that we need to have on the homepage that people just won't search for. My question is would it be a good move to add the meta robots 'noindex' tag to these pages? Is my understanding correct that if the only link on the page is back to the homepage it will pass back the linkjuice? Also, how many homepage links are too many? We have a fairly large ecommerce site with a lot of categories we'd like to feature, but don't want to overdo the homepage. I appreciate any help!
On-Page Optimization | | ClaytonKendall0 -
I have a direct question about file structure.
This question is about a new file structure and SEO-friendly URLs. Does a file name make a difference? Our old site was formatted with a URL of http://rousechamberlin.com/about_us.aspx and our new site is structured as http://rousechamberlin.com/AboutUs/ with no file and no extension. As the SEO guy of the company and not the programmer, my feeling is this is killing us. Does anybody have any thoughts on this?
On-Page Optimization | | HeadWebChef0 -
How do you block development servers with robots.txt?
When we create client websites, the URLs are client.oursite.com. Google is indexing these sites and attaching them to our domain. How can we stop it with robots.txt? I've heard you need to have the robots file on both the main site and the dev sites... A code sample would be groovy. Thanks, TR
On-Page Optimization | | DisMedia0