Robots.txt blocking site or not?
-
Here is the robots.txt from a client site. Am I reading this right --
that the robots.txt is saying to ignore the entire site, but the
#'s are saying to ignore the robots.txt command?See http://www.robotstxt.org/wc/norobots.html for documentation on how to use the robots.txt file
To ban all spiders from the entire site uncomment the next two lines:
User-Agent: *
Disallow: /
-
You are reading it correctly.
Any text prefaced by a # character is ignored. The # symbol indicates a comment.
More details are available at http://www.robotstxt.org/
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Blocking subdomains with Robots.txt file
We noticed that Google is indexing our pre-production site ibweb.prod.interstatebatteries.com in addition to indexing our main site interstatebatteries.com. Can you all help shed some light on the proper way to no-index our pre-prod site without impacting our live site?
Technical SEO | | paulwatley0 -
Timely use of robots.txt and meta noindex
Hi, I have been checking every possible resources for content removal, but I am still unsure on how to remove already indexed contents. When I use robots.txt alone, the urls will remain in the index, however no crawling budget is wasted on them, But still, e.g having 100,000+ completely identical login pages within the omitted results, might not mean anything good. When I use meta noindex alone, I keep my index clean, but also keep Googlebot busy with indexing these no-value pages. When I use robots.txt and meta noindex together for existing content, then I suggest Google, that please ignore my content, but at the same time, I restrict him from crawling the noindex tag. Robots.txt and url removal together still not a good solution, as I have failed to remove directories this way. It seems, that only exact urls could be removed like this. I need a clear solution, which solves both issues (index and crawling). What I try to do now, is the following: I remove these directories (one at a time to test the theory) from the robots.txt file, and at the same time, I add the meta noindex tag to all these pages within the directory. The indexed pages should start decreasing (while useless page crawling increasing), and once the number of these indexed pages are low or none, then I would put the directory back to robots.txt and keep the noindex on all of the pages within this directory. Can this work the way I imagine, or do you have a better way of doing so? Thank you in advance for all your help.
Technical SEO | | Dilbak0 -
Site maps, Is there any benefit?
I have a relatively simple and small site (60 pages). All of it is crawlable and there is nothing I want non follow. So, is there any real benefit to a sitemap since Google can get to all the site anyway? Do they give the site more credence or something because it's there? I guess as an aside, are there any favorite sites that will generate a site map? Thanks!
Technical SEO | | Banknotes0 -
Video submission sites
Hello, What are the top 5 sites for video submissions ? Any suggestions about which points should be taken into consideration when submitting videos ? Thanks
Technical SEO | | seoug_20050 -
Replacing a site map
We are in the process of changing our folder/url structure. Currently we have about 5 sitemaps submitted to Google. How is it best to deal with these site maps in terms of either (a) replacing the old URLs with the new ones in the site map and (b) what affect should we have if we removed the site map submission from the Google Webmaster Tools console. Basically we have in the region of 20,000 urls to redirect to the new format, and to update in the site map.
Technical SEO | | NeilTompkins0 -
Google and QnA sites
My website has a QnA site - a bit like this one except it's not private to premium members. It is a page with a left colomn for category links and it has a list of recently asked questions, each question is a link to view the full question and answers etc. Does google know this is a QnA ? Or will it say - hey, there are far too many links on this page, tut tut. Is there anything I can do to help it understand what the page is.
Technical SEO | | borderbound0 -
Robots.txt Syntax
Does the order of the robots.txt syntax matter in SEO? For example (are there potential problems with this format): User-agent: * Sitemap: Disallow: /form.htm Allow: / Disallow: /cgnet_directory
Technical SEO | | RodrigoStockebrand0 -
Does Google take user site blockings from Chrome as a spam signal?
When you perform a search in Chrome, click through to a result, then hit "back", you get a nice little option to "Block all example.com results" listed next to the result from which you backed out. I am assuming Google collects this information from Chrome users whose settings allow them to? I am assuming this is a spam signal (in aggregate)? Anyone know? Thanks!
Technical SEO | | TheEspresseo0