Robots.txt
-
Hi All
Having a robots.txt looking like the below will this stop Google crawling the site
User-agent: *
-
If that is the only line in your robots.txt file then it really shouldn't accomplish anything. It's like saying, "Hey...all search engines...take note of this....oh forget it, there's nothing to see here."
I agree with Dave...try to fetch the page in Webmaster tools (Google Search Console). You can also use the Webmaster Tools robots.txt tester which often will tell you if there are issues.
- Hi this is what we thought but Google has not indexed any pages
How old is the site? It can take weeks for a new site to get indexed and then to get ranked as well. Do you see any pages on a site: search for your domain? (i.e. site:example.com). This might sound silly, but are you sure that there is no noindex tag on the page?
-
Via Search Console try to "Fetch As Google" and assuming that works without errors use the submit function. You'll know very quickly whether you've got technical issues and get the page into the index very quickly.
-
Hey, throw us a link to your robots.txt file and we can take a look, probably tell you pretty quickly. Without seeing it, we're all pretty much just taking guesses.
-
David's spot on. The User-agent: * mean this section applies to all robots. If you want Google (or any robot) to index your whole site, no need for a robots.txt file.
-
From your original post I presumed you had not wanted Google to index your pages?
If you want Google to index your pages it can take some time to happen naturally. You might want to submit a sitemap and ask Google to crawl your site within Webmaster Tools.
Robots.txt is normally only used to block crawlers, so you will not need to put any code in there for it to allow Google to crawl.
-
Hi this is what we thought but Google has not indexed any pages
-
Wouldn't have thought so, you'd need to include this line of code as well:
Disallow: /
That will stop anything from crawling the site.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Robots.txt Syntax for Dynamic URLs
I want to Disallow certain dynamic pages in robots.txt and am unsure of the proper syntax. The pages I want to disallow all include the string ?Page= Which is the proper syntax?
Technical SEO | | btreloar
Disallow: ?Page=
Disallow: ?Page=*
Disallow: ?Page=
Or something else?0 -
Why is robots.txt blocking URL's in sitemap?
Hi Folks, Any ideas why Google Webmaster Tools is indicating that my robots.txt is blocking URL's linked in my sitemap.xml, when in fact it isn't? I have checked the current robots.txt declarations and they are fine and I've also tested it in the 'robots.txt Tester' tool, which indicates for the URL's it's suggesting are blocked in the sitemap, in fact work fine. Is this a temporary issue that will be resolved over a few days or should I be concerned. I have recently removed the declaration from the robots.txt that would have been blocking them and then uploaded a new updated sitemap.xml. I'm assuming this issue is due to some sort of crossover. Thanks Gaz
Technical SEO | | PurpleGriffon0 -
'External nofollow' in a robots meta tag? (advertorial links)
I believe this has never worked? It'd be an easy way of preventing any penalties from Google's recent crackdown on paid links via advertorials. When it's not possible to nofollow each external link individually, what are people doing? Nofollowing and/or noindexing the whole page?
Technical SEO | | Alex-Harford0 -
Site blocked by robots.txt and 301 redirected still in SERPs
I have a vanity URL domain that 301 redirects to my main site. That domain does have a robots.txt to disallow the entire site as well. However, for a branded enough search that vanity domain still shows up in SERPs and has the new Google message of: A description for this result is not available because of this site's robots.txt I get why the message is there - that's not my , my question is shouldn't a 301 redirect trump this domain showing in SERPs, ever? Client isn't happy about it showing at all. How can I get the vanity domain out of the SERPs? THANKS in advance!
Technical SEO | | VMLYRDiscoverability0 -
Magento Robots & overly dynamic URL-s
How can i block all URL-s on a Magento store that have 2 or more dynamic parameters in it, since all the parameters have attribute name in it and not some uniform ID Would something like: Disallow: /?&* work? Since the only thing that is constant throughout all the custom parameters is that they are separated with "&" Thanks 🙂
Technical SEO | | tilenkrivec0 -
Disallow: /search/ in robots but soft 404s are still showing in GWT and Google search?
Hi guys, I've already added the following syntax in robots.txt to prevent search engines in crawling dynamic pages produce by my website's search feature: Disallow: /search/. But soft 404s are still showing in Google Webmaster Tools. Do I need to wait(it's been almost a week since I've added the following syntax in my robots.txt)? Thanks, JC
Technical SEO | | esiow20130 -
Oh no googlebot can not access my robots.txt file
I just receive a n error message from google webmaster Wonder it was something to do with Yoast plugin. Could somebody help me with troubleshooting this? Here's original message Over the last 24 hours, Googlebot encountered 189 errors while attempting to access your robots.txt. To ensure that we didn't crawl any pages listed in that file, we postponed our crawl. Your site's overall robots.txt error rate is 100.0%. Recommended action If the site error rate is 100%: Using a web browser, attempt to access http://www.soobumimphotography.com//robots.txt. If you are able to access it from your browser, then your site may be configured to deny access to googlebot. Check the configuration of your firewall and site to ensure that you are not denying access to googlebot. If your robots.txt is a static page, verify that your web service has proper permissions to access the file. If your robots.txt is dynamically generated, verify that the scripts that generate the robots.txt are properly configured and have permission to run. Check the logs for your website to see if your scripts are failing, and if so attempt to diagnose the cause of the failure. If the site error rate is less than 100%: Using Webmaster Tools, find a day with a high error rate and examine the logs for your web server for that day. Look for errors accessing robots.txt in the logs for that day and fix the causes of those errors. The most likely explanation is that your site is overloaded. Contact your hosting provider and discuss reconfiguring your web server or adding more resources to your website. After you think you've fixed the problem, use Fetch as Google to fetch http://www.soobumimphotography.com//robots.txt to verify that Googlebot can properly access your site.
Technical SEO | | BistosAmerica0 -
How to add a disclaimer to a site but keep the content accessible to search robots?
Hi, I have a client with a site regulated by the UK FSA (Financial Services Authority). They have to display a disclaimer which visitor must accept before browsing. This is for real, not like the EU cookie compliance debacle 🙂 Currently the site 302 redirects anyone not already cookied (as having accepted) to a disclaimer page/form. Do you have any suggestions or examples of how to require acceptance while maintaining accessibility? I'm not sure just using a jquery lightbox would meet the FSA's requirements, as it wouldn't be shown if JS was not enabled. Thanks, -Jason
Technical SEO | | GroupM_APAC0