Robots.txt
-
www.mywebsite.com**/details/**home-to-mome-4596
www.mywebsite.com**/details/**home-moving-4599
www.mywebsite.com**/details/**1-bedroom-apartment-4601
www.mywebsite.com**/details/**4-bedroom-apartment-4612 We have so many pages like this, we do not want to Google crawl this pages
So we added the following code to Robots.txt
User-agent: Googlebot
Disallow: /details/
This code is correct?
-
and this disallows everything in the /details/ folder so if there were some exceptions to the rule (some pages or sub folders in that folder) you would need to add some allow directives, or make a more specific disallow(s)
-
Looks correct. Great reccomendation by Chris. The * (wildcard) just means "all".
-
Looks good but you only want to exclude google from crawling those pages? If you want to exclude other search engines, you could use:
User-agent: *
instead of
User-agent: Googlebot
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Confirming Robots.txt code deep Directories
Just want to make sure I understand exactly what I am doing If I place this in my Robots.txt Disallow: /root/this/that By doing this I want to make sure that I am ONLY blocking the directory /that/ and anything in front of that. I want to make sure that /root/this/ still stays in the index, its just the that directory I want gone. Am I correct in understanding this?
Technical SEO | | cbielich0 -
Using Robots.txt
I want to Block or prevent pages being accessed or indexed by googlebot. Please tell me if googlebot will NOT Access any URL that begins with my domain name, followed by a question mark,followed by any string by using Robots.txt below. Sample URL http://mydomain.com/?example User-agent: Googlebot Disallow: /?
Technical SEO | | semer0 -
Does Bing ignore robots txt files?
Bonjour from "Its a miracle is not raining" Wetherby Uk 🙂 Ok here goes... Why despite a robots text file excluding indexing to site http://lewispr.netconstruct-preview.co.uk/ is the site url being indexed in Bing bit not Google? Does bing ignore robots text files or is there something missing from http://lewispr.netconstruct-preview.co.uk/robots.txt I need to add to stop bing indexing a preview site as illustrated below. http://i216.photobucket.com/albums/cc53/zymurgy_bucket/preview-bing-indexed.jpg Any insights welcome 🙂
Technical SEO | | Nightwing0 -
Impact of "restricted by robots" crawler error in WT
I have been wondering about this for a while now with regards to several of my sites. I am getting a list of pages that I have blocked in the robots.txt file. If I restrict Google from crawling them, then how can they consider their existence an error? In one case, I have even removed the urls from the index. And do you have any idea of the negative impact associated with these errors. And how do you suggest I remedy the situation. Thanks for the help
Technical SEO | | phogan0 -
Robots.txt question
What is this robots.txt telling the search engines? User-agent: * Disallow: /stats/
Technical SEO | | DenverKelly0 -
Robots.txt usage
Hey Guys, I am about make an important improvement to our site's robots.txt we have large number of properties on our site and we have different views for them. List, gallery and map view. By default list view shows up and user can navigate through gallery view. We donot want gallery pages to get indexed and want to save our crawl budget for more important pages. this is one example of our site: http://www.holiday-rentals.co.uk/France/r31.htm When you click on "gallery view" URL of this site will remain same in your address bar: but when you mouse over the "gallery view" tab it will show you URL with parameter "view=g". there are number of parameters: "view=g, view=l and view=m". http://www.holiday-rentals.co.uk/France/r31.htm?view=l http://www.holiday-rentals.co.uk/France/r31.htm?view=g http://www.holiday-rentals.co.uk/France/r31.htm?view=m Now my question is: I If restrict bots by adding "Disallow: ?view=" in our robots.txt will it effect the list view too? Will be very thankful if yo look into this for us. Many thanks Hassan I will test this on some other site within our network too before putting it to important one's. to measure the impact but will be waiting for your recommendations. Thanks
Technical SEO | | holidayseo0 -
Client accidently blocked entire site with robots.txt for a week
Our client was having a design firm do some website development work for them. The work was done on a staging server that was blocked with a robots.txt to prevent duplicate content issues. Unfortunately, when the design firm made the changes live, they also moved over the robots.txt file, which blocked the good, live site from search for a full week. We saw the error (!) as soon as the latest crawl report came in. The error has been corrected, but... Does anyone have any experience with a snafu like this? Any idea how long it will take for the damage to be reversed and the site to get back in the good graces of the search engines? Are there any steps we should take in the meantime that would help to rectify the situation more quickly? Thanks for all of your help.
Technical SEO | | pixelpointpress0 -
Subdomain Robots.txt
If I have a subdomain (a blog) that is having tags and categories indexed when they should not be, because they are creating duplicate content. Can I block them using a robots.txt file? Can I/do I need to have a separate robots file for my subdomain? If so, how would I format it? Do I need to specify that it is a subdomain robots file, or will the search engines automatically pick this up? Thanks!
Technical SEO | | JohnECF0