Robots.txt
-
www.mywebsite.com**/details/**home-to-mome-4596
www.mywebsite.com**/details/**home-moving-4599
www.mywebsite.com**/details/**1-bedroom-apartment-4601
www.mywebsite.com**/details/**4-bedroom-apartment-4612 We have so many pages like this, we do not want to Google crawl this pages
So we added the following code to Robots.txt
User-agent: Googlebot
Disallow: /details/
This code is correct?
-
and this disallows everything in the /details/ folder so if there were some exceptions to the rule (some pages or sub folders in that folder) you would need to add some allow directives, or make a more specific disallow(s)
-
Looks correct. Great reccomendation by Chris. The * (wildcard) just means "all".
-
Looks good but you only want to exclude google from crawling those pages? If you want to exclude other search engines, you could use:
User-agent: *
instead of
User-agent: Googlebot
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Do robot.txts permanently affect websites even after they have been removed?
A client has a Wordpress blog to sit alongside their company website. They kept it hidden whilst they were developing what it looked like, keeping it un-searchable by Search Engines. It was still live, but Wordpress put a robots.txt in place. When they were ready they removed the robots.txt by clicking the "allow Search Engines to crawl this site" button. It took a month and a half for their blog to show in Search Engines once the robot.txt was removed. Google is now recognising the site (as a "site:" test has shown) however, it doesn't rank well for anything. This is despite the fact they are targeting keywords with very little organic competition. My question is - could the fact that they developed the site behind a robot.txt (rather than offline) mean the site is permanently affected by the robot.txt in the eyes of the Search Engines, even after that robot.txt has been removed? Thanks in advance for any light you can shed on the situation.
Technical SEO | | Driver720 -
2 sitemaps on my robots.txt?
Hi, I thought that I just could link one sitemap from my site's robots.txt but... I may be wrong. So, I need to confirm if this kind of implementation is right or wrong: robots.txt for Magento Community and Enterprise ...
Technical SEO | | Webicultors
Sitemap: http://www.mysite.es/media/sitemap/es.xml
Sitemap: http://www.mysite.pt/media/sitemap/pt.xml Thanks in advance,0 -
Blocked jquery in Robots.txt, Any SEO impact?
I've heard that Google is now indexing links and stuff available in javascript and jquery. My webmastertools is showing that some links are blocked in robots.txt of jquery. Sorry I'm not a developer or designer. I want to know is there any impact of this on my SEO? and also how can I unblock it for the robots? Check this screenshot: http://i.imgur.com/3VDWikC.png
Technical SEO | | hammadrafique0 -
Robots.txt anomaly
Hi, I'm monitoring a site thats had a new design relaunch and new robots.txt added. Over the period of a week (since launch) webmaster tools has shown a steadily increasing number of blocked urls (now at 14). In the robots.txt file though theres only 12 lines with the disallow command, could this be occurring because a line in the command could refer to more than one page/url ? They all look like single urls for example: Disallow: /wp-content/plugins
Technical SEO | | Dan-Lawrence
Disallow: /wp-content/cache
Disallow: /wp-content/themes etc, etc And is it normal for webmaster tools reporting of robots.txt blocked urls to steadily increase in number over time, as opposed to being identified straight away ? Thanks in advance for any help/advice/clarity why this may be happening ? Cheers Dan0 -
Site blocked by robots.txt and 301 redirected still in SERPs
I have a vanity URL domain that 301 redirects to my main site. That domain does have a robots.txt to disallow the entire site as well. However, for a branded enough search that vanity domain still shows up in SERPs and has the new Google message of: A description for this result is not available because of this site's robots.txt I get why the message is there - that's not my , my question is shouldn't a 301 redirect trump this domain showing in SERPs, ever? Client isn't happy about it showing at all. How can I get the vanity domain out of the SERPs? THANKS in advance!
Technical SEO | | VMLYRDiscoverability0 -
Is having no robots.txt file the same as having one and allowing all agents?
The site I am working on currently has no robots.txt file. However, I have just uploaded a sitemap and would like to point the robots.txt file to it. Once I upload the robots.txt file, if I allow access to all agents, is this the same as when the site had no robots.txt file at all; do I need to specify crawler access on can the robots.txt file just contain the link to the sitemap?
Technical SEO | | pugh0 -
How to allow one directory in robots.txt
Hello, is there a way to allow a certain child directory in robots.txt but keep all others blocked? For instance, we've got external links pointing to /user/password/, but we're blocking everything under /user/. And there are too many /user/somethings/ to just block every one BUT /user/password/. I hope that makes sense... Thanks!
Technical SEO | | poolguy0 -
Robots.txt usage
Hey Guys, I am about make an important improvement to our site's robots.txt we have large number of properties on our site and we have different views for them. List, gallery and map view. By default list view shows up and user can navigate through gallery view. We donot want gallery pages to get indexed and want to save our crawl budget for more important pages. this is one example of our site: http://www.holiday-rentals.co.uk/France/r31.htm When you click on "gallery view" URL of this site will remain same in your address bar: but when you mouse over the "gallery view" tab it will show you URL with parameter "view=g". there are number of parameters: "view=g, view=l and view=m". http://www.holiday-rentals.co.uk/France/r31.htm?view=l http://www.holiday-rentals.co.uk/France/r31.htm?view=g http://www.holiday-rentals.co.uk/France/r31.htm?view=m Now my question is: I If restrict bots by adding "Disallow: ?view=" in our robots.txt will it effect the list view too? Will be very thankful if yo look into this for us. Many thanks Hassan I will test this on some other site within our network too before putting it to important one's. to measure the impact but will be waiting for your recommendations. Thanks
Technical SEO | | holidayseo0