I want to block search bots from crawling all my website's pages except for the homepage. Is this rule correct?

esiow2013

User-agent: *

Disallow: /*

GPainter

Some great answers. You can also find a list of all the robots here & here.

Depending on your site, you can also, for example, hide the rest of your site behind a login screen or a form which bots won't fill in.

esiow2013

Thanks Matt! I will surely test this one.

esiow2013

Thanks David! Will try this one.

Kingof5

Use this:

User-agent: Googlebot
Noindex: /

User-agent: Googlebot
Disallow: /

User-agent: *
Disallow: /

This is what I use to block our dev sites from being indexed and we've had no issues.

MattAntonino

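Actually, there are two regex-style wildcards that robots.txt can handle: the asterisk (*), which matches any sequence of characters, and $, which marks the end of the URL.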

You should test this one. I think it will work (about 95% sure - tested in WMT quickly):

User-agent: *
Disallow: /
Allow: /$
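If you want to sanity-check a rule like this before deploying it, here is a minimal Python sketch of the matching logic. It assumes Google-style behavior: `*` matches any run of characters, a trailing `$` anchors the end of the URL, the longest matching pattern wins, and Allow wins a tie. The function names `pattern_to_regex` and `is_allowed` are made up for illustration, and other crawlers may resolve conflicts differently.

```python
import re

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape everything literally, then join the pieces with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))

def is_allowed(path, rules):
    """Return True if `path` may be crawled under `rules`.

    `rules` is a list of (directive, pattern) pairs, with directive being
    "allow" or "disallow". Assumption: the longest matching pattern wins,
    Allow wins ties, and a path that matches no rule is allowed.
    """
    best = None
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            key = (len(pattern), directive == "allow")
            if best is None or key > best:
                best = key
    return True if best is None else best[1]

# The rule set from this post, for "User-agent: *".
RULES = [("disallow", "/"), ("allow", "/$")]

for path in ["/", "/about", "/blog/post.html", "/?utm=x"]:
    print(path, "allowed" if is_allowed(path, RULES) else "blocked")
```

Under those assumptions only `/` comes out as allowed. Note that a homepage URL with a query string, such as `/?utm=x`, is still blocked, because `/$` only matches the bare path.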

Travis_Bailey

I don't think that will work. Robots.txt doesn't handle regular expressions. To be super sure that nothing is indexed unless you want it to be found, you will have to explicitly list all of the folders and files.

This is kind of an odd question. I haven't thought about something like this in a while. I usually want everything but a couple folders indexed. : ) I found something that may be a little more help. Try reading this.

If you're working with extensions, you can use **Disallow: /*.html$** or php or what have you. That may get you closer to a solution.
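To see what the trailing `$` changes in an extension rule, here is a small Python sketch. It assumes the wildcard semantics used by crawlers that support them (`*` for any characters, `$` for end of URL). The helper `rule_matches` is hypothetical and is not part of any robots.txt library.

```python
import re

def rule_matches(pattern, path):
    """Check whether a robots.txt path pattern matches a URL path.

    Assumption: '*' matches any run of characters and a trailing '$'
    means the pattern must match through the end of the URL.
    """
    anchored = pattern.endswith("$")
    core = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(piece) for piece in core.split("*"))
    if anchored:
        regex += r"\Z"
    return re.match(regex, path) is not None

# Compare the anchored and unanchored forms of the same extension rule.
for path in ["/page.html", "/page.html?x=1", "/page.php"]:
    print(path,
          "| /*.html$ :", rule_matches("/*.html$", path),
          "| /*.html :", rule_matches("/*.html", path))
```

With the `$`, a URL like `/page.html?x=1` is not matched, so if you need to cover query strings too you would drop the `$` or add a second rule.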

Definitely test this with a crawler that obeys robots.txt.
