I want to block search bots in crawling all my website's pages expect for homepage. Is this rule correct?
-
User-agent: *
Disallow: /*
-
-
Thanks Matt! I will surely test this one.
-
Thanks David! Will try this one.
-
Use this:
User-agent: Googlebot
Noindex: /User-agent: Googlebot
Disallow: /User-agent: *
Disallow: /This is what I use to block our dev sites from being indexed and we've had no issues.
-
Actually, there are two regex that Robots can handle - asterisk and $.
You should test this one. I think it will work (about 95% sure - tested in WMT quickly):
User-agent: *
Disallow: /
Allow: /$ -
I don't think that will work. Robots.txt doesn't handle regular expressions. You will have to explicitly list all of the folders, and files to be super sure, that nothing is indexed unless you want it to be found.
This is kind of an odd question. I haven't thought about something like this in a while. I usually want everything but a couple folders indexed. : ) I found something that may be a little more help. Try reading this.
If you're working with extensions, you can use **Disallow:/*.html$ **or php or what have you. That may get you closer to a solution.
Definitely test this with a crawler that obeys robots.txt.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Canonical URL's For Two Domains
We have two websites, one we use for Google PPC (website 1) and one (website 2) we use for everything else. The reason is we are in an industry that Google Adwords doesn't like, so we built a whole other website that removes the product descriptions as Google Adwords doesn't approve of many of them (nutrition). Right now we have that Google Adwords approved website (website 1) no-index/no-follow because we didn't want to run into potential duplicate content issues in free search, but the issue is we can't submit it to Google Shopping...as they require it to be indexable. Do you think removing the no-index/no-follow from that website 1 and adding canonical URL's pointing to website 2 would resolve this issue (being able to submit it to Google Shopping) and not cause any problems with duplicate content? I was thinking of adding the canonical tag to all pages of website 1 and point it to website 2. Does that make sense? Do you think that would work?
Intermediate & Advanced SEO | | vetofunk0 -
Blacklisted website no longer blacklisted, but will not appear on Google's search engine.
We have a client who before us, had a website that was blacklisted by Google. After we created their new website, we submitted an appeal through Google's Webmaster Tools, and it was approved. One year later, they are still unable to rank for anything on Google. The keyword we are attempting to rank for on their home page is "Day in the Life Legal Videos" which shouldn't be too difficult to rank for after a year. But their website cannot be found. What else can we do to repair this previously blacklisted website after we're already been approved by Google? After doing a link audit, we found only one link with a spam score of 7, but I highly doubt that is what is causing this website to no longer appear on Google. Here is the website in question: https://www.verdictvideos.com/
Intermediate & Advanced SEO | | rodneywarner0 -
Review website - multiple pages of reviews of same item
Hi - I've been looking at review websites like tripadvisor - I looked at Tripadvisor's hotel reviews and the reviews may extend for several pages, yet they don't change title tags and so on, on each of the multiple pages? Surely it would be a good idea, from an SEO perspective, to change title tags (and perhaps other tags) for each of the multiple pages - even if the change was slight - e.g. "The White Swan Hotel - Reviews p 1" - "The White Swan Hotel - Reviews p 2" and so on? Am I missing something?
Intermediate & Advanced SEO | | McTaggart0 -
If it's not in Webmaster Tools, is it Duplicate Title
I am showing a lot of errors in my SEOmoz reports for duplicate content and duplicate titles, many of which appear to be related to capitalization vs non-capitalization in the URL. Case in point, if a URL contains a lower character, such as: http://www.gallerydirect.com/art/product/allyson-krowitz/distinct-microstructure-i as opposed to the same URL having an upper character in the structure: http://www.gallerydirect.com/art/product/allyson-krowitz/distinct-microstructure-I I am finding that some of the internal links on the site use the former structure and other links use the latter structure. These show as duplicate title/content in the SEOmoz reports, but they don't appear as duplicate titles in Webmaster Tools. My question is, should I try to work with our developers to create a script to change all of the content with cap letters in the destination links internally on the site, or is this a non-issue since it doesn't appear in Webmaster Tools?
Intermediate & Advanced SEO | | sbaylor0 -
If other websites implement our RSS feed sidewide on there website, can that hurt our own website?
Think about the switching anchors from the backlinks and the 100s of sidewide inlinks... I gues Google will understand that it's just a RSS feed right?
Intermediate & Advanced SEO | | Zanox0 -
Robots.txt file - How to block thosands of pages when you don't have a folder path
Hello.
Intermediate & Advanced SEO | | Unity
Just wondering if anyone has come across this and can tell me if it worked or not. Goal:
To block review pages Challenge:
The URLs aren't constructed using folders, they look like this:
www.website.com/default.aspx?z=review&PG1234
www.website.com/default.aspx?z=review&PG1235
www.website.com/default.aspx?z=review&PG1236 So the first part of the URL is the same (i.e. /default.aspx?z=review) and the unique part comes immediately after - so not as a folder. Looking at Google recommendations they show examples for ways to block 'folder directories' and 'individual pages' only. Question:
If I add the following to the Robots.txt file will it block all review pages? User-agent: *
Disallow: /default.aspx?z=review Much thanks,
Davinia0 -
How to find all of a website's SERPs?
Was wondering how easiest to find all of a website's existing SERPs?
Intermediate & Advanced SEO | | McTaggart0 -
How are pages ranked when using Google's "site:" operator?
Hi, If you perform a Google search like site:seomoz.org, how are the pages displayed sorted/ranked? Thanks!
Intermediate & Advanced SEO | | anthematic0