I want to block search bots from crawling all of my website's pages except for the homepage. Is this rule correct?
-
User-agent: *
Disallow: /*
-
-
Thanks Matt! I will surely test this one.
-
Thanks David! Will try this one.
-
Use this:
User-agent: Googlebot
Noindex: /
User-agent: Googlebot
Disallow: /
User-agent: *
Disallow: /
This is what I use to block our dev sites from being indexed and we've had no issues.
-
Actually, there are two wildcard characters that robots.txt can handle: the asterisk (*) and the dollar sign ($).
You should test this one. I think it will work (about 95% sure - tested in WMT quickly):
User-agent: *
Disallow: /
Allow: /$
-
I don't think that will work. Robots.txt doesn't handle regular expressions. To be super sure that nothing is indexed unless you want it to be found, you will have to explicitly list all of the folders and files.
This is kind of an odd question. I haven't thought about something like this in a while. I usually want everything but a couple folders indexed. : ) I found something that may be a little more help. Try reading this.
If you're working with extensions, you can use **Disallow: /*.html$** or .php or what have you. That may get you closer to a solution.
Definitely test this with a crawler that obeys robots.txt.
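For a quick local check before deploying, the rules suggested in this thread can be run through a small matcher. The sketch below is not any crawler's actual implementation; it assumes the matching behaviour the major search engines document: `*` matches any run of characters, a trailing `$` anchors the end of the URL, the longest matching pattern wins, and Allow wins a tie. The names `pattern_to_regex`, `is_allowed` and `HOMEPAGE_ONLY` are made up for illustration. Python's built-in `urllib.robotparser` is not used here because it does not interpret these wildcards.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters and a trailing '$' anchors the end
    of the URL. Everything else is matched literally, as a prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_allowed(path: str, rules: list[tuple[str, str]]) -> bool:
    """Return True if `path` may be crawled under `rules`.

    `rules` is a list of (directive, pattern) pairs such as ("disallow", "/").
    Assumed precedence: longest matching pattern wins, Allow wins a tie,
    and a path that matches no rule is allowed.
    """
    best_len = -1
    allowed = True
    for directive, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing
        if pattern_to_regex(pattern).match(path):
            is_allow = directive.lower() == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len = len(pattern)
                allowed = is_allow
    return allowed


# The "Disallow: /" plus "Allow: /$" suggestion from this thread.
HOMEPAGE_ONLY = [("disallow", "/"), ("allow", "/$")]

for path in ["/", "/about", "/index.html", "/blog/post-1"]:
    print(path, is_allowed(path, HOMEPAGE_ONLY))
```

Under these assumptions only `/` comes back as allowed. Two things are worth noticing: a homepage URL with a query string (for example `/?utm=1`) is blocked because `/$` no longer matches it, and the rule from the original question, `Disallow: /*`, also matches `/` and so blocks the homepage as well. A real crawler or the search engine's own robots.txt tester is still the final word.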