How do I block an entire category/directory with robots.txt?
-
Does anyone have any idea how to block an entire product category, including all the products in that category, using the robots.txt file? I'm using WooCommerce in WordPress and I'd like to prevent bots from crawling every single product URL for now.
The confusing part is that I have several different URL structures linking to each of my products, for example www.mystore.com/all-products, www.mystore.com/product-category, etc.
I'm not really sure how I'd write the rules in the robots.txt file, or where to place the file.
Any help would be appreciated, thanks!
-
Thanks for the detailed answer, I will give it a try!
-
Hi
This should do it; place the robots.txt file in the root directory of your site:
User-agent: *
Disallow: /product-category/
You can check out some more examples here: http://www.seomoz.org/learn-seo/robotstxt
As for the multiple URLs pointing to the same pages, you will just need to identify all the possible variants and make sure each one is covered in the robots.txt file.
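For example, if the same products are reachable under several paths, each variant needs its own Disallow line. A sketch (the slugs here are illustrative; substitute your store's actual paths):

```
User-agent: *
Disallow: /product-category/
Disallow: /all-products/
```

Note that robots.txt matches by URL prefix, so a single Disallow line only covers URLs that start with that exact path.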
Google Webmaster Tools has a page you can use to check whether the robots.txt file is doing what you expect (under Health -> Blocked URLs).
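If you'd rather sanity-check the rules locally before deploying, Python's standard-library robots.txt parser can evaluate a draft against sample URLs. A minimal sketch, assuming placeholder paths and domain:

```python
from urllib import robotparser

# Draft robots.txt rules to test (placeholder paths).
rules = """\
User-agent: *
Disallow: /product-category/
Disallow: /all-products/
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# A product URL under a disallowed prefix should be blocked;
# an unrelated page should remain crawlable.
blocked = rp.can_fetch("*", "https://www.mystore.com/product-category/some-product/")
allowed = rp.can_fetch("*", "https://www.mystore.com/contact/")
print(blocked, allowed)  # False True
```

This only checks your rule logic, of course; the Webmaster Tools checker remains the authoritative view of what Googlebot actually sees.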
It might be easier to block the pages with a meta robots tag, as described in the link above, if you are running a plugin that allows this; that would also take care of all the different URL structures automatically.
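For reference, the meta tag approach amounts to adding something like this to the head section of each product template (exact placement depends on your theme or plugin):

```html
<!-- Tells compliant crawlers not to index this page,
     but still to follow its links. -->
<meta name="robots" content="noindex, follow">
```

One caveat: for the meta tag to be seen, the page must not also be blocked in robots.txt, since a crawler that is disallowed from fetching the page never reads the tag.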
Hope that helps!