How to prevent directory from being accessed by search engines?
-
Pretty much as the question says, is there any way to stop search engines from crawling a directory? I am working on a Wordpress installation for my site but don't want it to be listed in search engines until it's ready to be shown to the world. I know the simplest way is to password-protect the directory but I had some issues when I tried to implement that so I'd like to see if there's a way to do it without passwords. Thanks in advance.
-
But don't forget to remove that Disallow out of Robots.txt when you go live - if you want those pages to be indexed (and also the Meta-robots noindex nofollow).
Otherwise you might be pulling your hair out trying to figure out why none of your pages are getting indexed in the SERPs.
-
You're absolutely right! I left that part out. Thanks
-
The robots.txt file does not guarantee that your pages will not show up in search results! Your best bet after password protection is adding a NoIndex meta tag to you page headers.
Google have openly said that they obey this tag (Matt Cutts).
-
Xee,
It always help, and it is very easy to implement. This function to show the path to the sitemap ir very good.
-
It's not required to have the ending slash. At least, it works for us without it.
-
As it is, my site is just phpBB3 forums (www.bearsfansonline.com); would a sitemap really help that much?
-
If you don't have an robot.txt file, you need to include some important stuff first.
First, do you have a sitemap.xlm for your website? If not, its very important and you should creat it at: http://www.xml-sitemaps.com/
Create a robot.txt file and include the follow:
User-agent: * allow: / disallow: /directoryname
Sitemap: http://www.yousite.com/sitemap.xmlWith this you will inform all robots where is your sitemap. You should read more about robots.txt in this great post: http://www.seomoz.org/blog/robot-access-indexation-restriction-techniques-avoiding-conflicts
-
shouldn't you put a slash at the end of the directory in the robots file?
you can create the robots file through the Google Webmaster Tools
-
I don't have a robots.txt file in my root. Do I just create a text file, put the above lines into it, and upload it to my root after changing the name?
-
I'm assuming you want all search engines blocked from this directory. If so, edit your robots.txt file to state the following. This will block all bots from accessing a folder/directory on your site
User-agent: *
Disallow: /directoryname
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Duplicate Content for Locations on my Directory Site
I have a pretty big directory site using Wordpress with lots of "locations", "features", "listing-category" etc.... Duplicate Content: https://www.thecbd.co/location/california/ https://www.thecbd.co/location/canada/ referring URL is www.thecbd.co is it a matter of just putting a canonical URL on each location, or just on the main page? Would this be the correct code to put: on the main page? Thanks Everyone!
Technical SEO | | kay_nguyen0 -
My site Has Penalized By google Search Result Without Any Spam Score.
I Recently Make a Site Gizmocombot.com. tHE aITE has NO spam Record NO lousy BACKLINK.it has all unique article can anyone tell us how we can unpenalized our site from google webmaster and google search Result. i attcead a screenshot as well yoou need. 3nzmALp
Technical SEO | | litoginamaaba3332 -
Adding /es version to google search console
I have a Wordpress site and we are using WPML for making it bilingual. The domain is: https://www.designerfreelance.net and for Spanish https://www.designerfreelance.net/es Do I have to add to Google search console the /es version? And the no www: https://www.designerfreelance.net https://www.designerfreelance.net/es https://designerfreelance.net https://designerfreelance.net/es and do I have to add the non ssl version? http://www.designerfreelance.net http://www.designerfreelance.net/es http://designerfreelance.net http://designerfreelance.net/es Thanks
Technical SEO | | Trazo0 -
Need Help On Proper Steps to Take To De-Index Our Search Results Pages
So, I have finally decided to remove our Search Results pages from Google. This is a big dealio, but our traffic has consistently been declining since 2012 and it's the only thing I can think of. So, the reason they got indexed is back in 2012, we put linked tags on our product pages, but they linked to our search results pages. So, over time we had hundreds of thousands of search results pages indexed. By tag pages I mean: Keywords: Kittens, Doggies, Monkeys, Dog-Monkeys, Kitten-Doggies Each of these would be linked to our search results pages, i.e. http://oursite.com/Search.html?text=Kitten-Doggies So, I really think these pages being indexed are causing much of our traffic problems as there are many more Search Pages indexed than actual product pages. So, my question is... Should I go ahead and remove the links/tags on the product pages first? OR... If I remove those, will Google then not be able to re-crawl all of the search results pages that it has indexed? Or, if those links are gone will it notice that they are gone, and therefore remove the search results pages they were previously pointing to? So, Should I remove the links/tags from the product page (or at least decrease them down to the top 8 or so) as well as add the no-follow no-index to all the Search Results pages at the same time? OR, should I first no-index, no-follow ALL the search results pages and leave those tags on the product pages there to give Google a chance to go back and follow those tags to all of the Search Results pages so that it can get to all of those Search Results pages in order to noindex,. no follow them? Otherwise will Google not be able find these pages? Can someone comment on what might be the best, safest, or fastest route? Thanks so much for any help you might offer me!! Craig So, I wanted to see if you have a suggestion on the best way to handle it? Should I remove the links/tags from the product page (or at least decrease them down to the top 8 or so) as well as add the no-follow no-index to all the Search Results pages at the same time? OR, should I first no-index, no-follow ALL the search results pages and leave those tags on the product pages there to give Google a chance to go back and follow those tags to all of the Search Results pages so that it can get to all of those Search Results pages in order to noindex,. no follow them? Otherwise will Google not be able find these pages? Can you tell me which would be the best, fastest and safest routes?
Technical SEO | | TheCraig0 -
Duplicate content with "no results found" search result pages
We have a motorcycle classifieds section that lets users search for motorcycles for sale using various drop down menus to pick year-make-type-model-trim, etc.. These search results create urls such as:
Technical SEO | | seoninjaz
www.example.com/classifieds/search.php?vehicle_manufacturer=Triumph&vehicle_category=On-Off Road&vehicle_model=Tiger&vehicle_trim=800 XC ABS We understand that all of these URL varieties are considered unique URLs by Google. The issue is that we are getting duplicate content errors on the pages that have no results as they have no content to distinguish themselves from each other. A URL like:
www.example.com/classifieds/search.php?vehicle_manufacturer=Triumph&vehicle_category=Sportbike
and
www.example.com/classifieds/search.php?vehicle_manufacturer=Honda&vehicle_category=Streetbike Will have a results page that says "0 results found". I'm wondering how we can distinguish these "unique" pages better? Some thoughts:
-make sure <title>reflects what was search<br />-add a heading that may say "0 results found for Triumph On-Off Road Tiger 800 XC ABS"<br /><br />Can anyone please help out and lend some ideas in solving this? <br /><br />Thank you.</p></title>0 -
How to remove entire directory off Google's Cache
The old version of Webmaster tools used to allow you to select whether to remove a single page from index or an entire directory.
Technical SEO | | vpahwa
http://www.canig.com/pageimages/submitremovalrequest.jpg How can I do this with the new Webmaster Tools? I can't find the option to remove an entire directory.0 -
Yahoo/Bing Search
Are there current articles that describe the differences in Yahoo/Bing vs Google SERPs? We have solid top 10 rankings in Google for a bunch of competitive keywords but aren't ranking well in Yahoo.
Technical SEO | | blueroom0 -
How does having an SSL impact search optimization?
My site has a forced redirect on the home page to https://. But that's only on the home page...other site pages are visible. How does this impact search engines crawling my site?
Technical SEO | | AquariumDigital0