How to Hide Directories in Search?
-
I noticed bad 404 error links in Google Webmaster Tools and they were pointing to directories that do not have an actual page, but hold information.
Ex: there are links pointing to our PDF folder, which holds all of our PDF documents. If I type in example.com/pdf/, it brings up an unformatted webpage that displays all of our PDF links.
How do I prevent this from happening? Right now I am blocking these in my robots.txt file, but if I type them in, they still appear.
Or should I not worry about this?
-
Yes, a visit to example.com/dir should now return a 404 error (if you haven't done any redirecting/canonicalizing). This will increase your 404 count in Webmaster Tools, but it's far preferable to the alternative. If you're not redirecting, the robots.txt block will eventually work and hopefully the links will just fall out of WMT.
-
My hosting company turned off directory browsing and now everything is how it should be. So, to my understanding, if the server sees a directory that does not have an index file, it should not be viewable and should be forbidden. This should not affect us from an SEO standpoint, should it? My hosting company said they disabled directory browsing for all directories in our site; however, everything still works, except for the now-forbidden file directories.
-
Basically it shouldn't really have an effect; those unformatted file listings are literally the web server automatically saying 'here are the files that are in this folder'. There are no meta tags, descriptions, on-page elements, etc.
If you have these pages and they're ranking well, you generally don't want them to be. The automatic file browsing pages don't have your name, your company, etc. in them, and they're generally pretty ugly. They also theoretically could be 'stealing' juice from your 'real' pages, if your internal structure isn't flowing relevance properly.
Basically what I'm saying is that if these pages are having some kind of SEO effect, you probably don't want them to be since they're so basic.
Also I can't overstate the security concerns that directory browsing might be introducing. If someone can directory browse to where your code lives (.php, .aspx.vb, whatever) they may be able to read it. Code sometimes has important things like logins, passwords, merchant account ids, etc. in it that you definitely don't want people reading.
-
Agreed with Valerie that step 1 is to turn off those directory listing pages - that can be a security issue and you don't necessarily want people to see/access the whole list. Also, make doubly sure you don't have any internal links to that directory (Google crawled it somehow).
Generally, Robots.txt should prevent crawling, but it's not foolproof, and it's pretty bad about removing pages once they're indexed. If you can block the page from browsing and return a 404 for the root page, that should be fine. The other option would be to have the page removed in Google Webmaster Tools. You could request removal for the entire folder, but I'm guessing that you may want the actual PDFs indexed.
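To see what a robots.txt rule actually covers, here is a rough sketch using Python's standard urllib.robotparser. The robots.txt content and the file name brochure.pdf are made up for illustration, and this parser only does simple prefix matching, so it may not behave exactly like Google's crawler. It does show the trade-off mentioned above: disallowing the folder also disallows the PDFs inside it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration; not taken from the real site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /pdf/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url, user_agent="*"):
    """Return True if a well-behaved crawler may fetch `url` under ROBOTS_TXT."""
    return parser.can_fetch(user_agent, url)


# The bare directory and everything under it are blocked for crawlers,
# but a browser typing the URL is not affected by robots.txt at all.
for url in (
    "http://example.com/pdf/",
    "http://example.com/pdf/brochure.pdf",  # hypothetical file name
    "http://example.com/index.html",
):
    print(url, is_crawlable(url))
```

If you want the PDFs themselves crawled, blocking the whole folder this way is probably not what you want; turning off the directory listing on the server is the cleaner fix.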
-
Will turning off directory browsing affect search for all directories?
-
I really don't want to 301 redirect them, as they are just holding files. This is happening with my includes folder too, which holds our header, footer, navigation, etc. I can check with our hosting company to find out.
-
I'd create an index.html for the directory, and then redirect it somewhere. This way, you're capturing the inbound links and then rescuing some of the inbound juice.
Otherwise, you can also check out this post for more info on other solutions and on modifying your .htaccess file to prevent the directory view - http://perishablepress.com/better-default-directory-views-with-htaccess/
-
Blocking it in robots.txt will work to hide it from search engines.
If you want to hide it from users or people who type in the URL, you can simply drop a blank "index.html" in the /pdf folder.
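If there are several folders to cover, the blank index file trick could be scripted. This is only a sketch: the function name add_blank_index and the list of index file names are my own assumptions, and which names your server actually treats as an index depends on its configuration.

```python
from pathlib import Path

# Assumed set of index file names; the real list depends on the server config.
INDEX_NAMES = ("index.html", "index.htm", "index.php")


def add_blank_index(directory):
    """Create an empty index.html in `directory` unless an index file exists.

    Returns True if a new file was created, False if one was already there.
    """
    directory = Path(directory)
    if any((directory / name).is_file() for name in INDEX_NAMES):
        return False
    (directory / "index.html").touch()
    return True
```

Calling add_blank_index on the local copy of the /pdf folder would leave any existing index page alone and only add the empty file where none exists. Visitors then get a blank page instead of the file listing.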
-
I would suggest 301'ing them to their /index.htm or /pdf.htm equivalents. If you don't know, a 301 is a signal to a web browser (or search crawler) saying "this page has permanently moved, please go to (otherpage.htm) instead".
Here's a good SEOMoz article explaining it a bit more:
http://www.seomoz.org/learn-seo/redirection
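To make the 301 idea concrete, here is a minimal sketch using Python's built-in http.server. It is not how your host would set this up (that would normally be done in the web server configuration); the REDIRECTS table, redirect_target and RedirectHandler are names I made up, and the /pdf/ to /pdf.htm mapping is just the example from above.

```python
from http.server import BaseHTTPRequestHandler
from urllib.parse import urlsplit

# Hypothetical mapping from bare directory URLs to their replacement pages.
REDIRECTS = {"/pdf/": "/pdf.htm"}


def redirect_target(request_path):
    """Return the new location for a directory URL, or None if there is none."""
    path = urlsplit(request_path).path  # drop any query string
    if not path.endswith("/"):
        path += "/"  # treat /pdf and /pdf/ the same
    return REDIRECTS.get(path)


class RedirectHandler(BaseHTTPRequestHandler):
    """Answers mapped directory URLs with a 301 and everything else with a 404."""

    def do_GET(self):
        target = redirect_target(self.path)
        if target is None:
            self.send_error(404)
            return
        self.send_response(301)  # "moved permanently"
        self.send_header("Location", target)
        self.end_headers()
```

The browser or crawler reads the Location header and requests that page instead, which is all a 301 really is.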
What might be more of a concern is that it sounds like your web server has directory browsing enabled. This could be a security issue (depending on your web server setup). Generally you don't want to expose directories if you don't have to, because it gives a potential attacker insight into your system setup. Here's an example of how to turn it off in Apache:
www.camelrichard.org/topics/Apache/Turn_OffDirectoryBrowsing
And IIS:
technet.microsoft.com/en-us/library/cc731109(v=ws.10).aspx
If you like I can confirm if you have open directories if you give me the link, either here or through private message.
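If you would rather check yourself, a rough test is to look at what comes back for the bare directory URL. The sketch below assumes an Apache-style listing, whose generated page is titled "Index of /..."; other servers format their listings differently, so treat a negative result as inconclusive. The function name is my own.

```python
import re

# Apache-style auto-generated listings use a title starting with "Index of /".
_LISTING_TITLE = re.compile(r"<title>\s*Index of /", re.IGNORECASE)


def looks_like_directory_listing(status, html):
    """Guess whether a response is an auto-generated directory listing.

    `status` is the HTTP status code and `html` the response body as text.
    """
    return status == 200 and bool(_LISTING_TITLE.search(html))
```

Feed it the status code and body from any HTTP client. A 403 or 404 for example.com/pdf/ means the listing is no longer exposed.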