How to Hide Directories in Search?
-
I noticed bad 404 error links in Google Webmaster Tools, and they were pointing to directories that do not have an actual page, but hold information.
Ex: there are links pointing to our PDF folder, which holds all of our PDF documents. If I type in example.com/pdf/, it brings up an unformatted webpage that displays all of our PDF links.
How do I prevent this from happening? Right now I am blocking these in my robots.txt file, but if I type them in, they still appear.
Or should I not worry about this?
-
Yes, a visit to example.com/dir should now return a 404 error (if you haven't done any redirecting/canonicalizing). This will increase your 404 count in Webmaster Tools, but it's far preferable to the alternative. If you're not redirecting, the robots.txt will eventually work and hopefully the links will just fall out of WMT.
-
My hosting company turned off directory browsing and now everything is how it should be. So to my understanding, if the server sees a directory that does not have an index file, it should not be viewable and should be forbidden. This should not affect us from an SEO standpoint, should it? My hosting company said they disabled browsing for all directories in our site; however, everything still works, except for the forbidden file directories.
-
Basically it shouldn't really have an effect; those unformatted file listings are literally the web server automatically saying 'here's the files that are in this folder'. There are no meta tags, description, on-page elements, etc.
If you have these pages and they're ranking well, you generally don't want them to be. The automatic file browsing pages don't have your name, your company, etc. in them, and they're generally pretty ugly. They also theoretically could be 'stealing' juice from your 'real' pages, if your internal structure isn't flowing relevance properly.
Basically what I'm saying is that if these pages are having some kind of SEO effect, you probably don't want them to be since they're so basic.
Also I can't overstate the security concerns that directory browsing might be introducing. If someone can directory browse to where your code lives (.php, .aspx.vb, whatever) they may be able to read it. Code sometimes has important things like logins, passwords, merchant account ids, etc. in it that you definitely don't want people reading.
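To make the "no listing" behavior concrete, here is a minimal sketch using Python's built-in http.server rather than Apache or IIS. It is only an illustration of the idea, not how any particular host does it; the names NoListingHandler and make_server are made up for this example.

```python
from functools import partial
from http import HTTPStatus
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


class NoListingHandler(SimpleHTTPRequestHandler):
    """Serve files as usual, but never show an auto-generated folder listing."""

    def list_directory(self, path):
        # The stock handler calls this only when a directory has no index file.
        # Answer 403 Forbidden instead of building the "here are the files" page.
        self.send_error(HTTPStatus.FORBIDDEN, "Directory listing is disabled")
        return None


def make_server(root, port=0):
    """Build (but do not start) a server for `root`; port 0 picks a free port."""
    handler = partial(NoListingHandler, directory=root)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Calling make_server(some_folder, 8000).serve_forever() would serve the files in that folder normally, while a request for a bare folder URL such as /pdf/ gets a 403 instead of a file list.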
-
Agreed with Valerie that step 1 is to turn off those directory listing pages - that can be a security issue and you don't necessarily want people to see/access the whole list. Also, make doubly sure you don't have any internal links to that directory (Google crawled it somehow).
Generally, robots.txt should prevent crawling, but it's not foolproof, and it's pretty bad about removing pages once they're indexed. If you can block the page from browsing and return a 404 for the root page, that should be fine. The other option would be to have the page removed in Google Webmaster Tools. You could request removal for the entire folder, but I'm guessing that you may want the actual PDFs indexed.
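One thing worth checking before relying on robots.txt: a folder-level Disallow rule is a prefix match, so it also covers every file inside that folder. A rough sketch with Python's standard urllib.robotparser shows this; the robots.txt content and the crawl_allowed helper are assumptions for illustration, not the asker's actual file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for example.com; the rule blocks the whole /pdf/ folder.
ROBOTS_TXT = """\
User-agent: *
Disallow: /pdf/
"""


def crawl_allowed(url, user_agent="Googlebot"):
    """Return True if a robots.txt-compliant crawler may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)
```

With this rule, crawl_allowed("http://example.com/pdf/") and crawl_allowed("http://example.com/pdf/guide.pdf") both come back False, while a page outside the folder is still allowed. So blocking the folder this way also keeps compliant crawlers away from the PDFs themselves, and it does nothing for a person typing the URL into a browser.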
-
Will turning off directory browsing affect search for all directories?
-
I really don't want to 301 redirect them as they are just holding files. This is happening with my includes folder too, which holds our header, footer, navigation, etc. I can check with our hosting company to find out.
-
I'd create an index.html for the directory, and then redirect it somewhere. This way, you're capturing the inbound links and then rescuing some of the inbound juice.
Otherwise, you can also check out this post for more info on other solutions and modifying your htaccess file to prevent the directory view - http://perishablepress.com/better-default-directory-views-with-htaccess/
-
Blocking it in robots.txt should keep search engines from crawling it.
If you want to hide it from users or people who type in the URL, you can simply drop a blank "index.html" in the /pdf folder.
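If you'd rather script the blank-index trick than create the file by hand, something like the following would do it. This is just a sketch: add_blank_index is a made-up helper, and it assumes the web server is configured to serve index.html as the default document for a folder.

```python
from pathlib import Path


def add_blank_index(folder):
    """Create an empty index.html in `folder` unless one is already there.

    Returns True if a file was created, False if an index.html already existed.
    """
    index = Path(folder) / "index.html"
    if index.exists():
        return False  # never overwrite a real page
    index.touch()
    return True
```

You would call it once per folder you want to cover, for example the local copy of the /pdf folder before uploading.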
-
I would suggest 301'ing them to their /index.htm or /pdf.htm equivalents. If you don't know, a 301 is a signal to a web browser (or search crawler) saying "this page has permanently moved, please go to (otherpage.htm) instead".
Here's a good SEOMoz article explaining it a bit more:
http://www.seomoz.org/learn-seo/redirection
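For anyone who wants to see what a 301 looks like on the wire, here is a small sketch with Python's built-in http.server. The REDIRECTS mapping, RedirectHandler and make_redirect_server are illustrative names, and the /pdf/ to /pdf.htm pair is only an example target; a real site would normally do this in the web server configuration instead.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical mapping from bare folder URLs to the pages that should replace them.
REDIRECTS = {"/pdf/": "/pdf.htm"}


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        target = REDIRECTS.get(self.path)
        if target is None:
            self.send_error(404, "Not Found")
            return
        # 301 = "moved permanently"; the Location header names the new page.
        self.send_response(301)
        self.send_header("Location", target)
        self.send_header("Content-Length", "0")
        self.end_headers()


def make_redirect_server(port=0):
    """Build (but do not start) the demo server; port 0 picks a free port."""
    return ThreadingHTTPServer(("127.0.0.1", port), RedirectHandler)
```

A browser or crawler that requests /pdf/ from this server receives status 301 plus a Location header, and then requests the new address on its own.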
What might be more of a concern is that it sounds like your web server has directory browsing enabled. This could be a security issue (depending on your web server setup). Generally you don't want to expose directories if you don't have to, because it gives a potential attacker insight into your system setup. Here's an example of how to turn it off in Apache:
www.camelrichard.org/topics/Apache/Turn_OffDirectoryBrowsing
And IIS:
technet.microsoft.com/en-us/library/cc731109(v=ws.10).aspx
If you like, I can confirm whether you have open directories if you give me the link, either here or through private message.
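If you want to check for open directories yourself, one rough heuristic is to look at the page title of what the server returns for a bare folder URL. The sketch below assumes the usual titles of auto-generated listings ("Index of /..." from Apache's autoindex, "Directory listing for /..." from Python's http.server); other servers may word it differently, so treat a negative result with caution. looks_like_directory_listing is a made-up helper name.

```python
import re

# Title patterns of common auto-generated listings (an assumption, not exhaustive).
_LISTING_TITLE = re.compile(
    r"<title>\s*(Index of /|Directory listing for /)", re.IGNORECASE
)


def looks_like_directory_listing(html):
    """Return True if the HTML looks like a server-generated folder listing."""
    return bool(_LISTING_TITLE.search(html))
```

Feed it the HTML you get back from a URL like example.com/pdf/; a True result suggests directory browsing is still enabled for that folder.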