How to fix these unwanted URLs?
-
Right now i have wordpress, one page website, but google also show wp-content. KIndly check below in google.
site:http://baltimoreelite.com/
How I can fix this issue?
-
Great job, Mark! I can see from this end that nearly all of those unwanted URLs have already dropped out of the results. That's far quicker than even I expected! And the ones that aren't gone are leading to a 403 Forbidden page, which is great.
One last thing you can do if you want. Because you are on HostGator, they are displaying their custom 403 error page, which has their branding all over it (nasty, kinda ugly) You could create your own simple 403 error page, add your own basic branding to it, and for instance add a line that says something like "You don't have permission to view this page or it is blocked for security reasons. Drop by the home page [link to home page] to find what you're looking for, or to conduct a search."
This basic page can be used to replace the one that HostGator provides by default so any visitors that hit it by accident will still feel like they are on your site, and will have a suggestion for what to do next. Your hosting control panel will have instructions for how & where to provide your own custom error pages.
Hope that last little tweak's useful.
Paul
-
Thank You Paul. All is well now.
-
If I were you, Mark, I'd add it right at the top of your htaccess file. I'd also add in a descriptive comment to make the reason for the directive clear. So:
BEGIN Remove ability to read directory indexes
Options -Indexes
END Remove ability to read directory indexes
These lines would be inserted right at the top of the htaccess file. I would also warn though, that I've had situations where caching plugins have overwritten such directives when they update the htaccess themselves. If that happens, you man need to try inserting it after #END WPSuperCache and before # BEGIN WordPress.
Hope that works for you?
Paul
-
Andy and Nishada - don't forget... Adding robots.txt disallows will do nothing to get already indexed URLs out of the search index after the fact.
Paul
-
You have a much bigger problem than what can be solved just with a robots.txt file, Mark.
All of those URLs are showing up because of a misconfiguration of your theme installation (likely caused by the theme developer) is allowing full display of all of the content of each of those directories. In addition to polluting your search results, as you've noticed, it's a also a pretty major security risk. You can see this in action by going to http://baltimoreelite.com/wp-content/themes/sintia/wpv_theme/assets/css/ What should happen when you go to that URL is you see a blank page, or receive a 403-Forbidden warning. Instead, you're seeing a full listing of the directory contents - bad news.
Since I don't know your hosting configuration, the easiest way to fix this issue is to add a line to your .htaccess file at the root of your site. This should correct for all such instances, You need to add this line:
<code>Options -Indexes</code>
If you're not familiar, the .htaccess file is a text file which you can edit with any text editor. You'll need to use an FTP program or the file manager in your hosting control panel to access it at the root of your site. (You may also need to enable "show hidden files" in your program). I always recommend backing up the existing file before editing just in case. You'll add this new line on its own line with a blank line between it and any other lines in the file. It can go near or at the top of your htaccess file.
Now getting those URLs out of the search index is going to take a bit more work. You'll want to implement the robots.txt exclusions like Andy suggested, and then you'll need to go into Google Webmaster Tools and use the Remove URLs tool to specifically request removal of the directories you blocked with the robots.txt and that you also want removed from the search results. (The robots.txt is a critical part of this process, as Google requires it be in place in order to process the removal requests.)
This, combined with the htaccess edit mentioned above, should keep those URLs from showing up again in teh future.
Hope that all makes sense. If not, be sure to ask!
Paul
-
Hi Mark,
I block the same on my site (which is also a single page). Here is the content of my Robots.txt file.
User-agent: * Disallow: /wp-admin/ Disallow: /wp-includes/ Disallow: /wp-content/ Sitemap: http://www.inetseo.co.uk/sitemap.xml.gz
-Andy
-
Hi Mark,
You need to tell crawlers to not to index those content by modifying the robots.txt file. Below is a good link with some examples and instructions
http://stackoverflow.com/questions/17029811/how-to-set-up-robots-txt-file-for-wordpress
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Redirect_to in URLs?
I've never seen this before. I'm assuming that it's not SEO friendly and that these should be 301s or 302s instead? http://ksa-beta.motory.com/ar/login/?redirect_to=http://ksa-beta.motory.com/ar/cars-for-sale-search/results/central/riyadh/ford/explorer/2010/ford-explorer-2010-1038353 http://ksa-beta.motory.com/ar/login/?redirect_to=http://ksa-beta.motory.com/ar/account/my-saved-searches/
Technical SEO | | KatherineWatierOng0 -
Dynamic Url best approach
Hi We are currently in the process of making changes to our travel site where by if someone does a search this information can be stored and also if the user needs to can take the URL and paste into their browser at find that search again. The url will be dynamic for every search, so in order to stop duplicate content I wanted ask what would be the best approach to create the URLS. ** An example of the URL is: ** package-search/holidays/hotelFilters/?depart=LGW&arrival=BJV&sdate=20150812&edate=20150819&adult=2&child=0&infant=0&fsearch=first&directf=false&nights=7&tsdate=&rooms=1&r1a=2&r1c=0&r1i=0&&dest=3&desid=1&rating=&htype=all&btype=all&filter=no&page=1 I wanted to know if people have previous experience in something like this and what would be the best option for SEO. Will we need to create the URL with a # ( As i read this stops google crawling after the #) Block the folder IN ROBOTS is there any other areas I should be aware of in order stop any duplicate content and 404 pages once the URL/HOLIDAY SEARCH is no longer valid. thanks E
Technical SEO | | Direct_Ram0 -
How do you fix soft 404 errors?
I'm getting soft 404 errors for a gallery on my site: http://chrle.us/MBKs yet when I visit the url: http://www.chrisboar.com/html/galleries/114/portfolio/weddings-1/0 the image/url is there. Not sure how to fix this. note we do redirect the site to the flash site and only use those URLs for SEO purposes. Maybe that is what is causing it. thanks
Technical SEO | | callmeed0 -
GWMT -- Top URLs for keyword is not the homepage
Hi Everyone - In google webmaster tools, in the Content Keywords section, when I click on a link of all of my top keywords they display a single "Top URLs". The URL which they display is not my homepage but a random category landing page of our e-commerce website. Should I be worried about this? Does this suggest that google is ranking this one URL ahead of our homepage?
Technical SEO | | Santaur0 -
Should we use & or and in our url's?
Example: /Zambia/kasanka-&-bangweulu or /Zambia/kasanka-and-bangweulu which is the better url from the search engines point of view?
Technical SEO | | tribes0 -
How do I use only one URL
my site can be reach by both www.site.com and site.com. How do I make it only use www?
Technical SEO | | Weblion0 -
How do fix twin home pages
Search engine analysis is indicating that my site has twin home pages (www.mysite.com and http://mysite.com). The error message I'm getting is: "your website resides at both www.mysite.com and mysite.com. My uploaded index page is a .htm page (not .html). I don't know if that matters. Can someone explain how this happened and what I can do to fix it? Thanks!
Technical SEO | | finalfrontier0 -
/$1 URL Showing Up
Whenever I crawl my site with any kind of bot or a sitemap generator over my site. it comes up with /$1 version of my URLs. For example: It gives me hdiconference.com & hdiconference.com/$1 and hdiconference.com/purchases & hdiconference.com/purchases/$1 Then I get warnings saying that it's duplicate content. Here's the problem: I can't find these /$1 URLs anywhere. Even when I type them in, I get a 404 error. I don't know what they are, where they came from, and I can't find them when I scour my code. So, I'm trying to figure out where the crawlers are picking this up. Where are these things? If sitemap generators and other site crawlers are seeing them, I have to assume that Googlebot is seeing them as well. Any help? My developers are at a loss as well.
Technical SEO | | HDI0