Best way to permanently remove URLs from the Google index?
-
We have several subdomains we use for testing applications. Even if we block with robots.txt, these subdomains still appear to get indexed (though they show as blocked by robots.txt).
I've claimed these subdomains and requested permanent removal, but it appears that after a certain time period (6 months?) Google will re-index them (and mark them as blocked by robots.txt).
What is the best way to permanently remove these from the index? We can't put them behind a login because our clients want to be able to view these applications without needing to log in.
What is the next best solution?
-
I agree with Paul. Google is re-indexing the pages because you have a few links pointing back to these subdomains. The best idea is to add a noindex, nofollow meta robots tag to the pages and remove the disallow instruction from robots.txt.
This way Google can crawl the pages and see the tag, so it will neither index them nor follow their links, and they will be permanently removed from the Google index.
-
Yup - Chris has the solution. The robots.txt disallow directive simply instructs the crawler not to crawl; it says nothing about removing URLs from the index. I'm betting there are other pages linking in to the subdomains, and the bots are following those links to find and re-index the URLs as the URL removal requests expire.
Do note though that when you add the no-index meta-robots tag, you're going to need to remove the robots.txt disallow directive. Otherwise the crawlers won't make any attempt to crawl all the pages and so won't even discover most of the no-index requests.
Paul
[Edited to add - there's no reason you can't implement the no-index meta-tags and then also request removal again via the Webmaster Tools removal tool. Kind of a "belt & suspenders" approach. The removal request will get it out quicker, and the meta-no-index will do the job of keeping it out. Remember to do this in Bing Webmaster Tools as well.]
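To make the robots.txt vs. noindex conflict above concrete, here is a rough Python sketch (standard library only) that flags a page whose noindex tag can never be seen because robots.txt still blocks crawling. The function names, the hostname and the sample robots.txt are made up for illustration, and `urllib.robotparser` only approximates how a crawler reads robots.txt; it is not Googlebot's exact matching logic.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def has_noindex(html):
    """True if the page carries a noindex meta robots directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


def crawl_allowed(robots_txt, url, user_agent="Googlebot"):
    """True if the given robots.txt text lets user_agent fetch url."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules.can_fetch(user_agent, url)


def diagnose(robots_txt, url, html):
    """Summarise how robots.txt and the meta tag interact for one page."""
    blocked = not crawl_allowed(robots_txt, url)
    noindex = has_noindex(html)
    if noindex and blocked:
        return "conflict: noindex is present but robots.txt blocks the crawl"
    if noindex:
        return "ok: crawlable and marked noindex"
    if blocked:
        return "blocked only: the URL can still be indexed from inbound links"
    return "indexable"


if __name__ == "__main__":
    # Hypothetical test subdomain and page, not taken from the thread.
    page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
    print(diagnose("User-agent: *\nDisallow: /\n", "http://test.example.com/app", page))
    print(diagnose("", "http://test.example.com/app", page))
```

The "conflict" result is the situation the original poster would land in by adding the tag while leaving the disallow rule in place.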
-
Wouldn't a noindex meta tag on each page take care of it?
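If editing every page on the test subdomains is a pain, the same noindex instruction can be sent as an `X-Robots-Tag` HTTP response header instead of a meta tag. Nobody in the thread suggested this; it is an alternative I'm adding. Below is a minimal sketch assuming the test applications are served through a Python WSGI stack (an assumption, the thread doesn't say what they run on). The hostnames and the `noindex_middleware` name are hypothetical.

```python
def noindex_middleware(app, hosts):
    """Wrap a WSGI app so responses for the listed hosts carry X-Robots-Tag."""
    blocked_hosts = {h.lower() for h in hosts}

    def wrapped(environ, start_response):
        # Strip any port from the Host header before comparing.
        host = environ.get("HTTP_HOST", "").split(":")[0].lower()

        def patched_start_response(status, headers, exc_info=None):
            if host in blocked_hosts:
                # Replace any existing value so the header appears exactly once.
                headers = [(k, v) for k, v in headers if k.lower() != "x-robots-tag"]
                headers.append(("X-Robots-Tag", "noindex, nofollow"))
            return start_response(status, headers, exc_info)

        return app(environ, patched_start_response)

    return wrapped


def demo_app(environ, start_response):
    """Stand-in for a real test application."""
    start_response("200 OK", [("Content-Type", "text/html")])
    return [b"<html><body>test build</body></html>"]


# Hypothetical test subdomains; replace with the real ones.
application = noindex_middleware(demo_app, ["test.example.com", "staging.example.com"])
```

Clients can still open the pages without logging in, since the header only talks to crawlers. As with the meta tag, the robots.txt disallow has to come off or the crawler never fetches the response and never sees the header.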