Removing a site from Google's index
-
We have a site we'd like to have pulled from Google's index. Back in late June, we disallowed robot access to the site through the robots.txt file and added a robots meta tag with "noindex, nofollow" directives. The expectation was that Google would eventually crawl the site and remove it from the index in response to those tags. The problem is that Google hasn't come back to crawl the site since late May. Is there a way to speed up this process and communicate to Google that we want the entire site out of the index, or do we just have to wait until it's eventually crawled again?
-
OK. That was not abundantly clear on first reading. Thank you for your help.
-
Thank you for pointing that out, Arlene. I do see it now.
The statement before that line is of key importance for an accurate quote. "If you own the site, you can verify your ownership in Webmaster Tools and use the verified URL removal tool to remove an entire directory from Google's search results."
It could be worded better, but what they are saying is that AFTER your site has already been removed from Google's index via the URL removal tool, THEN you can block it with robots.txt. The URL removal tool will remove the pages and keep them out of the index for 90 days. That's when changing the robots.txt file can help.
-
"Note: To ensure your directory or site is permanently removed, you should use robots.txt to block crawler access to the directory (or, if you’re removing a site, to your whole site)."
The above is a quote from the page. You have to expand the section I referenced in my last comment. I'm just re-posting Google's own words.
-
I thought you were offering a quote from the page. It seems that is your summarization. I apologize for my misunderstanding.
I can see how you could reach that conclusion, but it is not accurate. Robots.txt does not ensure a page won't get indexed. I always recommend using the noindex tag, which should be 100% effective for the major search engines.
-
Go here: http://www.google.com/support/webmasters/bin/answer.py?answer=164734
Then expand the option further down the page that says: "I want to remove an entire site or the contents of a directory from search results"
They basically instruct you to block all robots in the robots.txt file, then request removal of your site. Once it's removed, the robots file will keep it from getting back into the index. They also recommend putting a "noindex" meta tag on each page to ensure nothing will get picked up. I think we have it taken care of at this point. We'll see.
-
Arlene, I checked the link you offered but I could not locate the quote you offered anywhere on the page. I am sure it is referring to a different context. Using robots.txt as a blocking tool is fine BEFORE a site or page is indexed, but not after.
-
I used the removal tool and just entered a "/" which put in a request to have everything in all of my site's directories pulled from the index. And I have left "noindex" tags in place on every page. Hopefully this will get it done.
Thanks for your comments guys!
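For anyone wanting to double-check that the "noindex" tag really is present on a page, here is a minimal Python sketch. It only inspects HTML you pass in as a string; the class and function names (`RobotsMetaParser`, `has_noindex`) are made up for illustration, and it assumes the directive is delivered through a `<meta name="robots">` or `<meta name="googlebot">` tag rather than an HTTP header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> / <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so normalise to "".
        attr = {key.lower(): (value or "") for key, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for token in attr.get("content", "").split(","):
                self.directives.add(token.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the page's robots meta tag contains the noindex directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical page markup, not taken from the site discussed in this thread.
page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(has_noindex(page))
```

Running this over the saved HTML of each page is one way to confirm that no page was missed before waiting on the removal request.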
-
We blocked robots from accessing the site because Google told us to. This is straight from the webmaster tools help section:
Note: To ensure your directory or site is permanently removed, you should use robots.txt to block crawler access to the directory (or, if you’re removing a site, to your whole site).
-
I have Webmaster Tools set up, but I don't see an option to remove the whole site. There is a URL removal tool, but there are over 700 pages I want pulled out of the index. Is there an option in Webmaster Tools to have the whole site pulled from the index?
-
Actually, since you have access to the site, you can leave the robots.txt at disallowed, provided you go into Google Webmaster Tools, verify your site, and request removal of your entire site. Let me know if you'd like a link with more information on this. It will involve adding an HTML file or meta tag to your site to verify that you own it.
-
Thank you. Didn't realize we were shooting ourselves in the foot.
-
Hi Arlene.
The problem is that by blocking the site with robots.txt, you are preventing Google from re-crawling your site, so it cannot see the noindex tag. If you have properly placed the noindex tag on all the pages in your site, then modify your robots.txt file to allow Google to see your site. Once that happens, Google will begin crawling your site and will then be able to deindex your pages.
The only other suggestion is to submit a sitemap and/or remove the "nofollow" tag. With the nofollow tag on all your pages, Google may visit only a single page of your site at a time, since you are telling the crawler not to follow any links it finds. You are blocking its normal discovery of your site.
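To illustrate the robots.txt point above, here is a rough Python sketch using the standard library's `urllib.robotparser`. The two robots.txt bodies and the example.com URL are assumptions made up for the example, not the actual files from the site in question, and Python's parser is only an approximation of how Googlebot itself interprets the rules.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt body lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical robots.txt that blocks every crawler from the whole site.
BLOCKED = "User-agent: *\nDisallow: /\n"

# Hypothetical robots.txt with an empty Disallow, which allows everything.
OPEN = "User-agent: *\nDisallow:\n"

url = "http://www.example.com/page.html"
print(can_crawl(BLOCKED, url))  # crawler may not fetch the page, so it never sees the noindex tag
print(can_crawl(OPEN, url))     # crawler may fetch the page and read the noindex tag
```

With the blocking file in place, a well-behaved crawler never requests the page at all, which is why a noindex tag on that page goes unread.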