Removing a site from Google's index
-
We have a site we'd like to have pulled from Google's index. Back in late June, we disallowed robot access to the site through the robots.txt file and added a robots meta tag with "noindex, nofollow" directives. The expectation was that Google would eventually crawl the site and remove it from the index in response to those tags. The problem is that Google hasn't come back to crawl the site since late May. Is there a way to speed up this process and communicate to Google that we want the entire site out of the index, or do we just have to wait until it's eventually crawled again?
-
OK. That was not abundantly clear on first reading. Thank you for your help.
-
Thank you for pointing that out, Arlene. I do see it now.
The statement before that line is of key importance for an accurate quote. "If you own the site, you can verify your ownership in Webmaster Tools and use the verified URL removal tool to remove an entire directory from Google's search results."
It could be worded better, but what they are saying is that AFTER your site has already been removed from Google's index via the URL removal tool, THEN you can block it with robots.txt. The URL removal tool will remove the pages and keep them out of the index for 90 days. That's when changing the robots.txt file can help.
-
"Note: To ensure your directory or site is permanently removed, you should use robots.txt to block crawler access to the directory (or, if you’re removing a site, to your whole site)."
The above is a quote from the page. You have to expand the section I referenced in my last comment. I'm just re-posting Google's own words.
-
I thought you were offering a quote from the page. It seems that is your summarization. I apologize for my misunderstanding.
I can see how you could reach that conclusion, but it is not accurate. Robots.txt does not ensure a page won't get indexed. I always recommend use of the noindex tag, which should be 100% effective for the major search engines.
-
Go here: http://www.google.com/support/webmasters/bin/answer.py?answer=164734
Then expand the option down below that says: "I want to remove an entire site or the contents of a directory from search results"
They basically instruct you to block all robots in the robots.txt file, then request removal of your site. Once it's removed, the robots file will keep it from getting back into the index. They also recommend putting a "noindex" meta tag on each page to ensure nothing will get picked up. I think we have it taken care of at this point. We'll see.
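For anyone who wants to confirm that the "noindex" meta tag really is present on a page, here is a minimal Python sketch using only the standard library. The class name `RobotsMetaParser`, the helper `robots_directives`, and the sample HTML are made up for illustration; this is not a Google tool, and it only looks at `<meta name="robots">` tags, not at HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" ...> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Only the generic "robots" meta tag is handled in this sketch.
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma-separated, e.g. "noindex, nofollow".
        for directive in content.split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives present in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page source; in practice this would be the HTML of a real page.
SAMPLE = (
    '<html><head><meta name="robots" content="noindex, nofollow">'
    "</head><body></body></html>"
)
directives = robots_directives(SAMPLE)
has_noindex = "noindex" in directives
```

Running the same check over each page's saved HTML would give a rough audit of which pages still lack the tag.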
-
Arlene, I checked the link you offered but I could not locate the quote you offered anywhere on the page. I am sure it is referring to a different context. Using robots.txt as a blocking tool is fine BEFORE a site or page is indexed, but not after.
-
I used the removal tool and just entered a "/" which put in a request to have everything in all of my site's directories pulled from the index. And I have left "noindex" tags in place on every page. Hopefully this will get it done.
Thanks for your comments, guys!
-
We blocked robots from accessing the site because Google told us to. This is straight from the webmaster tools help section:
Note: To ensure your directory or site is permanently removed, you should use robots.txt to block crawler access to the directory (or, if you’re removing a site, to your whole site).
-
I have Webmaster Tools set up, but I don't see an option to remove the whole site. There is a URL removal tool, but there are over 700 pages I want pulled out of the index. Is there an option in Webmaster Tools to have the whole site pulled from the index?
-
Actually, since you have access to the site, you can leave the robots.txt at disallowed if you go into Google Webmaster Tools, verify your site, and request removal of your entire site. Let me know if you'd like a link with more information on this. Verification involves adding an HTML file or meta tag to your site to prove you have ownership.
-
Thank you. Didn't realize we were shooting ourselves in the foot.
-
Hi Arlene.
The problem is that when you blocked the site with robots.txt, you prevented Google from re-crawling your site, so it cannot see the noindex tag. If you have properly placed the noindex tag on all the pages in your site, then modify your robots.txt file to allow Google to see your site. Once that happens, Google will begin crawling your site and will then be able to deindex your pages.
The only other suggestion is to submit a sitemap and/or remove the "nofollow" tag. With the nofollow tag on all your pages, Google may visit your site only a single page at a time, since you are telling the crawler not to follow any links it finds. You are blocking its normal discovery of your site.
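To illustrate the point about robots.txt, here is a small Python sketch using the standard library's `urllib.robotparser`. It only shows how a rule-following crawler would interpret two robots.txt variants; the helper `can_crawl`, the example URL, and the two rule sets are assumptions for illustration, and Google's own parser may differ in details.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# A site-wide block: the crawler never fetches the page, so it cannot
# read any noindex meta tag inside it.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

# An empty Disallow allows everything: the crawler can fetch the page
# and see the noindex tag.
ALLOW_ALL = "User-agent: *\nDisallow:\n"

PAGE = "http://www.example.com/some-page.html"  # hypothetical URL

blocked = can_crawl(BLOCK_ALL, PAGE)
allowed = can_crawl(ALLOW_ALL, PAGE)
```

With the first rule set the page is off limits to the crawler; with the second it can be fetched, which is the precondition for the noindex tag being read at all.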