Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Removed Subdomain Sites Still in Google Index
-
Hey guys,
I've got kind of a strange situation going on and I can't seem to find it addressed anywhere. I have a site that at one point had several development sites set up at subdomains. Those sites have since launched on their own domains, but the subdomain sites are still showing up in the Google index. However, if you look at the cached version of pages on these non-existent subdomains, the little blurb that says "This is Google's cached version of www.correcturl.com" lists the NEW URL, not the dev one. Clearly Google recognizes that the content resides at the new location, so how come the old pages are still in the index? Attempting to visit one of them gives a "Server Not Found" error, so they are definitely gone.
This is happening with a couple of sites, one of which launched over a year ago, so "wait and see" doesn't appear to be a solution.
Any suggestions would be a huge help. Thanks!!
-
Right. I get that they don't exist on your site currently, but when they did, Google indexed them, so they still exist in some form within Google's index, and Google was never told they had permanently moved (via a 301). The good news is that you don't have to resurrect the entire site. You can simply modify the appropriate configuration (.htaccess if you're on Apache, the IIS configuration if you're on a Windows server) so that Google knows any page it's looking for at devsite.yoursite.com is now at www.correcturl.com. Cheers!
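For anyone who wants to see the catch-all idea spelled out, here is a minimal sketch in Python rather than .htaccess or IIS syntax. It is only an illustration, not what either poster actually deployed. The hostnames are the placeholders used in this thread, the `https` scheme and the names `redirect_target`, `CatchAllRedirect`, and `serve` are assumptions, and it presumes DNS for the old subdomain has been pointed at a server again (a "Server Not Found" error means there is currently nothing for Google to get a 301 from).

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit, urlunsplit

# Placeholder hostnames from the thread; the scheme is an assumption.
OLD_HOST = "devsite.yoursite.com"
NEW_HOST = "www.correcturl.com"
NEW_SCHEME = "https"


def redirect_target(host_header, request_path):
    """Return the new URL for a request to the old dev host, else None."""
    # Drop any port and normalise case before comparing hostnames.
    host = host_header.split(":", 1)[0].strip().lower()
    if host != OLD_HOST:
        return None
    parts = urlsplit(request_path)
    # Keep path and query so each old page maps to its counterpart.
    return urlunsplit((NEW_SCHEME, NEW_HOST, parts.path or "/", parts.query, ""))


class CatchAllRedirect(BaseHTTPRequestHandler):
    """Answers every request for the old host with a permanent redirect."""

    def do_GET(self):
        target = redirect_target(self.headers.get("Host", ""), self.path)
        if target is None:
            self.send_error(404)
            return
        self.send_response(301)  # 301 = moved permanently
        self.send_header("Location", target)
        self.end_headers()

    do_HEAD = do_GET


def serve(port=8080):
    """Run the redirect-only server (hypothetical port; blocks until stopped)."""
    HTTPServer(("", port), CatchAllRedirect).serve_forever()
```

With this sketch, a request for `/about?x=1` on devsite.yoursite.com would be answered with a 301 pointing at the same path and query on www.correcturl.com, so nothing from the old dev site has to be rebuilt.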
-
Ryan,
Thanks for your quick response! The reason we aren't doing 301s or noindex on these sites is that they no longer exist. We would have to essentially resurrect these dev sites for the sole purpose of redirecting. Since Google's cached version shows the new/current URL, wouldn't that imply that they are aware of the change and that the subdomains are hanging around for another reason?
We typically noindex dev sites, but a couple of them slipped by without it.
-
Hi Sarah. Have you put 301 redirects in the .htaccess file for these subdomains? You may also want to consider going through the Change of Address tool in Google Webmaster Tools. The problem seems to be that Google crawled and indexed the old subdomains and still has references to the old pages that existed on them. Ultimately, using NOINDEX on development sites and then a catch-all 301 redirect should help clean this up for you. Cheers!
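The NOINDEX half of that advice can also be applied at the server level with the `X-Robots-Tag` response header, which Google honours in the same way as a robots meta tag. A rough sketch, assuming a hypothetical set of dev hostnames (`DEV_HOSTS`) and a hypothetical helper name (`robots_headers`); how the returned headers get attached depends on whatever server or framework the dev sites run on:

```python
# Hypothetical list of development hostnames that should never be indexed.
DEV_HOSTS = {"devsite.yoursite.com"}


def robots_headers(host_header):
    """Return extra response headers for a request, based on its Host header."""
    # Drop any port and normalise case before comparing hostnames.
    host = host_header.split(":", 1)[0].strip().lower()
    if host in DEV_HOSTS:
        # Ask search engines not to index any response from a dev host.
        return [("X-Robots-Tag", "noindex")]
    return []
```

Note that the header only helps while crawlers can still fetch the pages; if robots.txt blocks the dev host, the noindex is never seen.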