Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will remain viewable - we have locked both new posts and new replies.
Google Indexed a version of my site w/ MX record subdomain
-
During a site audit, we found "internal" links to a page in Search Console that appear to come from a subdomain of our site based on our MX record. We use Google Mail internally. The links ultimately redirect to our preferred subdomain, "www", but I am concerned about why this is happening and whether it can have any negative SEO implications.
Example of one of the links:
aspmx3.googlemail.com.sullivansolarpower.com/about/solar-power-blog/daniel-sullivan/renewable-energy-and-electric-cars-are-not-political-footballs
I did a site operator search, site:aspmx3.googlemail.com.sullivansolarpower.com, on Google, and it returns several results.
-
You appear to have the MX sub-domain also set up as an A record.
If you are on a Mac or Linux machine, you can run the command:
host aspmx3.googlemail.com.sullivansolarpower.com
You get the result:
aspmx3.googlemail.com.sullivansolarpower.com has address 72.10.48.198
whereas you should get the result "not found".
I think you want to delete the A record (though check the documentation of your email provider first). The mail hosts should only need to be set up as MX records, so you shouldn't need the A record.
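If you would rather script that check than run host by hand, here is a minimal Python sketch. It assumes the system resolver is good enough for the purpose, and it reports whether a hostname resolves to any address at all (IPv4 or IPv6), which is broader than checking for an A record specifically. The function name resolves and the injectable resolver argument are my own illustrative choices, not anything from your setup.

```python
import socket


def resolves(hostname, resolver=socket.getaddrinfo):
    """Return True if hostname resolves to at least one address.

    The resolver argument exists only so the lookup can be swapped out
    (for example in tests); by default the system resolver is used.
    """
    try:
        return len(resolver(hostname, None)) > 0
    except socket.gaierror:
        # The resolver could not find the name: the "not found" case.
        return False


# Example call (performs a real DNS lookup, so it is left commented out):
# print(resolves("aspmx3.googlemail.com.sullivansolarpower.com"))
```

Once the stray A record is gone, the call should return False for the aspmx3 hostname, assuming no wildcard record still answers for it.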
You've done the right thing by setting up the redirect - which should mean that the pages drop out of the index and those links disappear. (Note that there is also an https error on the aspmx3 sub-domain - but given that you don't actually want it, I don't suppose that matters that much).
Hope that helps.
-
I did not explain the problem thoroughly. The problem is that the link does not actually exist anywhere. To make a very long story short, there was an issue with our server configuration for a couple of months. During that time, an unknown number of non-existent subdomains got indexed. Basically, if anyone made a typo in the subdomain when accessing our site, the result would get cached, and if Google crawled our site before we cleared the cache, the typo subdomain would get indexed. Over those couple of months, many bad subdomains were accidentally created and indexed by Google. We do not have any way of finding a comprehensive list of all of them. The problem has been resolved, so no new bad subdomains are being created and indexed, but the damage has been done.
The way our site is set up currently, any attempt to reach our site with any subdomain other than "www" gets redirected to "www.sullivan..." Also, any non-secure request gets redirected to https://
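To make the redirect rule above concrete, here is a rough Python sketch of the mapping it describes: whatever host and protocol a request arrives on, the target is the https:// URL on the preferred host with the same path and query. This is only an illustration of the rule, not our actual server configuration. The function names and the www.example.com host in the usage note are placeholders.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url, preferred_host):
    """Map any URL to https:// on preferred_host, keeping path and query."""
    parts = urlsplit(url)
    return urlunsplit(
        ("https", preferred_host, parts.path, parts.query, parts.fragment)
    )


def needs_redirect(url, preferred_host):
    """True when the URL is not already in its canonical form."""
    return canonical_url(url, preferred_host) != url
```

For example, canonical_url("http://typo.example.com/about?x=1", "www.example.com") gives "https://www.example.com/about?x=1", and needs_redirect returns False for a URL that is already in that form.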
The actual problem, simply put, is this: Google's index includes an unknown number of non-existent subdomains. We need to get rid of them and cannot figure out how.
Example: Copy and paste the following into Google and search for it:
site:aspmx3.googlemail.com.sullivansolarpower.com
Google will return two results. If you click on either, it resolves to the "https://www." version of the page.
I know it is confusing, but does that make sense? I have searched everywhere, but this happened because of a perfect storm of server configuration issues, and I cannot find anyone else who has had the same problem.
If it were one or two bad subdomains, we would just add them to Search Console and request "remove URL" for the entire subdomain. But it is not one or two. It is at least 10 that I know of, and it could be hundreds for all I know.
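For building at least a partial list, one idea is to collect every URL we can get hold of (for example from a links export or from server access logs; both sources are assumptions on my part, and neither is guaranteed to be complete) and pull out the distinct hostnames that are not the preferred one. A rough Python sketch, with a hypothetical function name and example.com standing in for the real domain:

```python
from urllib.parse import urlsplit


def stray_hosts(urls, base_domain, preferred="www"):
    """Return the sorted, distinct hostnames under base_domain other than
    the preferred one. Expects full URLs including the scheme; entries
    without a scheme have no parseable hostname and are skipped."""
    keep = f"{preferred}.{base_domain}"
    found = set()
    for url in urls:
        host = (urlsplit(url).hostname or "").lower()
        if host.endswith("." + base_domain) and host != keep:
            found.add(host)
    return sorted(found)
```

Each hostname it returns would still have to go through Search Console one at a time, so this only helps with compiling the list, not with the removal itself.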
Does anyone have any ideas? Any and all would be welcome.
Thank you.
-
You should find the locations of those links and correct them to point to the proper URL. I find that a Screaming Frog crawl is the easiest way to do this: you can find every link and see where it is located.