How is Google finding our preview subdomains?
-
I've noticed that Google is able to find, crawl and index preview subdomains we set up for new client sites (e.g. clientpreview.example.com). I now know to use a robots meta tag (meta name="robots") and robots.txt to keep search engines from crawling and indexing these subdomains. My question, though, is how Google is finding these subdomains in the first place. We don't link to these preview domains from anywhere else, so I can't figure out how Google is even getting there.
Does anybody have any insight on this?
-
Thanks for your response, Irving. We put some of our preview sites on subdomains of our main domain, but then remove them after the site goes live, so there shouldn't be any duplicate content issues. The main question is just how Google is finding these subdomains.
-
Thanks for the insight guys.
-
I don't specifically use the Google Toolbar, but others in the office may (although I don't think so). It sounds like Chrome could be a potential source as well?
-
I think that this is a good idea. But you gotta be careful.
Our competitor (who ranked #1 and we ranked at #2) had their site redesigned and the design company included the noindex on every page. They forgot to take it off when the new design went live. It took them quite a while to figure it out and we enjoyed all of their sales for about a month.
We are #1 now and they are #2. Must have been a bad design job.
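A pre-launch check along these lines might catch a leftover noindex before the new design goes live. This is only a rough sketch: the class and function names are made up for illustration, it only looks at the meta name="robots" tag in the HTML you hand it, and it does not look at the X-Robots-Tag HTTP header or at bot-specific meta tags.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # HTMLParser lowercases attribute names; values can be None.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() != "robots":
            return
        for part in attr.get("content", "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def has_noindex(html: str) -> bool:
    """True if the page asks search engines not to index it.

    "none" is treated as shorthand for "noindex, nofollow".
    """
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives or "none" in parser.directives
```

Run has_noindex() over the HTML of a handful of key pages on the live domain right after launch; any True result means the tag is still there.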
-
If the subdomains are added to Google Webmaster Tools (WMT), Google will know about them. If you are designing sites for clients and hosting them as subdomains of your own site, it behooves you to make 100% sure that their dev sites are not being seen by Google. It's duplicate content, and your subdomain is the original source of that content. It looks unprofessional too.
a) Verify any subdomain you create for a client in WMT.
b) Block it in robots.txt and add noindex, nofollow to all pages globally.
c) For the ones that are already indexed, go into that subdomain's account in Google WMT and request removal of the site from Google's index. This removes the indexing for that subdomain only; don't worry, it won't remove your main site from the index.
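For step b), a blanket robots.txt on the preview subdomain can be sanity-checked with Python's standard urllib.robotparser. This is a sketch under assumptions: the two-line robots.txt below is the usual block-everything form, and the function name and the clientpreview.example.com URLs are placeholders. One thing worth bearing in mind is that robots.txt only stops crawling; if Googlebot is not allowed to fetch a page, it also cannot read a noindex tag on that page.

```python
from urllib.robotparser import RobotFileParser

# Assumed contents of http://clientpreview.example.com/robots.txt
PREVIEW_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_crawl_blocked(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """True if the given robots.txt text disallows user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)
```

Calling is_crawl_blocked(PREVIEW_ROBOTS_TXT, "http://clientpreview.example.com/") should come back True, while an empty robots.txt should come back False.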
-
I would also consider adding a noindex tag if you want the URLs removed.
-
I agree with Mat. You never know, but yes, Chrome could be another major source. It also depends on the privacy options you chose when you set up Chrome (e.g. "Send anonymous usage data to Google", Yes/No) and so on.
-
We usually put them behind an .htaccess login now. We've had situations where the development site has been outranking the live site. Great demo of the power of on-site optimisation, but still a bit annoying for the client.
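If you go the login route, it is easy to script a check that a preview host really does refuse anonymous requests. A rough sketch, assuming the login is plain HTTP authentication (which is what an .htaccess password prompt normally is); the function names are made up, and check_preview_url() needs network access to whatever URL you pass it.

```python
import urllib.error
import urllib.request
from typing import Mapping


def looks_password_protected(status: int, headers: Mapping[str, str]) -> bool:
    """True if the response is a 401 that advertises an auth scheme."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return status == 401 and bool(lowered.get("www-authenticate", "").strip())


def check_preview_url(url: str, timeout: float = 10.0) -> bool:
    """Fetch url without credentials and report whether it demands a login."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return looks_password_protected(response.status, dict(response.headers.items()))
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx; the error object carries status and headers.
        return looks_password_protected(err.code, dict(err.headers.items()))
```

A preview site that answers 200 to an anonymous request is still open to crawlers, whatever the robots settings say.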
People always used to blame Google Toolbar for this. Likewise, using Chrome could potentially add something to the "to crawl" list. I wonder what the respective privacy policies say about that. I've also seen staging sites pick up links. When someone clicks an external link on the staging site, the destination site can see the staging URL as the referrer, and it can end up published as a linkback/trackback etc.
-
Discovery can come from multiple sources. Do you or the client have Google Toolbar installed?