How is Google finding our preview subdomains?
-
I've noticed that Google is able to find, crawl and index preview subdomains we set up for new client sites (e.g. clientpreview.example.com). I know now to use "meta name="robots" and robots.txt) to block the search engines from crawling these subdomains. My question though, is how is Google finding these subdomains? We don't link to these preview domains from anywhere else, so I can't figure out how Google is even getting there.
Does anybody have any insight on this?
-
Thanks for your response Irving. We put some of our preview sites on subdomains of our main domain, but then remove them after the site goes live, so their shouldn't be any duplicate content issues. The main question is just how Google is finding these subdomains.
-
Thanks for the insight guys.
-
I don't specifically use the Google Toolbar, but others in the office may (although I don't think so). It sounds like Chrome could be a potential source as well?
-
I think that this is a good idea. But you gotta be careful.
Our competitor (who ranked #1 and we ranked at #2) had their site redesigned and the design company included the noindex on every page. They forgot to take it off when the new design went live. It took them quite a while to figure it out and we enjoyed all of their sales for about a month.
We are #1 now and they are #2. Must have been a bad design job.
-
If the subdomains are added to WMT google will know about it. if you are designing sites for clients and putting them on your site as subdomains it behooves you to make sure 100% that their dev sites are not being seen by Google. It's duplicate content and your subdomain is the original source of this content. Looks unprofessional too
a) verify any subdomain you are creating for a client in WMT
b) block it in robots.txt and noindex nofollow all pages globally
c) for the ones that are already indexed, go into google WMT and go into that subdomain account and request removal of the site in Googles index. This will remove the indexing for that subdomain only don't worry it won't remove your main site from the index.
-
I would also consider adding a noindex tag if you want the urls removed.
-
I agree with Mat. You never know, but yes Chrome could be another major source. It also depends what you set as your privacy when you setup Chrome (Send anonymous usage data to Google, Yes/No ?) and so on.
-
We usually put them behind an .htaccess login now. We've had situations where the development site have been outranking the live site. Great demo of the power of on-site optimisation, but still a bit annoying for the client.
People used to always blame google toolbar for this. Likewise using chrome could potentially add something to the "to crawl" list. I wonder what the respective privacy policies say about that. I've also seen staging sites pick up links. When an external link on the staging site has been clicked it has alerted someone else, appeared as a link back/trackback etc.
-
The discovery can be from multiple mediums. Do you or the client have Google Toolbar installed ?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Google Not Indexing Pages (Wordpress)
Hello, recently I started noticing that google is not indexing our new pages or our new blog posts. We are simply getting a "Discovered - Currently Not Indexed" message on all new pages. When I click "Request Indexing" is takes a few days, but eventually it does get indexed and is on Google. This is very strange, as our website has been around since the late 90's and the quality of the new content is neither duplicate nor "low quality". We started noticing this happening around February. We also do not have many pages - maybe 500 maximum? I have looked at all the obvious answers (allowing for indexing, etc.), but just can't seem to pinpoint a reason why. Has anyone had this happen recently? It is getting very annoying having to manually go in and request indexing for every page and makes me think there may be some underlying issues with the website that should be fixed.
Technical SEO | | Hasanovic1 -
Get List Of All Indexed Google Pages
I know how to run site:domain.com but I am looking for software that will put these results into a list and return server status (200, 404, etc). Anyone have any tips?
Technical SEO | | InfinityTechnologySolutions0 -
Why wont google Index this page?
A week ago i accidentally changed this page settings in my CMS to "disable & dont index" as i was going to replace this page with another, but this didnt happen, but i forgot to switch the settings back! http://www.over50choices.co.uk/funeral-planning/funeral-plans Anyhow in an effort to get it back up quickly i submitted in GWTs but its still not indexed. When i use several SEO on page checking tools it has the Meta Title data as "Form" and not the correct title. Any ideas please? Yours frustrated Ash
Technical SEO | | AshShep10 -
Should we move our documentation off subdomain?
Background: We have a popular open source e-commerce platform at http://spreecommerce.com. Right now the documentation is on http://guides.spreecommerce.com. We have "edge" documentation (for stuff that's not yet released) on http://edgeguides.spreecommerce.com but since it's largely duplicative we've told google not to index any of the edge stuff (via robots.txt). Question: Should we consider moving the guides under the main website under /docs or something like this? There's a ton of great content that people often read to learn more about the platform. Seems like we might be diluting our juice a bit to have it on a separate domain. WDYT?
Technical SEO | | schof0 -
I was googling the word "best web hosting" and i notice the 1st and 3rd result were results with google plus. Does Google plus now play a role in improving ranking for the website?
I was googling the word "best web hosting" and i notice the 1st and 3rd result were results with google plus. Does Google plus now play a role in improving ranking for the website?I see a person's name next to the website too
Technical SEO | | mainguy0 -
Google plus
With Google search plus your world, would i see results ONLY from Google plus followers ? or from someone who is my facebook friend as well.
Technical SEO | | seoug_20050 -
How does Google determine freshness of content?
With the changes in the Google algorithm emphasizing freshness of content, I was wondering how they determine freshness and what constitutes new content. For instance, if I write a major update to a story I published last July, is the amended story fresh? Is there anything I can do in addition to publishing brand new content to make Google sure they see all my new content?
Technical SEO | | KnutDSvendsen0 -
Best blocking solution for Google
Posting this for Dave SottimanoI Here's the scenario: You've got a set of URLs indexed by Google, and you want them out quickly Once you've managed to remove them, you want to block Googlebot from crawling them again - for whatever reason. Below is a sample of the URLs you want blocked, but you only want to block /beerbottles/ and anything past it: www.example.com/beers/brandofbeer/beerbottles/1 www.example.com/beers/brandofbeer/beerbottles/2 www.example.com/beers/brandofbeer/beerbottles/3 etc.. To remove the pages from the index should you?: Add the Meta=noindex,follow tag to each URL you want de-indexed Use GWT to help remove the pages Wait for Google to crawl again If that's successful, to block Googlebot from crawling again - should you?: Add this line to Robots.txt: DISALLOW */beerbottles/ Or add this line: DISALLOW: /beerbottles/ "To add the * or not to add the *, that is the question" Thanks! Dave
Technical SEO | | goodnewscowboy0