Blocking Subdomain from Google Crawl and Index
-
Hey everybody, how is it going?
I have a simple question that I need answered.
I have a main domain, let's call it domain.com. Our company will soon launch a series of promotions for which we will use CNAME subdomains, e.g. try.domain.com or buy.domain.com. They will serve a commercial purpose, nothing more.
What is the best way to block such subdomains from being indexed by Google, and to keep them from counting toward domain.com itself? Robots.txt, noindex, something else?
Hope to hear from you,
Best Regards,
-
Hello George, thank you for the fast answer! I read that article, and there is an issue with it; if you could look at it, I'd really appreciate it. The problem is that if I do it directly from Tumblr, it will also block the blog for Tumblr users. Here is the note right below the "Allow this blog to appear in search results" option:
"This applies to searches on Tumblr as well as external search engines, like Google or Yahoo."
Also, if I do it from GWT, I'm very hesitant to remove URLs on my subdomain because I'm afraid it will remove my whole domain. For example, my domain is abc.com and the Tumblr blog is set up on tumblr.abc.com. I'm afraid that if I remove tumblr.abc.com from the index, it will also remove abc.com. Please let me know what you think.
Thank you!
-
Hi Marina,
If I understand your question correctly, you just don't want your Tumblr blog to be indexed by Google. If so, these steps will help: http://yourbusiness.azcentral.com/keep-tumblr-off-google-3061.html
Regards,
George
-
Hi guys, I read your conversation. I have a similar issue, but my situation is slightly different; I'd really appreciate your help with this. I also have a subdomain that I don't want indexed by Google. However, that subdomain is not under my control: I created the subdomain on my hosting, but it points to my Tumblr blog, so I don't have access to its robots.txt. Can anybody advise what I can do in this situation to noindex that subdomain?
Thanks
-
Personally I wouldn't rely on robots.txt alone, as one accidental public link to any of the pages (easier than you may think!) can result in Google indexing that subdomain page (it just won't be crawled). This means the page can get "stuck" in Google's index, and to resolve it you would need to remove it using WMT (instructions here). If a lot of pages were accidentally indexed, you would need to lift the robots.txt restriction so Google can crawl them, and put noindex/nofollow tags on the pages so Google drops them from its index.
To cut a long story short, I would do both Steps 1 and 2 outlined by Federico if you want to sleep easy at night :).
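For reference, the noindex/nofollow tag mentioned above is a standard robots meta tag placed in the <head> of each page you want dropped from the index:

```html
<!-- Tells crawlers not to index this page and not to follow its links -->
<meta name="robots" content="noindex, nofollow">
```

Google has to be able to crawl the page to see this tag, which is why the robots.txt block must be lifted first.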
George
-
It would also be smart to add the subdomains in Webmaster Tools in case one does get indexed and you need to remove it.
-
Robots.txt is the easiest and quickest way. As a backup, you can use the noindex meta tag on the pages in the subdomain.
-
There are two ways to do it, with different effects:
-
Robots.txt in each subdomain. This will block search engines from even accessing those pages, so they won't know what's inside. Place this at the root of each subdomain (e.g., try.domain.com/robots.txt):
User-agent: *
Disallow: /
-
Noindex tags on those pages. This method lets crawlers read the page; with "follow", the pages you link to can still be crawled and indexed, or use "nofollow" if you don't want the linked pages indexed either.
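The two variants described above would look like this in each page's <head>:

```html
<!-- Page is not indexed, but crawlers may still follow its links -->
<meta name="robots" content="noindex, follow">

<!-- Page is not indexed and its links are not followed -->
<meta name="robots" content="noindex, nofollow">
```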
Hope that helps!
-