Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Block a sub-domain from being indexed
-
This is a pretty quick and simple (i'm hoping) question. What is the best way to completely block a sub domain from getting indexed from all search engines?
One item i cannot use is the meta "no follow" tag.
Thanks! - Kyle
-
Keep in mind that Google Index's everything that it can crawl. Even if you put a block in the robots.txt they will probably crawl it. You can require a password to that subdomain and keep big G out. This is easy to do if you have a site with cpanel access. Just go to manage permissions, and password protect that director with a .htaccess pw.
-
The robots.txt file just tells the bots you would "prefer" they don't index but there is nothing to prevent them from indexing.The only sure way to do this is to restrict access to the sub-domain for everyone and require some sort of authentication. If they don't have access they can't index.
-
In subdomain.example.com/robots.txt add the statements:
User-agent: *
Disallow: /Warning: Be absolutely certain that the above statements are not included in your example.com/robots.txt file or you'll kill your site.
-
Each subdomain may have its own robots.txt file. So for that subdomain, you can put:
User-agent: * Disallow: /In the robots.txt, and that should do it.
Please note that disallowing pages in robots.txt will not necessarily mean they won't appear on search result pages.... if people link to pages that are disallowed on that subdomain, they can still appear in SERPs. I had this happen with a few pages, which leads to funny listings in the SERPs because Google has to guess what the page title and description of the page should be, since it's not allowed to read the page. The meta noindex tag is the way to go if you want to be really sure the page doesn't appear in the SERPs. If you use that, don't disallow the page. Here's a recent SEOMoz post about it: http://www.seomoz.org/blog/robot-access-indexation-restriction-techniques-avoiding-conflicts
-
That was going to be my assumption but i wasn't 100% sure how they worked with sub domains. Are you able to supply a little more information on implementation? It is extremely important that it only blocks: sub.domain.com and not domain.com
-
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Not all images indexed in Google
Hi all, Recently, got an unusual issue with images in Google index. We have more than 1,500 images in our sitemap, but according to Search Console only 273 of those are indexed. If I check Google image search directly, I find more images in index, but still not all of them. For example this post has 28 images and only 17 are indexed in Google image. This is happening to other posts as well. Checked all possible reasons (missing alt, image as background, file size, fetch and render in Search Console), but none of these are relevant in our case. So, everything looks fine, but not all images are in index. Any ideas on this issue? Your feedback is much appreciated, thanks
Technical SEO | | flo_seo1 -
Images on sub domain fed from CDN
I have a client that uses a CDN to fill images, from a sub domain ( images.domain.com). We've made sure that the sub domain itself is not blocked. We've added a robots.txt file, we're creating an image sitemap file & we've verified ownership of the domain within GWT. Yet, any crawler that I use only see's the first page of the sub domain (which is .html) but none of the subsequent URL's which are all .jpeg. Is there something simple I'm missing here?
Technical SEO | | TammyWood0 -
Shopping Carts & Sub Domains
I was hoping someone could guide me in making the correct decision regarding integrating my existing domain with a hosted shopping cart. I have an existing website to promote my bricks and mortar retail operation and am expanding into web retailing. I will be using one of the major hosted shopping carts. What is the best way to join the two components from an SEO perspective? Have the cart as a sub domain of my main site, or move my existing domain name to be hosted by the cart provider and have both components operate under the same general domain? I have read arguments that putting your cart within a sub domain is not a good idea because any clout of the pre-existing domain will not be shared with the sub domain; that they will be treated as two separate sites. I have also read that using a sub domain is a good idea being that the content focus of the main domain (marketing and blogs) is different form the focus of the sub domain (product sales), and that the two components would benefit form earning their own rankings undiluted by the other. And, I have also read that search engines are getting good at being able to deduce that an eCommerce sub domain is legitimate extension of a content intensive main domain, and that they treat the two components as a combined whole. What is the truth? Which is the better way to go? Any guidance would be appreciated.
Technical SEO | | MEI1520 -
How to determine which pages are not indexed
Is there a way to determine which pages of a website are not being indexed by the search engines? I know Google Webmasters has a sitemap area where it tells you how many urls have been submitted and how many are indexed out of those submitted. However, it doesn't necessarily show which urls aren't being indexed.
Technical SEO | | priceseo1 -
What to do with 302 redirects being indexed
Hi there, Our site's forums include permalinks that for some reason uses an intermediary URL that 302 redirects to the URL with the permalink anchor. For example: http://en.tradimo.com/learn/chart-analysis/time-frames/ In the comments, there is a permalink to the following URL; en.tradimo.com/co/50c450005f2b949e3200001b/ (there is no content here, and never has been). This URL 302 redirects to the following final URL: http://en.tradimo.com/learn/chart-analysis/time-frames/?offset=0&limit=20#50c450005f2b949e3200001b The problem is, Google is indexing the redirect URL (en.tradimo.com/co/50c450005f2b949e3200001b/) and showing duplicate content even though we are using the nofollow tag on these links. Ideally, we would directly use the last link rather than redirecting. Alternatively, I'd say a 301 redirect would be preferable. But if both aren't available, is there a way to get these pages out of the index? Is the canonical tag the best way? I really wish I could just add /co/ to the robots.txt file, but I think they would still be in the index, right? Thanks for your help!
Technical SEO | | etruvian0 -
De-indexed from Google
Hi Search Experts! We are just launching a new site for a client with a completely new URL. The client can not provide any access details for their existing site. Any ideas how can we get the existing site de-indexed from Google? Thanks guys!
Technical SEO | | rikmon0 -
Checkout on different domain
Is it a bad SEO move to have a your checkout process on a separate domain instead of the main domain for a ecommerce site. There is no real content on the checkout pages and they are completely new pages that are not indexed in the search engines. Do to the backend architecture it is impossibe for us to have them on the same domain. An example is this page: http://www.printingforless.com/2/Brochure-Printing.html One option we've discussed to not pass page rank on to the checkout domain by iFraming all of the links to the checkout domain. We could also move the checkout process to a subdomain instead of a new domain. Please ignore the concerns with visitors security and conversion rate. Thanks!
Technical SEO | | PrintingForLess.com0 -
What is the best method to block a sub-domain, e.g. staging.domain.com/ from getting indexed?
Now that Google considers subdomains as part of the TLD I'm a little leery of testing robots.txt with something like: staging.domain.com
Technical SEO | | fthead9
User-agent: *
Disallow: / in fear it might get the www.domain.com blocked as well. Has anyone had any success using robots.txt to block sub-domains? I know I could add a meta robots tag to the staging.domain.com pages but that would require a lot more work.0