What is the best method to block a sub-domain, e.g. staging.domain.com/ from getting indexed?
-
Now that Google considers subdomains as part of the TLD I'm a little leery of testing robots.txt with something like:
staging.domain.com
User-agent: *
Disallow: /in fear it might get the www.domain.com blocked as well. Has anyone had any success using robots.txt to block sub-domains? I know I could add a meta robots tag to the staging.domain.com pages but that would require a lot more work.
-
Just make sure that when/if you copy over the staging site to the live domain that you don't copy over the robots.txt, htaccess, or whatever means you use to block that site from being indexed and thus have your shiny new site be blocked.
-
I agree. The name of your subdomain being "staging" didn't register at all with me until Matt brought it up. I was offering a generic response to the subdomain question whereas I believe Matt focused on how to handle a staging site. Interesting viewpoint.
-
Matt/Ryan-
Great discussion, thanks for the input. The staging.domain.com is just one of the domains we don't want indexed. Some of them still need to be accessed by the public, some like staging could be restricted to specific IPs.
I realize after your discussion I probably should have used a different example of a sub-domain. On the other hand it might not have sparked the discussion so maybe it was a good example
-
.htaccess files can be placed at any directory level of a site so you can do it for just the subdomain or even just a directory of a domain.
-
Staging URL's are typically only used for testing so rather than do a deny I would recommend using a specific ALLOW for only the IP addresses that should be allowed access.
I would imagine you don't want it indexed because you don't want the rest of the world knowing about it.
You can also use HTACCESS to use username/passwords. It is simple but you can give that to clients if that is a concern/need.
-
Correct.
-
Toren, I would not recommend that solution. There is nothing to prevent Googlebot from crawling your site via almost any IP. If you found 100 IPs used by the crawler and blocked them all, there is nothing to stop the crawler from using IP #101 next month. Once the subdomain's content is located and indexed, it will be a headache fixing the issue.
The best solution is always going to be a noindex meta tag on the pages you do not wish to be indexed. If that method is too much work or otherwise undesirable, you can use the robots.txt solution. There is no circumstance I can imagine where you would modify your htaccess file to block googlebot.
-
Hi Matt.
Perhaps I misunderstood the question but I believe Toren only wishes to prevent the subdomain from being indexed. If you restrict subdomain access by IP it would prevent visitors from accessing the content which I don't believe is the goal.
-
Interesting, hadn't thought of using htaccess to block Googlebot.Thanks for the suggestion.
-
Thanks Ryan. So you don't see any issues with de-indexing the main site if I created a second robots.txt file, e.g.
http://staging.domin.com/robots.txt
User-agent: *
Disallow: /That was my initial thought but when Google announced they consider sub-domains part of the TLD I was afraid it might affect the htp://www.domain.com versions of the pages. So you're saying the subdomain is basically treated like a folder you block on the primary domain?
-
Use an .htaccess file to only allow from certain ip addresses or ranges.
Here is an article describing how: http://www.kirupa.com/html5/htaccess_tricks.htm
-
What is the best method to block a sub-domain, e.g. staging.domain.com/ from getting indexed?
Place a robots.txt file in the root of the subdomain.
User-agent: *
Disallow: /This method will block the subdomain while leaving your primary domain unaffected.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Google is indexing our old domain
We changed our primary domain from vivitecsolutions.com to vivitec.net. Google is indexing our new domain, but still has our old domain indexed too. The problem is that the old site is timing out because of the https: Thought on how to make the old indexing go away or properly forward the https?
Technical SEO | | AdsposureDev0 -
Moving to old site to new domain sub directory
Hi, we've moved our old site to a new domain but in a subdirectory (the shopping site has been consolidated into overarching company website's shopping section, thus the move to sub dir). Are 301 redirects from old URLs to new domain's subdirectory ex newsite.com/shopping/page-1/ sufficient for site migration? I wasn't able to use Google's site address change tool since we're moving to a subdirectory on the new domain. Thanks
Technical SEO | | SoulSurfer80 -
Why blocking a subfolder dropped indexed pages with 10%?
Hy Guys, maybe you can help me to understand better: on 17.04 I had 7600 pages indexed in google (WMT showing 6113). I have included in the robots.txt file, Disallow: /account/ - which contains the registration page, wishlist, etc. and other stuff since I'm not interested to rank with registration form. on 23.04 I had 6980 pages indexed in google (WMT showing 5985). I understand that this way I'm telling google I don't want that section indexed, by way so manny pages?, Because of the faceted navigation? Cheers
Technical SEO | | catalinmoraru0 -
Multiple sub domain appearing
Hi Everyone, Hope were well!. Have a strange one!!. New clients website http://www.allsee-tech.com. Just found out he is appearing for every subdomain possible. a.alsee-tech.com b.allsee-tech.com. I have requested htaccess as this is where I think the issue lies but he advises there isn't anything out of place there. Any ideas in case it isn't? Regards Neil
Technical SEO | | nezona0 -
Staging & Development areas should be not indexable (i.e. no followed/no index in meta robots etc)
Hi I take it if theres a staging or development area on a subdomain for a site, who's content is hence usually duplicate then this should not be indexable i.e. (no-indexed & nofollowed in metarobots) ? In order to prevent dupe content probs as well as non project related people seeing work in progress or finding accidentally in search engine listings ? Also if theres no such info in meta robots is there any other way it may have been made non-indexable, or at least dupe content prob removed by canonicalising the page to the equivalent page on the live site ? In the case in question i am finding it listed in serps when i search for the staging/dev area url, so i presume this needs urgent attention ? Cheers Dan
Technical SEO | | Dan-Lawrence0 -
Issue with .uk.com domain
hi i have rockshore.uk.com which is not indexing properly. the internal pages do not show up for the text they have on them, or the title tags. the site is on aekmps shops platform. I understand that a .uk.com is not a proper TLD but i think i have a subdomain of .uk.com Can anyone help? thanks
Technical SEO | | Turkey0 -
What are the best techniques for sub-menu?
Which techniques are "SEO-Friendly" for creating a sub-menu when you have to go hover a menu to discover the sub-menu? Best regards, Jonathan
Technical SEO | | JonathanLeplang0 -
Snagged an Expired Domain Best way to get all the Link Juce?
Found a PR4 domain that had expired and not been renewed. Best way to get all the link juice from it? just 301 the whole thing to MY main domain?
Technical SEO | | bozzie3110