Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
What is the best method to block a sub-domain, e.g. staging.domain.com/ from getting indexed?
-
Now that Google considers subdomains as part of the TLD, I'm a little leery of testing robots.txt on staging.domain.com with something like:
User-agent: *
Disallow: /
for fear that it might get www.domain.com blocked as well. Has anyone had any success using robots.txt to block sub-domains? I know I could add a meta robots tag to the staging.domain.com pages, but that would require a lot more work.
-
Just make sure that when/if you copy the staging site over to the live domain, you don't also copy the robots.txt, .htaccess, or whatever other means you use to block that site from being indexed. Otherwise your shiny new site will be blocked too.
-
I agree. The name of your subdomain being "staging" didn't register at all with me until Matt brought it up. I was offering a generic response to the subdomain question whereas I believe Matt focused on how to handle a staging site. Interesting viewpoint.
-
Matt/Ryan-
Great discussion, thanks for the input. staging.domain.com is just one of the sub-domains we don't want indexed. Some of them still need to be accessed by the public; some, like staging, could be restricted to specific IPs.
I realize after your discussion that I probably should have used a different example of a sub-domain. On the other hand, it might not have sparked the discussion, so maybe it was a good example.
-
.htaccess files can be placed at any directory level of a site so you can do it for just the subdomain or even just a directory of a domain.
-
Staging URLs are typically only used for testing, so rather than a deny rule I would recommend a specific allow rule covering only the IP addresses that should have access.
I would imagine you don't want it indexed because you don't want the rest of the world knowing about it.
You can also use .htaccess to require a username and password. It is simple, and you can give the credentials to clients if that is a concern/need.
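To make the allow-list idea concrete, here is a minimal Python sketch of the logic: deny by default, allow only listed addresses. It is not .htaccess syntax and not anyone's actual server config. The `ALLOWED_NETWORKS` ranges and the `ip_is_allowed` helper are hypothetical, and the addresses come from ranges reserved for documentation.

```python
from ipaddress import ip_address, ip_network

# Hypothetical allow-list (documentation-only ranges, not real office IPs).
ALLOWED_NETWORKS = [
    ip_network("203.0.113.0/24"),   # assumed office range
    ip_network("198.51.100.7/32"),  # assumed single client machine
]


def ip_is_allowed(client_ip: str) -> bool:
    """Return True only if client_ip falls inside one of the allowed networks."""
    addr = ip_address(client_ip)
    return any(addr in net for net in ALLOWED_NETWORKS)


if __name__ == "__main__":
    for ip in ("203.0.113.42", "198.51.100.7", "192.0.2.1"):
        print(ip, "allowed" if ip_is_allowed(ip) else "denied")
```

Anything not on the list is refused, crawlers included, which is why this suits a private staging site but not a sub-domain the public still needs to reach.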
-
Correct.
-
Toren, I would not recommend that solution. There is nothing to prevent Googlebot from crawling your site via almost any IP. If you found 100 IPs used by the crawler and blocked them all, there is nothing to stop the crawler from using IP #101 next month. Once the subdomain's content is located and indexed, it will be a headache fixing the issue.
The best solution is always going to be a noindex meta tag on the pages you do not wish to be indexed. If that method is too much work or otherwise undesirable, you can use the robots.txt solution. There is no circumstance I can imagine where you would modify your .htaccess file to block Googlebot.
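If you go the meta tag route, a script can take some of the manual work out of spot-checking pages. Here is a rough Python sketch, assuming the tag takes the usual form `<meta name="robots" content="noindex">`. The class and function names are made up for illustration, and it only inspects HTML you pass in; it does not fetch anything.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            # content may hold several comma-separated directives
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the page carries a robots meta tag containing noindex."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
    print(has_noindex(page))
```

Keep in mind that a crawler has to be able to fetch a page to see its noindex tag, so a page that is also disallowed in robots.txt will not have the tag read.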
-
Hi Matt.
Perhaps I misunderstood the question but I believe Toren only wishes to prevent the subdomain from being indexed. If you restrict subdomain access by IP it would prevent visitors from accessing the content which I don't believe is the goal.
-
Interesting, hadn't thought of using .htaccess to block Googlebot. Thanks for the suggestion.
-
Thanks Ryan. So you don't see any issues with de-indexing the main site if I created a second robots.txt file, e.g.
http://staging.domain.com/robots.txt
User-agent: *
Disallow: /
That was my initial thought, but when Google announced they consider sub-domains part of the TLD I was afraid it might affect the http://www.domain.com versions of the pages. So you're saying the subdomain is basically treated like a folder you block on the primary domain?
-
Use an .htaccess file to only allow access from certain IP addresses or ranges.
Here is an article describing how: http://www.kirupa.com/html5/htaccess_tricks.htm
-
What is the best method to block a sub-domain, e.g. staging.domain.com/ from getting indexed?
Place a robots.txt file in the root of the subdomain.
User-agent: *
Disallow: /
This method will block the subdomain while leaving your primary domain unaffected.
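For anyone who wants to sanity-check this before deploying, here is a small Python sketch using the standard library's `urllib.robotparser`. It assumes the main site has no Disallow rules of its own and models each host with its own rule set, since robots.txt is fetched per host. The `rules_by_host` table and `crawl_allowed` helper are hypothetical names. It only tests crawl permission against the rules as written, not what any search engine has actually indexed.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt from the answer above, served only at staging.domain.com/robots.txt.
STAGING_ROBOTS = """\
User-agent: *
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a rule set without making a network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# One rule set per host: each host's robots.txt applies to that host only.
rules_by_host = {
    "staging.domain.com": build_parser(STAGING_ROBOTS),
    "www.domain.com": build_parser(""),  # assumption: no rules on the main site
}


def crawl_allowed(host: str, path: str, agent: str = "Googlebot") -> bool:
    """Check whether the given agent may crawl path on host under that host's rules."""
    return rules_by_host[host].can_fetch(agent, f"http://{host}{path}")


if __name__ == "__main__":
    print("staging:", crawl_allowed("staging.domain.com", "/any-page"))
    print("www:", crawl_allowed("www.domain.com", "/any-page"))
```

The staging rules never touch the `www.domain.com` entry, which mirrors the point above: the file in the subdomain's root governs that subdomain only.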