Robots.txt disallow subdomain

Partouter

Hi all,

I have a development subdomain, which gets copied to the live domain. Because I don't want this dev domain to get crawled, I'd like to implement a robots.txt for this domain only. The problem is that I don't want this robots.txt to disallow the live domain. Is there a way to create a robots.txt for this development subdomain only?

Thanks in advance!

oznappies

I would suggest talking to your developers, as Theo suggests, about excluding visitors from your test site.

Partouter

The copying is a manual process, and I don't want to take any risks with the live environment. An HttpHandler for robots.txt could be a solution, and I'm going to discuss this with one of our developers. Other suggestions are still welcome, of course!

oznappies

Do you copy one domain to the other over FTP? If this is a manual process, then keeping the test domain's robots.txt off the live site would be as simple as excluding that file from the copy.

If you automate the copy and want the code to behave differently depending on the host name, you could create an HttpHandler for robots.txt that delivers a different version based on the Host header of the HTTP request.
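To illustrate the idea of serving robots.txt by host, here is a minimal sketch in Python rather than as an ASP.NET HttpHandler. The host name `dev.example.com`, the `DEV_HOSTS` set, and the function names are assumptions made up for this example, not anything from the actual site. The same code can sit on both the development and live sites, because the response depends only on the Host header of the request.

```python
# Hypothetical development host names; replace with your own.
DEV_HOSTS = {"dev.example.com"}

# Blocks all compliant crawlers from the whole site.
DISALLOW_ALL = "User-agent: *\nDisallow: /\n"
# An empty Disallow value allows everything.
ALLOW_ALL = "User-agent: *\nDisallow:\n"


def robots_txt_for_host(host_header):
    """Return the robots.txt body for the given Host header value."""
    # Drop an optional ":port" suffix and normalise case.
    # (Assumes a plain host name, not a bracketed IPv6 literal.)
    host = (host_header or "").split(":")[0].strip().lower()
    return DISALLOW_ALL if host in DEV_HOSTS else ALLOW_ALL


def app(environ, start_response):
    """Tiny WSGI app that answers /robots.txt based on the request host."""
    if environ.get("PATH_INFO") == "/robots.txt":
        body = robots_txt_for_host(environ.get("HTTP_HOST")).encode("utf-8")
        headers = [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ]
        start_response("200 OK", headers)
        return [body]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"Not found"]
```

Unknown or missing hosts fall through to the permissive version, so a misconfiguration should never block the live domain by accident. Keep in mind that robots.txt only asks crawlers to stay away; it does not keep human visitors out.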

Theo-NL

You could use environment variables (for example in your env.ini or config.ini file) that are set to DEVELOPMENT, STAGING, or LIVE depending on the environment the code is running in.

With the exact same code, your website would either restrict access to certain IP addresses (in the development environment) or allow all IP addresses (in the live environment). With this setup you can also vary other settings per environment, such as the level of detail shown in your error reporting or whether the site connects to a testing database rather than the live one.
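A rough sketch of that setup in Python is shown below. The `[app]` section, the `environment` key, the `SETTINGS` table, and the `203.0.113.0/24` network are all assumptions invented for the example; the point is only that the config file differs per environment while the code stays identical.

```python
import configparser
import ipaddress

# Hypothetical per-environment settings. None means "no IP restriction".
SETTINGS = {
    "DEVELOPMENT": {"allowed_networks": ["203.0.113.0/24"], "show_errors": True},
    "STAGING": {"allowed_networks": ["203.0.113.0/24"], "show_errors": True},
    "LIVE": {"allowed_networks": None, "show_errors": False},
}


def load_environment(ini_text):
    """Read the environment name from INI text such as a config.ini file."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    name = parser.get("app", "environment").strip().upper()
    if name not in SETTINGS:
        raise ValueError("unknown environment: " + name)
    return name


def is_allowed(environment, client_ip):
    """Return True if client_ip may view the site in this environment."""
    networks = SETTINGS[environment]["allowed_networks"]
    if networks is None:
        return True  # live site: open to everyone, including crawlers
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(net) for net in networks)
```

For example, with a config.ini containing `[app]` and `environment = DEVELOPMENT`, a request from outside the listed network would be refused, which keeps out search engine crawlers as well as casual visitors. The config file itself has to be excluded from the copy to live, or maintained separately on each server.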

[This was supposed to be a reply, but I accidentally clicked the wrong button. Hitting 'Delete reply' results in an error.]

Partouter

Thanks for your quick reply, Theo. Unfortunately, the .htpasswd file will also get copied to the live environment, so our live websites would end up password protected. Could there be any other solution for this?

Theo-NL

I'm sure there is, but I'm guessing you don't want any human visitors to go to your development subdomain and view what is being done there as well? I'd suggest you either limit the visitors that have access by IP address (thereby effectively blocking out Google in one move) and/or implement a .htpasswd solution where developers can log in with their credentials to your development area (which blocks out Google as well).
