Subdomain Removal in Robots.txt with Conditional Logic??

ErnieB

Is there a way to add conditional logic to the robots.txt file, so that when we push from DEV to PRODUCTION we don't have to remember either to leave robots.txt out of the push or to edit it once it goes live?

My specific situation is this:

I have www.website.com, dev.website.com and new.website.com, and somehow Google has indexed dev.website.com and new.website.com. I'd like these removed from Google's index because they are causing duplicate content.

Should I:

a) Add two new GWT entries for dev.website.com and new.website.com and verify ownership? If I do this, then when the files are pushed to LIVE, won't they still contain the verification meta tag for the DEV version even though they are now live? (Hope that makes sense.)

b) Write a robots.txt file that specifies "Disallow: dev.website.com/"? Is that possible? I have only seen examples of Disallow where the value starts with a "/".

Hope this makes sense, can really use the help! I'm on a Windows Server 2008 box running ColdFusion websites.

KeriMorgret

Here's how I dealt with a similar situation in the past.

I put a robots.txt on each of the dev subdomains and on the live domain. Each dev subdomain's robots.txt excluded the entire subdomain, and the subdomains were verified in GWT and removed from the index as needed.

I made the live subdomain's robots.txt read-only so it didn't get overwritten. I should have made the dev subdomains' robots.txt files read-only as well, since they sometimes got refreshed with the live content (there was a UGC database that would occasionally get copied to a dev subdomain, and the live robots.txt would get copied over too, so the dev subdomain ended up indexed).

I set up a code monitor that checks the contents of all of the robots.txt files daily and sends me an email if anything has changed.
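For anyone who wants to build a monitor like this, here is a minimal sketch in Python, not the tool I actually used. It assumes the three hostnames from the question; the function names (`fingerprint`, `fetch`, `changed_urls`) are made up for illustration, and storing yesterday's fingerprints and sending the email are left out.

```python
import hashlib
import urllib.request

# Hypothetical list of robots.txt URLs to watch (hostnames taken from the question).
ROBOTS_URLS = [
    "http://www.website.com/robots.txt",
    "http://dev.website.com/robots.txt",
    "http://new.website.com/robots.txt",
]


def fingerprint(body: bytes) -> str:
    """Return a SHA-256 hex digest of a robots.txt body."""
    return hashlib.sha256(body).hexdigest()


def fetch(url: str) -> bytes:
    """Download the raw bytes of one robots.txt file."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read()


def changed_urls(previous: dict, current: dict) -> list:
    """Return the URLs whose fingerprint differs from the stored one.

    A URL that has no stored fingerprint yet is also reported.
    """
    return sorted(
        url for url, digest in current.items() if previous.get(url) != digest
    )
```

A daily scheduled task would build `current` with `{url: fingerprint(fetch(url)) for url in ROBOTS_URLS}`, compare it to the saved copy from the previous run with `changed_urls`, and send an email if the returned list is not empty.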

Not perfect, but I was at least able to catch changes soon after they happened, and prevented a few changes.

irvingw

You can't put logic in robots.txt, and subdomains are seen as different sites, so you need to create a separate robots.txt file for each subdomain and block each subdomain in its own robots.txt file.
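Disallow values are URL paths on the host that serves the file, not hostnames, which is why every example starts with a "/". If keeping separate static files in sync is the pain point, one possible workaround is to have the web server generate the robots.txt response based on the requested hostname, so the same code can be pushed everywhere. The sketch below shows only the selection logic, in Python for illustration (the same idea could be written in ColdFusion); the hostnames come from the question and `robots_for_host` is a hypothetical helper, not part of any library.

```python
# Hostnames that should be kept out of search engines (assumed from the question).
BLOCKED_HOSTS = {"dev.website.com", "new.website.com"}

# robots.txt body that blocks all crawlers from the whole host.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

# robots.txt body that allows everything (an empty Disallow blocks nothing).
ALLOW_ALL = "User-agent: *\nDisallow:\n"


def robots_for_host(host: str) -> str:
    """Pick the robots.txt body for the hostname in the request's Host header."""
    # Drop an optional ":port" suffix and normalize case before comparing.
    name = host.split(":")[0].strip().lower()
    return BLOCK_ALL if name in BLOCKED_HOSTS else ALLOW_ALL
```

With this approach the live site would get the allow-all body and the dev and new subdomains the block-all body, whichever copy of the code is deployed.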

You'll also need to add the Google verification code to each subdomain and verify it. Then, in GWMT, you can request to have the subdomain removed from Google's index. That's the fastest way.
