How does robots.txt affect aliased domains?

michaelj_me

Several of my sites are aliased: they are hosted in subdirectories off the root domain on a single hosting account, but are visible at their own domains, such as www.SubDirectorySite.com. Not ideal, I know, but that's a different issue.

I want to block bots from viewing those files that are accessible in subdirectories on the main hosting account, www.RootDomain.com/SubDirectorySite/, and force the bots to look at www.SubDirectorySite.com instead.

I used the canonical tag (rel="canonical") to point bots away from the subdirectory copy of each site, but I am wondering what will happen if I also use robots.txt on the root domain to block those subdirectories.

Will the bots, specifically Googlebot, still index the site at its own URL, www.AnotherSite.com, even if I've blocked that directory with Disallow: /AnotherSite/?

THANK YOU!!!

Dr-Pete

I'm assuming you can't 301-redirect (and that you still need the sub-directory versions to be reachable by humans)? I'm not sure the cross-domain canonical will work completely. I don't have a good example of a sub-folder to root domain canonical implementation. If the "sites" are identical, it should be ok.
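
For illustration only, here is a minimal Python sketch of the mapping a cross-domain canonical would need to express: each sub-directory URL on the root domain points at the matching URL on the aliased domain. The `ALIASES` table and the function names are hypothetical, and the hostnames are the placeholders from the question, not a real setup.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

# Hypothetical mapping: sub-directory prefix on the root domain -> aliased host.
ALIASES = {"/AnotherSite/": "www.AnotherSite.com"}


def canonical_url(url):
    """Return the aliased-domain URL for a sub-directory URL, else the URL unchanged."""
    parts = urlsplit(url)
    for prefix, host in ALIASES.items():
        if parts.path.startswith(prefix):
            # Drop the sub-directory prefix; the aliased site serves it as "/".
            new_path = "/" + parts.path[len(prefix):]
            return urlunsplit((parts.scheme, host, new_path, parts.query, ""))
    return url


def canonical_tag(url):
    """Build the <link rel="canonical"> element for a page at `url`."""
    return '<link rel="canonical" href="%s">' % escape(canonical_url(url), quote=True)


if __name__ == "__main__":
    print(canonical_tag("http://www.RootDomain.com/AnotherSite/about.html"))
```

Because both hostnames serve the same files, the same tag ends up on both copies of a page; on the aliased domain it simply points at itself.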

Whether Robots.txt helps is going to depend a bit on how people reach the sub-directory versions. If there are links to them, then blocking will cut off that link-juice (and the canonical or a 301 will be better).

Blocking the sub-directory shouldn't automatically block the domain it aliases, too, unless for some reason that sub-directory is the only crawl path Google has to the outside domain. As long as they're crawling the outside domain separately, I think you'll be ok. I'm just not sure if Robots.txt is necessary here.
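
A rough way to see why: robots.txt rules are evaluated per hostname and matched against the URL path on that host. The sketch below uses Python's standard `urllib.robotparser` to check the placeholder URLs from the question. The two robots.txt bodies are assumptions made for the example, in particular that the aliased domain serves its own permissive robots.txt, and `is_allowed` is a hypothetical helper.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

# Assumed contents of http://www.RootDomain.com/robots.txt
ROOT_ROBOTS = """\
User-agent: *
Disallow: /AnotherSite/
"""

# Assumed contents of http://www.AnotherSite.com/robots.txt (nothing blocked)
ALIAS_ROBOTS = """\
User-agent: *
Disallow:
"""


def build_parser(text):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


# One rule set per hostname, mirroring how crawlers fetch /robots.txt per host.
PARSERS = {
    "www.rootdomain.com": build_parser(ROOT_ROBOTS),
    "www.anothersite.com": build_parser(ALIAS_ROBOTS),
}


def is_allowed(url, agent="Googlebot"):
    """Check `url` against the robots.txt of its own host only."""
    host = urlsplit(url).hostname  # .hostname is already lower-cased
    return PARSERS[host].can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.RootDomain.com/AnotherSite/page.html",
        "http://www.AnotherSite.com/page.html",
    ):
        print(url, "allowed" if is_allowed(url) else "blocked")
```

Note that on the aliased domain the same page has the path /page.html, not /AnotherSite/page.html, so the Disallow: /AnotherSite/ rule would not match it even if both hostnames happened to serve the same robots.txt file.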

Sorry, the devil may be in the details on this one. Happy to take a closer look in Private Q&A, if you want to give out some specifics.
