Robots.txt for subdomain
-
Hi there Mozzers!
I have a subdomain with duplicate content, and I'd like to remove its pages from the mighty Google index. The problem is that the website is built in Drupal and this subdomain does not have its own robots.txt.
So I want to ask how to disallow and noindex this subdomain. Is it possible to add this to the root robots.txt?
User-agent: *
Disallow: /subdomain.root.nl/

User-agent: Googlebot
Noindex: /subdomain.root.nl/

Thank you in advance!
Partouter
-
A robots.txt file applies only to the subdomain where it is placed.
You need to create a separate robots.txt for each subdomain; Drupal allows this.
It must be located in the root directory of your subdomain (e.g. /public_html/subdomain/) and be accessible at http://subdomain.root.nl/robots.txt.
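To illustrate why the root file cannot cover the subdomain, here is a minimal Python sketch of how the governing robots.txt URL follows from a page's scheme and hostname. The helper name robots_txt_url is hypothetical and exists only for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL that applies to page_url (hypothetical helper)."""
    parts = urlsplit(page_url)
    # Only the scheme and host decide which file applies;
    # the path, query, and fragment of the page are dropped.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# A page on the subdomain is governed by the subdomain's own file...
print(robots_txt_url("http://subdomain.root.nl/some/page"))
# ...while a path on the root domain is governed by the root file.
print(robots_txt_url("http://root.nl/subdomain.root.nl/"))
```

Under this sketch, a rule such as Disallow: /subdomain.root.nl/ in the root file would only ever match a path on root.nl, not the subdomain itself.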
Add the following lines to the robots.txt file:
User-agent: *
Disallow: /
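If you want to sanity-check these two lines before deploying them, one option is Python's standard urllib.robotparser, fed the rules as a string instead of fetching them. This is only a rough sketch: the is_blocked helper and the sample page URLs are made up for illustration, and Python's parser is not guaranteed to match Googlebot's behaviour in every edge case.

```python
from urllib.robotparser import RobotFileParser

# The rules suggested above, as they would appear in the subdomain's robots.txt.
RULES = """\
User-agent: *
Disallow: /
"""


def is_blocked(rules: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may not fetch `url` under `rules` (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return not parser.can_fetch(agent, url)


# Sample URLs are assumptions; substitute real pages from the subdomain.
for url in ("http://subdomain.root.nl/", "http://subdomain.root.nl/node/1"):
    print(url, "blocked" if is_blocked(RULES, url) else "allowed")
```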
As an alternative, you can use the robots <META> tag on each page, or redirect the subdomain to the directory root.nl/subdomain and disallow that directory in the main robots.txt. Personally, I don't recommend it.
-
Not sure how your server is configured, but mine is set up so that subdomain.mydomain.com is a subdirectory, like this:
http://www.mydomain.com/subdomain/
In robots.txt you would simply need to put:
User-agent: *
Disallow: /subdomain/
Others may have a better way, though.
HTH
Steve