Block an entire subdomain with robots.txt?
-
Is it possible to block an entire subdomain with robots.txt?
I write for a blog that has its root domain as well as a subdomain pointing to the exact same IP. Getting rid of the subdomain is not an option, so I'd like to explore other ways to avoid duplicate content. Any ideas?
-
Awesome! That did the trick -- thanks for your help. The site is no longer listed.
-
Fact is, the robots.txt file alone will never work (the link has a good explanation why -- short version: it only stops bots from crawling the pages again; it doesn't remove what's already in the index).
Best to request removal and then wait a few days.
-
Yeah. As of yet, the site has not been de-indexed. We placed the conditional rule in .htaccess and are now serving different robots.txt files for the domain and subdomain -- so that works. But I've never done this before, so I don't know how long it's supposed to take.
I'll verify the subdomain in Webmaster Tools to speed up the process. Thanks.
-
You should submit a removal request in Google Webmaster Tools. You have to verify the subdomain first, then request the removal.
See this post on why the robots.txt file alone won't work:
http://www.seomoz.org/blog/robot-access-indexation-restriction-techniques-avoiding-conflicts
-
Awesome. We used your second idea and so far it looks like it is working exactly how we want. Thanks for the idea.
Will report back to confirm that the subdomain has been de-indexed.
-
Option 1 could come with a small performance hit if you have a lot of .txt files being served by the server.
There shouldn't be any negative side effects to option 2 as long as the rewrite is clean (i.e. not accidentally a redirect) and the content of the two files is robots-compliant.
Good luck!
-
Thanks for the suggestion. I'll definitely have to do a bit more research into this one to make sure that it doesn't have any negative side effects before implementation
-
We have a plugin right now that places canonical tags, but unfortunately, the canonical for the subdomain points to the subdomain. I'll look around to see if I can tweak the settings
-
Sounds like (from other discussions) you may be stuck requiring a dynamic robots.txt file which detects which domain the bot is on and changes the content accordingly. This means the server has to run all .txt files as (I presume) PHP.
Or, you could conditionally rewrite the /robots.txt URL to a different file according to the sub-domain:
RewriteEngine on
RewriteCond %{HTTP_HOST} ^subdomain\.website\.com$
RewriteRule ^robots\.txt$ robots-subdomain.txt [L]
Then add:
User-agent: *
Disallow: /
to the robots-subdomain.txt file.
(untested)
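For what it's worth, the dynamic approach boils down to one decision: inspect the Host header and emit different rules. A minimal sketch of that logic (shown in Python purely for illustration -- in practice this would be the PHP script mentioned above, and the hostnames are placeholders):

```python
# Return the robots.txt body appropriate for the requested hostname.
# "subdomain.website.com" is a placeholder for the real subdomain.
def robots_for_host(host: str) -> str:
    host = host.lower().split(":")[0]  # normalize case, strip any port
    if host == "subdomain.website.com":
        # Block everything on the subdomain
        return "User-agent: *\nDisallow: /\n"
    # Empty Disallow means "allow everything" on the main domain
    return "User-agent: *\nDisallow:\n"

print(robots_for_host("subdomain.website.com"))
print(robots_for_host("www.website.com"))
```

The same if/else is all the PHP version would need; everything else is just wiring the server to route /robots.txt requests to that script.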
-
Placing canonical tags isn't an option? Detect that the page is being viewed through the subdomain and, if so, write the canonical tag on the page back to the root domain.
Or, just place a canonical tag on every page pointing back to the root domain (so the subdomain and root domain pages would both have them). Apparently it's OK to have a canonical tag on a page pointing to itself. I haven't tried this, but if Matt Cutts says it's ok...
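To sketch that second idea: build the canonical URL from a fixed root host rather than from whatever hostname served the page, so both the root-domain and subdomain copies of a page emit the same tag. (Python for illustration only -- in a WordPress setup this would live in the theme or plugin as PHP, and the hostname is a placeholder.)

```python
# Canonicalize every page to the root domain, ignoring the serving host.
# "www.website.com" is a placeholder for the real root domain.
ROOT_HOST = "www.website.com"

def canonical_url(path: str) -> str:
    return f"http://{ROOT_HOST}{path}"

def canonical_tag(path: str) -> str:
    return f'<link rel="canonical" href="{canonical_url(path)}" />'

# Same tag whether the request came in on the root domain or the subdomain
print(canonical_tag("/some-post/"))
```

Because the serving host is never consulted, the root-domain page ends up with a self-referencing canonical (which is fine, per the above) and the subdomain page points back to the root.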
-
Hey Ryan,
I wasn't directly involved in the decision to create the subdomain, but I'm told it was necessary in order to bypass certain elements that were affecting the root domain.
Nevertheless, it is a blog, and users now need to log in through the subdomain in order to access the WordPress backend and bypass those elements. Traffic for the site still goes to the root domain.
-
They both point to the same location on the server? So there's not a different folder for the subdomain?
If that's the case, then I suggest adding a rule to your .htaccess file to 301 the subdomain back to the main domain, in exactly the same way people redirect from non-www to www or vice versa. However, you should ask why the server is configured to serve a duplicate subdomain at all; you might just edit your Apache settings to get rid of that subdomain (usually done through a cPanel interface).
Here is what your htaccess might look like:
<IfModule mod_rewrite.c>
RewriteEngine on
# Redirect non-www to www
RewriteCond %{HTTP_HOST} !^www\.mydomain\.org [NC]
RewriteRule ^(.*)$ http://www.mydomain.org/$1 [R=301,L]
</IfModule>
-
Not to me LOL
I think you'll need someone with a bit more expertise in this area than I have to assist in this case. Kyle, I'm sorry I couldn't offer more assistance... but I don't want to tell you something if I'm not 100% sure. I suspect one of the many bright SEOmozers will quickly come to the rescue on this one.
Andy
-
Hey Andy,
Herein lies the problem. Since the domain and subdomain point to the exact same place, they both utilize the same robots.txt file.
Does that make sense?
-
Hi Kyle
Yes, you can block an entire subdomain via robots.txt, however you'll need to create a robots.txt file and place it in the root of the subdomain, then add the code to direct the bots to stay away from the entire subdomain's content:
User-agent: *
Disallow: /
Hope this helps!