Robots.txt for subdomain
-
Hi there Mozzers!
I have a subdomain with duplicate content and I'd like to remove these pages from the mighty Google index. The problem is: the website is build in Drupal and this subdomain does not have it's own robots.txt.
So I want to ask you how to disallow and noindex this subdomain. Is it possible to add this to the root robots.txt:
User-agent: *
Disallow: /subdomain.root.nl/User-agent: Googlebot
Noindex: /subdomain.root.nl/Thank you in advance!
Partouter
-
Robots.txt work only for subdomain where it placed.
You need to create separate robots.txt for each sub-domain, Drupal allow this.
it must be located in the root directory of your subdomain Ex: /public_html/subdomain/ and can be accessed at http://subdomain.root.nl/robots.txt.
Add the following lines in the robots.txt file:
User-agent: *
Disallow: /
As alternative way you can use Robots <META> tag on each page, or use redirect to directory root.nl/subdomain and disallow it in main robots.txt. Personally i don't recommend it. -
Not sure how your server is configured but mine is set up so that subdomain.mydomain.com is a subdirectory like this:
http://www.mydomain.com/subdomain/
in robots.txt you would simply need to put
User-agent: *
Disallow: /subdomain/Others may have a better way though.
HTH
Steve
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Subdomain redirect
Hey guys, I was thinking about creating subdomains for one of my websites. I want to divide my website in different subdomains (blog.[site].com / directory.[site].com / etc.) but I'm afraid that this will negatively impact my rankings. My blog for example has a lot of supporting content for my products and services that are primarily hosted on the homepage. Have you guys ever created subdomains at a later stage of your website's existence? What kind of impact did you notice? Would you recommend it? Thanks a million!
Technical SEO | | Nizar.1 -
Clarification regarding robots.txt protocol
Hi,
Technical SEO | | nlogix
I have a website , and having 1000 above url and all the url already got indexed in Google . Now am going to stop all the available services in my website and removed all the landing pages from website. Now only home page available . So i need to remove all the indexed urls from Google . I have already used robots txt protocol for removing url. i guess it is not a good method for adding bulk amount of urls (nearly 1000) in robots.txt . So just wanted to know is there any other method for removing indexed urls.
Please advice.0 -
Robots txt. in page with 301 redirect
We currently have a a series of help pages that we would like to disallow from our robots txt. The thing is that these help pages are located in our old website, which now has a 301 redirect to current site. Which is the proper way to go around? 1- Add the pages we want to disallow to the robots.txt of the new website? 2- Break the redirect momentarily and add the pages to the robots.txt of the old one? Thanks
Technical SEO | | Kilgray0 -
SEO credit for subdomain blogs?
One of my clients is currently running a webstore through Volusions. They would like to add a blog to their website, but since Volusions doesn't currently support blogs on the same domain we would have to create a Wordpress blog and link it to a subdomain. (http://support.volusion.com/article/linking-your-blog-your-volusion-store) Using this method, will their primary website receive any SEO credit for the content being created on the blog or will it only count towards the subdomain? Thanks!
Technical SEO | | CMSSolutions980 -
Best use of robots.txt for "garbage" links from Joomla!
I recently started out on Seomoz and is trying to make some cleanup according to the campaign report i received. One of my biggest gripes is the point of "Dublicate Page Content". Right now im having over 200 pages with dublicate page content. Now.. This is triggerede because Seomoz have snagged up auto generated links from my site. My site has a "send to freind" feature, and every time someone wants to send a article or a product to a friend via email a pop-up appears. Now it seems like the pop-up pages has been snagged by the seomoz spider,however these pages is something i would never want to index in Google. So i just want to get rid of them. Now to my question I guess the best solution is to make a general rule via robots.txt, so that these pages is not indexed and considered by google at all. But, how do i do this? what should my syntax be? A lof of the links looks like this, but has different id numbers according to the product that is being send: http://mywebshop.dk/index.php?option=com_redshop&view=send_friend&pid=39&tmpl=component&Itemid=167 I guess i need a rule that grabs the following and makes google ignore links that contains this: view=send_friend
Technical SEO | | teleman0 -
Subdomains & CDNs
I've set up a CDN to speed up my domain. I've set up a CNAME to map the subdomain cdn.example.com to the URL where the CDN hosts my static content (images, CSS and JS files, and PDFs). www.example.com and cdn.example.com are now two different IP addresses. Internal links to my PDF files (white papers and articles) used to be www.example.com/downloads but now they are cdn.example.com/downloads The same PDF files can be accessed at both the www and the cdn. subdomain. Thus, external links to the www version will continue to work. Question 1: Should I set up 301 redirects in .htaccess such as: Redirect permanent /downloads/filename.pdf http://cdn.example.com/downloads/filename.pdf Question 2: Do I need to do anything else in my .htaccess file (or anywhere else) to ensure that any SEO benefit provided by the PDF files remains associated with my domain? Question 3: Am I better off keeping my PDF files on the www side and off of the CDN? Thanks, Akira
Technical SEO | | ahirai0 -
Same URL in "Duplicate Content" and "Blocked by robots.txt"?
How can the same URL show up in Seomoz Crawl Diagnostics "Most common errors and warnings" in both the "Duplicate Content"-list and the "Blocked by robots.txt"-list? Shouldnt the latter exclude it from the first list?
Technical SEO | | alsvik0 -
Do you get credit for an external link that points to a page that's being blocked by robots.txt
Hi folks, No one, including me seems to actually know what happens!? To repeat: If site A links to /home.html on site B and site B blocks /home.html in Robots.txt, does site B get credit for that link? Does the link pass PageRank? Will Google still crawl through it? Does the domain get some juice, but not the page? I know there's other ways of doing this properly, but it is interesting no?
Technical SEO | | DaveSottimano0