Subdomain Robots.txt
-
I have a subdomain (a blog) whose tag and category pages are being indexed when they should not be, because they create duplicate content. Can I block them using a robots.txt file? Do I need a separate robots.txt file for my subdomain?
If so, how would I format it? Do I need to specify that it is a subdomain robots file, or will the search engines automatically pick this up?
Thanks!
-
Thanks Wissam. I was thinking this was the way to go, and I appreciate your input.
I do use the Yoast SEO plugin for WordPress on another site, but the blog in question runs on BlogEngine. I will do what you have suggested.
Cheers!
-
If the URL is http://blog.website.com,
then the robots.txt file should be accessible at http://blog.website.com/robots.txt.
I would suggest these steps:
- Verify your blog in Google Webmaster Tools.
- Generate a robots.txt file with Google Webmaster Tools.
- Upload it to the subdomain.
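As a rough illustration of the steps above: crawlers request robots.txt separately for each hostname, so the file at http://blog.website.com/robots.txt applies only to the blog and needs no extra declaration. The Python sketch below uses the standard-library `urllib.robotparser` to check a draft file before uploading it. The `/category/` and `/tag/` paths and the sample post URL are assumptions for illustration; substitute the URL patterns the blog actually uses.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser


def robots_url(page_url: str) -> str:
    """Return the robots.txt location for the host that serves page_url."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical draft for the blog subdomain. The two paths are assumptions
# and must match the blog's real tag and category URLs.
BLOG_ROBOTS_TXT = """\
User-agent: *
Disallow: /category/
Disallow: /tag/
"""

parser = RobotFileParser()
parser.parse(BLOG_ROBOTS_TXT.splitlines())

# Each hostname has its own robots.txt location.
print(robots_url("http://blog.website.com/category/seo"))

# Check a few sample URLs against the draft rules.
for url in (
    "http://blog.website.com/category/seo",
    "http://blog.website.com/tag/links",
    "http://blog.website.com/post/hello",
):
    print(url, "allowed" if parser.can_fetch("*", url) else "blocked")
```

This parser does simple prefix matching, so it is only a sanity check; a given search engine may interpret wildcard patterns differently.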
There is another way if you are using WordPress.
With the All in One SEO plugin or WordPress SEO by Yoast, you can use the settings to add NOINDEX to all category, tag, author and other archive pages. It's faster and error free.
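To confirm that such a setting took effect, one option is to look for a robots meta tag in the page source. The sketch below is a minimal check using Python's standard-library `html.parser`; the two sample HTML strings are made up for illustration and are not output from either plugin.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for directive in attr.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def has_noindex(html: str) -> bool:
    """Return True if the page carries a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical page sources, for illustration only.
CATEGORY_PAGE = (
    '<html><head><meta name="robots" content="noindex, follow">'
    "</head><body>Category archive</body></html>"
)
POST_PAGE = "<html><head><title>A post</title></head><body>Post</body></html>"

print(has_noindex(CATEGORY_PAGE))
print(has_noindex(POST_PAGE))
```

Note that a crawler has to fetch a page to see this tag, so a URL that is also disallowed in robots.txt will not have its noindex read.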
Related Questions
-
Robots.txt vs. meta noindex, follow
Hi guys, I wonder what your opinion is concerning exclusion via the robots.txt file.
Do you advise to keep using this? For example:
User-agent: *
Disallow: /sale/*
Disallow: /cart/*
Disallow: /search/
Disallow: /account/
Disallow: /wishlist/*
Or do you prefer using the meta tag 'noindex, follow' instead?
I keep hearing different suggestions.
I'm just curious what your opinion / suggestion is.
Regards,
Tom Vledder
Technical SEO | AdenaSEO
-
Robots file set up
The robots file looks like it has been set up in a very messy way.
I understand the # will comment out a line, does this mean the sitemap would not be picked up?
Disallow: /js/ should this be allowed like /*.js$
Disallow: /media/wysiwyg/ - this seems to be causing alerts in webmaster tools as it can not access the images within.
Can anyone help me clean this up please
#Sitemap: https://examplesite.com/sitemap.xml
Crawlers Setup
User-agent: *
Crawl-delay: 10
Allowable Index
Mind that Allow is not an official standard
Allow: /index.php/blog/
Allow: /catalog/seo_sitemap/category/
Allow: /catalogsearch/result/
Allow: /media/catalog/
Directories
Disallow: /404/
Disallow: /app/
Disallow: /cgi-bin/
Disallow: /downloader/
Disallow: /errors/
Disallow: /includes/
Disallow: /js/
Disallow: /lib/
Disallow: /magento/
Disallow: /media/
Disallow: /media/captcha/
Disallow: /media/catalog/
#Disallow: /media/css/
#Disallow: /media/css_secure/
Disallow: /media/customer/
Disallow: /media/dhl/
Disallow: /media/downloadable/
Disallow: /media/import/
#Disallow: /media/js/
Disallow: /media/pdf/
Disallow: /media/sales/
Disallow: /media/tmp/
Disallow: /media/wysiwyg/
Disallow: /media/xmlconnect/
Disallow: /pkginfo/
Disallow: /report/
Disallow: /scripts/
Disallow: /shell/
#Disallow: /skin/
Disallow: /stats/
Disallow: /var/
Paths (clean URLs)
Disallow: /index.php/
Disallow: /catalog/product_compare/
Disallow: /catalog/category/view/
Disallow: /catalog/product/view/
Disallow: /catalog/product/gallery/
Disallow: */catalog/product/upload/
Disallow: /catalogsearch/
Disallow: /checkout/
Disallow: /control/
Disallow: /contacts/
Disallow: /customer/
Disallow: /customize/
Disallow: /newsletter/
Disallow: /poll/
Disallow: /review/
Disallow: /sendfriend/
Disallow: /tag/
Disallow: /wishlist/
Files
Disallow: /cron.php
Disallow: /cron.sh
Disallow: /error_log
Disallow: /install.php
Disallow: /LICENSE.html
Disallow: /LICENSE.txt
Disallow: /LICENSE_AFL.txt
Disallow: /STATUS.txt
Disallow: /get.php # Magento 1.5+
Paths (no clean URLs)
#Disallow: /.js$
#Disallow: /.css$
Disallow: /.php$
Disallow: /?SID=
Disallow: /rss*
Disallow: /*PHPSESSID
Disallow: /:
Disallow: /:*
User-agent: Fatbot
Disallow: /
User-agent: TwengaBot-2.0
Disallow: /
Technical SEO | mcwork
-
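On the `#Sitemap:` line in the question above, a small experiment can show how one parser treats it. The sketch below uses Python's standard-library `urllib.robotparser` (its `site_maps()` method needs Python 3.8 or later) on two cut-down variants of the file. It only demonstrates this parser's behaviour and is an approximation of, not a statement about, how any particular search engine reads the file.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_in(robots_txt: str):
    """Return the Sitemap URLs the standard-library parser finds, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.site_maps()


# Cut-down variants of the file from the question, for illustration.
COMMENTED = """\
#Sitemap: https://examplesite.com/sitemap.xml
User-agent: *
Disallow: /js/
"""

ACTIVE = """\
Sitemap: https://examplesite.com/sitemap.xml
User-agent: *
Disallow: /js/
"""

print(sitemaps_in(COMMENTED))  # the commented-out line is ignored
print(sitemaps_in(ACTIVE))     # the sitemap URL is reported
```

With the leading `#`, this parser treats the whole line as a comment and reports no sitemap; removing the `#` makes the Sitemap line visible to it.
-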
Removed Subdomain Sites Still in Google Index
Hey guys, I've got kind of a strange situation going on and I can't seem to find it addressed anywhere. I have a site that at one point had several development sites set up at subdomains. Those sites have since launched on their own domains, but the subdomain sites are still showing up in the Google index. However, if you look at the cached version of pages on these non-existent subdomains, it lists the NEW url, not the dev one in the little blurb that says "This is Google's cached version of www.correcturl.com." Clearly Google recognizes that the content resides at the new location, so how come the old pages are still in the index? Attempting to visit one of them gives a "Server Not Found" error, so they are definitely gone. This is happening to a couple of sites, one that was launched over a year ago so it doesn't appear to be a "wait and see" solution. Any suggestions would be a huge help. Thanks!!
Technical SEO | SarahLK
-
Should I blow up our subdomain?
Hey folks, several years ago we created a couple of subdomains (i.e., NEWS.URL.COM), and the posts that we put on this subdomain were very full of keyword anchor text links, sometimes 4-5 per post. We haven't posted any new content on this subdomain for 3 years. After getting hit with a manual "linking" penalty, we disavowed tons of links, but left the links on the subdomains alone. They really aren't providing any traffic, and the content is poor and not written by me. Do you think having these links is hindering our ranking efforts? I think I should just blow up this subdomain and get rid of it and all the keyword anchor links. Thoughts? Thanks, Ron
Technical SEO | yatesandcojewelers
-
How many times robots.txt gets visited by crawlers, especially Google?
Hi, do you know if there's any way to track how often the robots.txt file has been crawled? I know we can check when it was last downloaded in Webmaster Tools, but I actually want to know whether crawlers download it every time they visit any page on the site (e.g. hundreds of thousands of times every day), or less often. Thanks...
Technical SEO | linklater
-
Competitive Domain Analysis & Subdomain Metrics
I have a web site that shows with all zeroes in the Competitive Domain Analysis and subdomain Metrics screen. I don't think this is possible because I have a ton of links out there to this web site from a forum that I visit. Can someone help me understand how this might be. I am hoping it's not some dreaded www vs non www issue because I think I solved that issue for this site. The site is www.nationalcurrencyvalues.com
Technical SEO | Banknotes
-
Sub pages vs. subdomain PageRank flow
I have a question about PageRank flow.
If I have a site, domain.com, I have two possible solutions:
Solution #1: the root domain only links to domain.com/video and domain.com/blog.
Solution #2: the root domain only links to video.domain.com and blog.domain.com.
So is the PR of domain.com/blog equal to the PR of blog.domain.com, and the PR of domain.com/video equal to the PR of video.domain.com?
I also don't know why a subdomain of Blogspot or WordPress ranks more easily than a new domain, for example keyword.wordpress.com versus keyword.com. So what does WordPress pass to keyword.wordpress.com?
Technical SEO | tommytai
-
SeoMoz robot is not able to crawl my website.
Hi, the SEOmoz robot crawls only two pages of my website. I contacted the SEOmoz team and they told me that the problem is caused by the use of JavaScript. What is the solution to this? Should I contact my web design company and ask them to remove the JavaScript code?
Technical SEO | | ashish2110