Robots.txt blocking Addon Domains
-
I have this site as my primary domain: http://www.libertyresourcedirectory.com/
I don't want to give spiders access to the site at all, so I added a simple Disallow: / to the robots.txt. As a test, I tried to crawl the site with Screaming Frog afterwards and it didn't crawl anything. (Excellent.)
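For reference, a blanket block like this can also be checked offline. The following is a minimal sketch using Python's standard urllib.robotparser. The is_allowed helper, the "Googlebot" user-agent string, and the exact file contents are illustrative assumptions, not a copy of the site's real robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Assumed contents of the blanket rule: every crawler, every path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_body: str, user_agent: str, url: str) -> bool:
    """Return True if robots_body permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_body.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    base = "http://www.libertyresourcedirectory.com"
    for path in ("/", "/any/page.html"):
        # Both paths are expected to be reported as blocked.
        print(path, is_allowed(ROBOTS_TXT, "Googlebot", base + path))
```

An empty "Disallow:" line has the opposite meaning (nothing is blocked), which is the usual "allow everything" form.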
However, there's a problem. In Google Webmaster Tools (GWT), I got an alert that Google couldn't crawl ANY of my sites because of robots.txt issues. Changing the robots.txt on my primary domain changed it for ALL my addon domains (e.g. http://ethanglover.biz/). From a directory point of view this makes sense; from a spider point of view, it doesn't.
As a solution, I changed the robots.txt file back and added a robots meta tag (noindex, nofollow) to the primary domain. But this doesn't seem to be having any effect. As I understand it, the robots.txt takes priority.
How can I separate all this out to allow domains to have different rules? I've tried uploading a separate robots.txt to the addon domain folders, but it's completely ignored. Even going to ethanglover.biz/robots.txt gave me the primary domain version of the file. (SERIOUSLY! I've tested this 100 times in many ways.)
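One way to confirm this symptom from the outside is to download robots.txt from each hostname and see which ones return byte-identical files. This is a minimal sketch, assuming plain HTTP; fetch_robots and group_hosts_by_robots are hypothetical helper names, and the fetch step is injectable so the grouping logic can be tried without network access.

```python
import hashlib
from typing import Callable, Dict, Iterable, List
from urllib.request import urlopen


def fetch_robots(host: str) -> bytes:
    """Download http://<host>/robots.txt (needs network access)."""
    with urlopen(f"http://{host}/robots.txt", timeout=10) as response:
        return response.read()


def group_hosts_by_robots(
    hosts: Iterable[str],
    fetch: Callable[[str], bytes] = fetch_robots,
) -> Dict[str, List[str]]:
    """Group hosts by the SHA-256 digest of the robots.txt body they serve."""
    groups: Dict[str, List[str]] = {}
    for host in hosts:
        digest = hashlib.sha256(fetch(host)).hexdigest()
        groups.setdefault(digest, []).append(host)
    return groups
```

Calling group_hosts_by_robots(["www.libertyresourcedirectory.com", "ethanglover.biz"]) and getting a single group back would mean both hostnames serve the same file, which matches what is described above.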
Has anyone experienced this? Am I in the twilight zone? Any known fixes? Thanks.
Proof I'm not crazy in attached video.
-
Sort of resolved; this may be the wrong place to ask any further. The above is a working fix for what seems like a legit bug. I'll update if the WordPress forums say anything.
-
No, I don't like to waste memory and bandwidth. If you can do it yourself, you should probably do it yourself. I'm moving this question to WordPress.
-
Hi Ethan
One thing I have heard of people trying is a plugin that serves dynamic robots.txt files. I don't use add-on sites, so you will probably have to test the behavior. Here is an example of one of the plugins.
https://wordpress.org/plugins/wp-robots-txt/
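The general idea behind a dynamic robots.txt is to choose the rules from the hostname of the request instead of from a single file on disk. The sketch below shows that idea as a minimal Python WSGI app. It is not the linked plugin's code (how the plugin works has not been verified here), and RULES, DEFAULT_RULES, robots_for_host, and app are hypothetical names chosen for illustration.

```python
# Hypothetical per-host rules; the hostname comes from the question above.
RULES = {
    "www.libertyresourcedirectory.com": "User-agent: *\nDisallow: /\n",
}

# Assumed fallback for every other host: block nothing.
DEFAULT_RULES = "User-agent: *\nDisallow:\n"


def robots_for_host(host: str) -> str:
    """Pick the robots.txt body for a Host header value (port and case ignored)."""
    hostname = host.split(":", 1)[0].strip().lower()
    return RULES.get(hostname, DEFAULT_RULES)


def app(environ, start_response):
    """Minimal WSGI app that answers /robots.txt based on the requested host."""
    if environ.get("PATH_INFO") != "/robots.txt":
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found\n"]
    body = robots_for_host(environ.get("HTTP_HOST", "")).encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

To try it locally, the app can be served with the standard library, for example wsgiref.simple_server.make_server("", 8000, app).serve_forever(). On shared hosting the same per-host decision would have to be made by whatever actually answers the /robots.txt request.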
Hope this helps,
Anthony
-
Ethan
It sounds like the issue has been resolved. I'm not too familiar with domain add-ons but if you have any more trouble let us know and I'll be sure another Moz Associate takes a look.
-Dan (Moz Associate)
-
-
Hi Ethan
Sorry, I wasn't clear. I was thinking you could drop the use of the robots.txt altogether and just use the Meta Tag approach, since it seems that the robots.txt is having a global impact on your sites. Search engines will still crawl the pages, but it should exclude them from the index.
Hope this helps,
Anthony
-
Anthony, based on your response it's obvious you haven't read the question or follow-up.
-
Hi Ethan
One approach may be to try using the Robots Meta Tag. You can use noindex to tell Google not to index. This won't prevent crawling, but Google should respect the request to not index your site. I have included a good guide below to get you started.
https://developers.google.com/webmasters/control-crawl-index/docs/robots_meta_tag
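To confirm what a page is actually sending, the tag can be read back from the HTML. This is a minimal sketch using Python's standard html.parser. RobotsMetaParser and robots_directives are hypothetical names, and only the generic meta name="robots" tag is handled, not crawler-specific names or the X-Robots-Tag HTTP header.

```python
from html.parser import HTMLParser
from typing import Set


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: Set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so normalise them.
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").strip().lower() != "robots":
            return
        for directive in attributes.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html: str) -> Set[str]:
    """Return the set of robots meta directives found in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives
```

Note that a crawler has to be able to fetch the page to see this tag at all, so a page that is also disallowed in robots.txt will not have its noindex read.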
Hope this helps,
Anthony B
Biondo Creative
biondocreative.com
-
I've found a quick fix for now: http://ethanglover.biz/using-robots-txt-with-addon-domains/
This is still an issue, and it may be exclusive to WordPress.