Help with Robots.txt On a Shared Root
-
Hi,
I posted a similar question last week asking about subdomains, but a couple of complications have arisen.
Two different websites I am looking after share the same root domain, which means they will have to share the same robots.txt. Does anybody have suggestions for separating the two within the same file without complications? It's a tricky one.
Thank you in advance.
-
Okay, so if you have one root domain you can only have one robots.txt file.
The reason I asked for an example is in case there was something you could put in the robots.txt to differentiate the two.
For example, say you have
thisdomain.com and thatdomain.com
If "thatdomain.com" uses a folder called shop ("thatdomain.com/shop"), then you could prefix all your robots.txt entries with /shop, provided that "thisdomain.com" doesn't use a folder called shop. Then all the /shop entries would only be applicable to "thatdomain.com". Does this make sense?
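As a rough sketch of the /shop idea, the snippet below feeds a made-up shared robots.txt to Python's standard urllib.robotparser and checks a few URLs. The rules and the paths (/shop/checkout/, /shop/cart/, /about) are hypothetical and only there for illustration; the point is that robots.txt rules match on the URL path, not the hostname, so a /shop prefix only affects the site that actually has that folder.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical shared robots.txt: every rule is prefixed with /shop,
# a folder that only thatdomain.com is assumed to use.
SHARED_ROBOTS_TXT = """\
User-agent: *
Disallow: /shop/checkout/
Disallow: /shop/cart/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(SHARED_ROBOTS_TXT)

# Rules are matched against the path only, so the host name plays no part.
shop_checkout_ok = parser.can_fetch("*", "http://thatdomain.com/shop/checkout/step1")
shop_products_ok = parser.can_fetch("*", "http://thatdomain.com/shop/products")
other_site_ok = parser.can_fetch("*", "http://thisdomain.com/about")

print(shop_checkout_ok, shop_products_ok, other_site_ok)
```

If thisdomain.com ever gains its own /shop folder, the same rules would start applying to it as well, which is the caveat in the suggestion above.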
Don
-
It's not so much that one is a subdomain, it's that they are as different as Google and Yahoo yet they share the same root. I wish I could show you but I can't because of confidentiality.
The 303 wasn't put in place by me; I would have strongly suggested another method. I think it was set up so that both websites could be controlled from the same login, but it's opened a can of worms for SEO.
I don't want the two separate robots files; the developer insists it has to be that way.
-
Can you give me an example of how the domains look, specifically where the root pages are?
Additionally, if you are 303-redirecting one of the domains to the other, why do you want two different robots.txt files? The one with the 303 will always redirect to the other.
Depending on the structures, you can create one robots.txt file that deals with two different domains, provided there is something unique about the root folders.
-
Thanks for your help so far.
The two websites have different domain names but share the same root, as it's been built this way on Typo3. I don't know the developer's justification for the 303; it's something I wish we could change.
I'm not sure if there are specific tags you can put in the sole robots.txt to differentiate the two. I have read a few conflicting arguments about how to do it.
-
Okay, so if you're using a 303, then you're saying the content you want for site X is actually located at site Y, which means you do not have two different subdomains. So there is no need for two robots.txt files, and your developer is correct that you can't use two. Since one site points to the other, you only have one subdomain.
However, a 303 is in general a poor choice of redirect and likely should be a 301, but I would have to understand why the 303 is being used to say that with 100% certainty. See a quick article about 303 here.
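For reference, here is a small sketch that prints the standard names of the two status codes using Python's built-in http.HTTPStatus. The helper describe_redirect is a hypothetical name used only for illustration. A 301 tells clients the resource has moved permanently, while a 303 just says "see this other resource", which is why the 301 is usually the better fit when one domain permanently stands in for another.

```python
from http import HTTPStatus


def describe_redirect(code: int) -> str:
    """Return the numeric code together with its standard reason phrase."""
    status = HTTPStatus(code)
    return f"{status.value} {status.phrase}"


permanent = describe_redirect(301)
see_other = describe_redirect(303)

print(permanent)
print(see_other)
```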
Hope this answers the question,
Don
-
It's Fasthosts. The developer is certain that we can't use the two separate robots files. The second website has been set up on a 303.
-
What host are you using?
-
The developer of the website insists that they have to share the same robots.txt. I am really not sure how he's set it up this way. I am beyond befuddled with this!
-
The subdomain has to be separated from the root in some fashion. I would assume, depending on your host, that there is a separate folder for the subdomain files. Otherwise it would be chaos. Say you installed forums on your forum subdomain and an e-commerce site on your shop subdomain: which index.php page would be served?
There has to be some separation, so review your file manager and look for the subdomain folders. Once you have found them, you simply put a robots.txt into each of those folders.
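To make the "one robots.txt per subdomain" point concrete, the sketch below works out where a crawler would look for robots.txt for a given page: always at /robots.txt on that page's own host. The function name robots_txt_url and the example.com hosts are hypothetical and used only for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Build the robots.txt location for the host that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, drop the path, query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


forum_robots = robots_txt_url("http://forum.example.com/index.php")
shop_robots = robots_txt_url("http://shop.example.com/index.php?item=1")

print(forum_robots)
print(shop_robots)
```

Each subdomain therefore gets its own file, as long as the host maps each subdomain to its own folder.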
Hope this helps,
Don