How do we handle sitemaps in robots.txt when multiple domains point to the same physical location?
-
We have www.mysite.net, www.mysite.se, www.mysite.fi, and so on. All of these domains point to the same physical location on our web server, and we swap the text returned to the client depending on which domain he/she requested.
My problem is this: how do I configure sitemaps in robots.txt when the same robots.txt is served for multiple domains? If I, for instance, put the lines
Sitemap: http://www.mysite.net/sitemapNet.xml
Sitemap: http://www.mysite.net/sitemapSe.xml
in robots.txt, would that result in some cross-submission error?
-
Thanks for your help, René!
-
yup
-
Yes, I mean GWT of course :).
A folder for each site would definitely make some things easier, but it would also mean more work every time we need to republish the site or change its configuration.
Did I understand that Google link correctly: if we have verified ownership in GWT for all the involved domains, cross-site submission in robots.txt is okay? I guess Google will think it's okay anyway.
-
Actually, Google has the answer, right here: http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=75712
I always try to do what Google recommends, even though something else might work just as well, just to be on the safe side.
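To make that concrete: per that answer, once every domain is verified in GWT, a single shared robots.txt could apparently list all the sitemaps side by side even though they sit on the .net host. This is only a sketch; sitemapFi.xml is my guess at the naming, while the other two filenames come from the question.

User-agent: *
Disallow:

Sitemap: http://www.mysite.net/sitemapNet.xml
Sitemap: http://www.mysite.net/sitemapSe.xml
Sitemap: http://www.mysite.net/sitemapFi.xml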
-
You can't submit a sitemap in GA, so I'm guessing you mean GWT.
Whether or not you also put it in robots.txt shouldn't be a problem, since in each sitemap the URLs would look something like this:
Sitemap 1: <url><loc>http://yoursite.com/somepage.html</loc></url>
Sitemap 2: <url><loc>http://yoursite.dk/somepage.html</loc></url>
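For context, each of those <url> entries lives inside a <urlset> element, so a complete minimal sitemap file looks roughly like this:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://yoursite.com/somepage.html</loc>
  </url>
</urlset>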
I see no need to filter which sitemap is shown to the crawler. If your .htaccess is set up to redirect traffic from each top-level domain (TLD, e.g. .dk or .com) to the correct pages, then the sitemaps shouldn't be a problem.
The best solution would be a web within a web: a folder for each site on the server, with the .htaccess redirecting each domain to the right folder. In each folder you keep a robots.txt and a sitemap for that specific site. That way all your problems will be gone in a jiffy; it will be just like managing 3 different sites, even though it isn't.
I am no ninja with .htaccess files, but I understand the technology behind them and know what you can do with them. For a full how-to guide, ask Google; that's what I always do when I need to goof around in .htaccess. A rough sketch of the idea is below. I hope it makes sense.
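Here is a minimal sketch of the single-folder variant of that idea: serve a different robots.txt per domain straight from the .htaccess. It assumes Apache with mod_rewrite enabled; robotsSe.txt and robotsFi.txt are hypothetical per-language files, and requests on www.mysite.net fall through to the physical robots.txt.

RewriteEngine On

# Serve the Swedish robots file when robots.txt is requested on the .se host
RewriteCond %{HTTP_HOST} ^www\.mysite\.se$ [NC]
RewriteRule ^robots\.txt$ robotsSe.txt [L]

# Same idea for the Finnish host
RewriteCond %{HTTP_HOST} ^www\.mysite\.fi$ [NC]
RewriteRule ^robots\.txt$ robotsFi.txt [L]

Each per-language robots file can then point at the matching sitemap with its own Sitemap: line.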
-
Thanks for your response, René!
The thing is, we already submit the sitemaps in Google Analytics, but the SEO company we hired wants us to put the sitemaps in robots.txt as well.
The .htaccess idea sounds good, as long as Google or anyone else doesn't think we are making a cross-site submission error (as described here: http://www.sitemaps.org/protocol.php#submit_robots).
-
I see no need to use robots.txt for that. Use Google's and Bing's webmaster tools: there you have each domain registered and can submit a sitemap for each one.
If you want to make sure your sitemaps are not crawled by a bot for the wrong language, I would set up the .htaccess to test for the entrance domain and redirect to the right file. A bot enters a site just like a browser does, so it has to obey the server; if the server tells it to go somewhere, it will.
robots.txt can't, by itself, do what you want; the server can. But in my opinion, using the Bing and Google webmaster tools should do the trick.
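For example, with a rewrite like the one sketched earlier, the file actually served as robots.txt on www.mysite.se might be as simple as this (the sitemap filename is borrowed from the question; treat it as a sketch, not the definitive setup):

User-agent: *
Disallow:

Sitemap: http://www.mysite.se/sitemapSe.xml

An empty Disallow: allows everything, and the Sitemap: line points each crawler at the sitemap for that domain only.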