Adding multi-language sitemaps to robots.txt
-
I am working on a revamped multi-language site that has moved to Magento. Each language runs off the same core codebase, so there are no per-language sub-directories.
The developer has created sitemaps and uploaded them to their respective Google Webmaster Tools (GWT) accounts. They have placed the sitemaps in new directories such as:
- /sitemap/uk/sitemap.xml
- /sitemap/de/sitemap.xml
I want to add the sitemaps to the robots.txt but can't figure out how to do it. Also, should they instead have placed the sitemaps in a single location, with the filename identifying each language?
- /sitemap/uk-sitemap.xml
- /sitemap/de-sitemap.xml
What is the cleanest way of handling these sitemaps, and can/should I list them in robots.txt?
-
Adding the following lines to the bottom of your robots.txt should do it:
Sitemap: http://www.example.com/sitemap/uk/sitemap.xml
Sitemap: http://www.example.com/sitemap/de/sitemap.xml
Renaming the files wouldn't hurt, but I don't think you would have any problems with how they are currently set up. If you have submitted them to Webmaster Tools and they are being picked up OK, you are fine.
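If you ever want to consolidate, the sitemap protocol also supports a single index file that points at each language sitemap; robots.txt would then need only one Sitemap line referencing the index. A sketch using the paths above (the index filename is illustrative):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://www.example.com/sitemap/uk/sitemap.xml</loc>
  </sitemap>
  <sitemap>
    <loc>http://www.example.com/sitemap/de/sitemap.xml</loc>
  </sitemap>
</sitemapindex>
```

With this in place, robots.txt would carry a single line such as `Sitemap: http://www.example.com/sitemap_index.xml` instead of one per language.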
Related Questions
-
Role of Robots.txt and Search Console parameters settings
Hi, wondering if anyone can point me to resources or explain the difference between these two. If a site has URL parameters disallowed in robots.txt, is it redundant to edit the Search Console parameter settings to anything other than "Let Googlebot Decide"?
Technical SEO | LivDetrick
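On the question above: a Disallow rule in robots.txt stops Googlebot from crawling matching URLs at all, while the Search Console parameter settings only hint at how to treat parameters on pages it does crawl, so the robots.txt rule effectively makes the parameter setting moot for those URLs. A minimal sketch of a parameter block (the parameter name is illustrative):

```
User-agent: *
Disallow: /*?sessionid=
```

Note the `*` wildcard, which Googlebot honours; the choice between the two mechanisms matters because blocked URLs can still appear in results as bare links, whereas crawled-and-consolidated parameter URLs generally do not.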
Use hreflang for language and regional URLs
I have implemented hreflang on site, seen here http://www.cobaltrecruitment.com/ but Webmaster Tools is returning loads of errors in the international targeting area: "'en-sg' - unknown language code" and "'en-ar' - unknown language code". Can anyone suggest what I need to tell my developers to do? Thanks for your help!
Technical SEO | the-gate-films
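For the hreflang errors above: "unknown language code" usually means the value as parsed is malformed — smart quotes pasted in from a document are a common culprit, since the stray quote character becomes part of the code. The value must be an ISO 639-1 language code, optionally followed by an ISO 3166-1 region code, in plain straight quotes. A sketch of well-formed annotations (the URLs are illustrative, not the site's real structure):

```html
<link rel="alternate" hreflang="en-sg" href="http://www.cobaltrecruitment.com/sg/" />
<link rel="alternate" hreflang="en-gb" href="http://www.cobaltrecruitment.com/" />
```

Worth asking the developers to diff the live source against this pattern character by character, since a curly `“` is easy to miss by eye.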
Multi language and multi target eCommerce site in EU
Hi, before I raised this question I read some other topics which were relevant to my question; however, every topic had some slight difference between the various scenarios. I run a webshop with a .eu domain (notebookscreen.eu) which currently runs in 3 languages. We use geoIP to determine the user's location for the following reasons: language, currency, and shipping fee. The site runs very slowly during tests, and most of the testers (including Moz) fail it since it has too many 302 redirects. We are rebuilding this part to fix the redirects and need some advice on what is best to optimise for multiple countries. As said in the title, this is a shop mainly targeting EU countries, and next to the .eu domain we have 10 other country-level TLD registrations. Currently we use the subfolder style. Would it be better to do a subdomain per country, or a separate TLD for every country? The current option is the best for backlinks; I don't think the second has any gains. Having dedicated TLDs can help local SERPs for every country, however we would need a lot of back-linking. Also, if someone starts on the .eu page, a 3xx redirect is needed for the designated country. Different sites do it differently: some don't care (Apple), some stay on one page and give you local currency and shipping rates (eBay), some move you to a different TLD (Amazon). Is there any better way to determine someone's location other than GeoIP?
Technical SEO | kukacwap
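One widely used pattern for the setup above: keep the .eu subfolders, serve every visitor the URL they actually requested (avoiding the geoIP 302s that the testers flag), and let hreflang plus an x-default annotation tell search engines which version fits which audience. A sketch, with illustrative subfolder URLs standing in for the real structure:

```html
<link rel="alternate" hreflang="de" href="http://notebookscreen.eu/de/" />
<link rel="alternate" hreflang="en" href="http://notebookscreen.eu/en/" />
<link rel="alternate" hreflang="x-default" href="http://notebookscreen.eu/" />
```

The x-default target is what search engines show to users who match none of the listed languages; currency and shipping can still adapt by geoIP without changing the URL.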
No descripton on Google/Yahoo/Bing, updated robots.txt - what is the turnaround time or next step for visible results?
Hello, New to the MOZ community and thrilled to be learning alongside all of you! One of our clients' sites is currently showing a 'blocked' meta description due to an old robots.txt file (eg: A description for this result is not available because of this site's robots.txt) We have updated the site's robots.txt to allow all bots. The meta tag has also been updated in WordPress (via the SEO Yoast plugin) See image here of Google listing and site URL: http://imgur.com/46wajJw I have also ensured that the most recent robots.txt has been submitted via Google Webmaster Tools. When can we expect these results to update? Is there a step I may have overlooked? Thank you,
Adam
Technical SEO | adamhdrb
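For reference, the fix described in the question above hinges on an allow-all robots.txt; the minimal form is an empty Disallow:

```
User-agent: *
Disallow:
```

As for turnaround, the snippet only updates after Googlebot recrawls the page and sees that it is no longer blocked, so the wait depends on the site's crawl frequency rather than on any setting.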
Site blocked by robots.txt and 301 redirected still in SERPs
I have a vanity URL domain that 301 redirects to my main site. That domain also has a robots.txt that disallows the entire site. However, for a branded enough search, that vanity domain still shows up in SERPs with the new Google message: "A description for this result is not available because of this site's robots.txt". I get why the message is there - that's not my issue. My question is: shouldn't a 301 redirect trump this domain showing in SERPs, ever? The client isn't happy about it showing at all. How can I get the vanity domain out of the SERPs? THANKS in advance!
Technical SEO | VMLYRDiscoverability
Sitemap Generator Tool
We have developed a very large domain with well over 500 pages that need to be indexed. The tool we usually use to create a sitemap has a limit of 500 pages. Does anyone know of a good tool we can use to create text and XML sitemaps that doesn't have a page limit? Thanks!
Technical SEO | TracSoft
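For the question above, the protocol itself allows up to 50,000 URLs per sitemap file, so the 500-page ceiling is purely the tool's. Rolling your own is a few lines of scripting; here is a minimal sketch (the URL list and output filenames are illustrative) that splits a list across numbered sitemap files and writes an index that references them:

```python
from xml.sax.saxutils import escape

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50000  # hard limit in the sitemaps.org protocol

def build_sitemap(urls):
    """Render one <urlset> sitemap file for a list of page URLs."""
    entries = "\n".join(f"  <url><loc>{escape(u)}</loc></url>" for u in urls)
    return ('<?xml version="1.0" encoding="UTF-8"?>\n'
            f'<urlset xmlns="{SITEMAP_NS}">\n{entries}\n</urlset>\n')

def build_sitemaps(urls, base="http://www.example.com/sitemap",
                   max_urls=MAX_URLS_PER_FILE):
    """Split URLs across numbered sitemap files plus an index listing them."""
    chunks = [urls[i:i + max_urls] for i in range(0, len(urls), max_urls)]
    files = {f"{base}-{n}.xml": build_sitemap(chunk)
             for n, chunk in enumerate(chunks, start=1)}
    index_entries = "\n".join(f"  <sitemap><loc>{escape(name)}</loc></sitemap>"
                              for name in files)
    index = ('<?xml version="1.0" encoding="UTF-8"?>\n'
             f'<sitemapindex xmlns="{SITEMAP_NS}">\n'
             f'{index_entries}\n</sitemapindex>\n')
    return files, index

# 500+ URLs fit comfortably in a single file under the 50,000 limit
pages = [f"http://www.example.com/page-{i}" for i in range(600)]
files, index = build_sitemaps(pages)
print(len(files))  # 1
```

You would still need to feed it the crawled URL list, but it sidesteps any per-tool page limit entirely.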
Timely use of robots.txt and meta noindex
Hi, I have been checking every possible resource for content removal, but I am still unsure how to remove already-indexed content. When I use robots.txt alone, the URLs remain in the index, though no crawl budget is wasted on them; still, e.g. having 100,000+ completely identical login pages within the omitted results might not mean anything good. When I use meta noindex alone, I keep my index clean, but also keep Googlebot busy crawling these no-value pages. When I use robots.txt and meta noindex together for existing content, I suggest to Google that it ignore my content, but at the same time I restrict it from ever crawling the noindex tag. Robots.txt plus URL removal together is still not a good solution, as I have failed to remove directories this way; it seems that only exact URLs can be removed like that. I need a clear solution which solves both issues (index and crawling). What I am trying now is the following: I remove these directories (one at a time, to test the theory) from the robots.txt file, and at the same time I add the meta noindex tag to all the pages within the directory. The indexed pages should start decreasing (while useless page crawling increases), and once the number of indexed pages is low or zero, I would put the directory back into robots.txt and keep the noindex on all of the pages within the directory. Can this work the way I imagine, or do you have a better way of doing so? Thank you in advance for all your help.
Technical SEO | Dilbak
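The approach in the question above rests on one detail worth making explicit: Googlebot can only obey a noindex tag on a page it is allowed to crawl, so the directory must stay out of robots.txt until the pages have dropped from the index. The tag itself goes in each page's head:

```html
<meta name="robots" content="noindex" />
```

Once the pages are gone from the index, re-blocking the directory in robots.txt reclaims the crawl budget, which matches the staged plan described above.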
Impact of Adding a Mobile Site
Hi, we ranked very well for the keywords "trophies" and "trophies and awards" on our home page, trophycentral.com, for quite a while (many years). Recently we dropped off the charts, but are not sure why. We posted this issue last week, got some great suggestions, and are in the process of addressing them. However, we are now wondering if we caused this issue when we launched our mobile site a few months ago (the timing makes sense). Has anyone had trouble with a mobile site impacting their traditional site? I am wondering if maybe Google is splitting the traffic between the trophycentral.com domain and the m.trophycentral.com domain. Here is the code we have: <script type="text/javascript" src="http://lib.store.yahoo.net/lib/sportsawards/mobile-redirection.js"></script> Appreciate your comments!
Technical SEO | trophycentraltrophiesandawards
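On the mobile question above: when a separate m. domain serves the same content, Google's guidance for separate mobile URLs is bidirectional annotations, so the two versions are consolidated rather than competing as duplicates. A sketch (the media query value is illustrative):

```html
<!-- On the desktop page (www.trophycentral.com) -->
<link rel="alternate" media="only screen and (max-width: 640px)"
      href="http://m.trophycentral.com/" />
<!-- On the mobile page (m.trophycentral.com) -->
<link rel="canonical" href="http://www.trophycentral.com/" />
```

Without the annotations, the m. pages can indeed be treated as independent duplicates of the desktop site, which is consistent with the ranking drop described.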