I have two sitemaps which partly duplicate - one is blocked by robots.txt but can't figure out why!
-
Hi, I've just found two sitemaps - one of them is .php and represents part of the site structure on the website. The second is a .txt file which lists every page on the website. The .txt file is blocked via robots exclusion protocol (which doesn't appear to be very logical as it's the only full sitemap). Any ideas why a developer might have done that?
-
There are standards for the sitemaps .txt and .xml sitemaps, where there are no standards for html varieties. Neither guarantees the listed pages will be crawled, though. HTML has some advantage of potentially passing pagerank, where .txt and .xml varieties don't.
These days, xml sitemaps may be more common than .txt sitemaps but both perform the same function.
-
yes, sitemap.txt is blocked for some strange reason. I know SEOs do this sometimes for various reasons, but in this case it just doesn't make sense - not to me, anyway.
-
Thanks for the useful feedback Chris - much appreciated - Is it good practice to use both - I guess it's a good idea if onsite version only includes top-level pages? PS. Just checking nature of block!
-
Luke,
The .php one would have been created as a navigation tool to help users find what they're looking for faster, as well as to provide html links to search engine spiders to help them reach all pages on the site. On small sites, such sitemaps often include all pages of the site, on large ones, it might just be high level pages. The .txt file is non html and exists to provide search engines with a full list of urls on the site for the sole purpose of helping search engines index all the site's pages.
The robots.txt file can also be used to specify the location of the sitemap.txt file such as
sitemap: http://www.example.com/sitemap_location.txt
Are you sure the sitemap is being blocked by the robots.txt file or is the robots.txt file just listing the location of the sitemap.txt?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Is it possible to rank in google mexico when you don't have a local site?
Hello, someone is asking me why we don't rank in google mexico search engine. I mentioned we don't have a google mexico site, but have a USA site, so we may rank, but not as well as if we had the mexico site. IS there anyway to improve rankings or tips? THanks! Laura Robinson
Intermediate & Advanced SEO | | lauramrobinson321 -
Trouble Indexing one of our sitemaps
Hi everyone thanks for your help. Any feedback is appreciated. We have three separate sitemaps: blog/sitemap.xml events.xml sitemap.xml Unfortunately we keep trying to get our events sitemap to pickup and it just isn't happening for us. Any input on what could be going on?
Intermediate & Advanced SEO | | TicketCity0 -
How to solve outbound broken links? Those don't exist now?
There are many, many broken links on the website. What normal strategy to use for that? http://www.txacspecialist.com/air-conditioning-equipment-service-austin/american-standard/ It's an AC site, so all the links to AC vendors who have changed their product pages, all of those links are broken So for instance, the carrier 20xl doesn't exist anymore. Now they sell the carrier 45abp. We link carrier 20xl and now the page and AC model is not exist. So what I can do to solve the broken link issue?
Intermediate & Advanced SEO | | bondhoward0 -
We sold our site's domain and have a new one. Where do we go from here?
We recently sold our established domain -- for a compelling price -- and now have the task of transitioning to our new domain. What steps would you recommend to lesson the anticipated decline from search engines in this scenario?
Intermediate & Advanced SEO | | accessintel0 -
Robots.txt vs noindex
I recently started working on a site that has thousands of member pages that are currently robots.txt'd out. Most pages of the site have 1 to 6 links to these member pages, accumulating into what I regard as something of link juice cul-d-sac. The pages themselves have little to no unique content or other relevant search play and for other reasons still want them kept out of search. Wouldn't it be better to "noindex, follow" these pages and remove the robots.txt block from this url type? At least that way Google could crawl these pages and pass the link juice on to still other pages vs flushing it into a black hole. BTW, the site is currently dealing with a hit from Panda 4.0 last month. Thanks! Best... Darcy
Intermediate & Advanced SEO | | 945010 -
.com Outranking my ccTLD's and cannot figure out why.
So I have a client that has a number of sites for a number of different countries with their specific ccTLD. They also have a .com in the US. The problem is that the UK site hardly ranks for anything while the .com ranks for a ton in the UK. I have setup GWT for the UK and the .com to be specific to their geographic locations. So I have the ccTLD and I have GWT showing where I want these sites to rank. Problem is it apparently is not working....Any clues as to what else I could do?
Intermediate & Advanced SEO | | DRSearchEngOpt0 -
What content should I block in wodpress with robots.txt?
I need to know if anyone has tips on creating a good robots.txt. I have read a lot of info, but I am just not clear on what I should allow and not allow on wordpress. For example there are pages and posts, then attachments, wp-admin, wp-content and so on. Does anyone have a good robots.txt guideline?
Intermediate & Advanced SEO | | ENSO0 -
Can't find my site on Bing, since ages
Hi Guys, Well, the problem seems normal but I guess it's not. I have tried many things, and nothing changed it, now I give it last try... ask so maybe you will help me. The problem is.. I can't find my site nowhere in Bing, I mean nowhere by not in first 20 pages for my keywords "beauty tips" and the site is: http://www.beauty-tips.net/. In my opinion it should be pretty high... maybe it's too high so I can't see it ;). I never had special problems with Bing, was easier to be there "somewhere" than in google, but with this one is totally opposite. Any ideas? Thanks for your time!
Intermediate & Advanced SEO | | Luke220