Clarification on indexation of XML sitemaps within Webmaster Tools
-
Hi Mozzers,
I have a large service-based website which seems to be losing pages from Google's index. Whilst working on the site, I noticed that there are a number of XML sitemaps for each of the services, so I submitted them to Webmaster Tools last Friday (14th). When I left, they were still "pending".
On returning to the office today, they all appear to have been successfully processed on either the 15th or 17th and I can see the following data:
13/08 - Submitted=0 Indexed=0
14/08 - Submitted=606,733 Indexed=122,243
15/08 - Submitted=606,733 Indexed=494,651
16/08 - Submitted=606,733 Indexed=517,527
17/08 - Submitted=606,733 Indexed=517,498
Question 1: Is the indexed figure of 122,243 on the 14th the number of pages that were already indexed before Google processed the sitemaps? I ask because they were not marked as processed until the 15th and 17th.
Question 2: The indexed pages are already slipping. I'm working on fixing the site by reducing the number of pages and improving the internal structure and content, which I'm hoping will fix the crawling issue. But how often will Google crawl these XML sitemaps?
Thanks in advance for any help.
-
Hi again
Because you have multiple sitemaps, Google may crawl them at different times and at different rates, which is why some of your sitemaps took a day longer. I really wouldn't read too much into it; just be assured that Google is crawling your sitemaps and indexing your pages fine.
If you notice major discrepancies between what you submitted and what's being indexed, then I would refer to this Google resource on how to fix issues or errors you find in your sitemap crawl.
Hope this helps! Good luck!
-
Hi there
You submitted on the 13th, when there were 0 pages indexed. The next day there were 122,243, so in that time period Google indexed 122,243 of your site's pages.
This is a day-by-day process. For each day, subtract the previous day's number from that day's number, and the result is how many pages were freshly indexed.
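To make the subtraction concrete, here is a minimal Python sketch that applies it to the figures quoted in the question. The names `indexed_by_day` and `daily_changes` are made up for illustration, and the result only shows the day-over-day change in the reported "Indexed" number; it does not by itself tell you whether those pages were in the index before the sitemaps were processed.

```python
# "Indexed" figures copied from the sitemap report quoted in the question.
indexed_by_day = [
    ("13/08", 0),
    ("14/08", 122_243),
    ("15/08", 494_651),
    ("16/08", 517_527),
    ("17/08", 517_498),
]


def daily_changes(series):
    """Return (day, change) pairs: each day's count minus the previous day's."""
    return [
        (day, count - prev_count)
        for (_, prev_count), (day, count) in zip(series, series[1:])
    ]


if __name__ == "__main__":
    for day, change in daily_changes(indexed_by_day):
        # The "+" format flag shows the sign, so a drop is easy to spot.
        print(f"{day}: {change:+,}")
```

A negative change, such as the one between the 16th and 17th, means the reported indexed count fell rather than grew.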
Hope this helps! Good luck!
-
Just checked Webmaster Tools again, and now the sitemaps all say processed on the 17th, and some say the 18th (today). Does this mean the sitemaps are being processed by Google every couple of days?
-
Hi Patrick,
Thanks for elaborating on question 2.
Question 1: I asked if the number (122,243) was how many pages were in the index **before** Google processed the sitemaps, as they don't appear to have been processed until the following day.
You answered yes, but then said it's how many pages were processed that day?
Thanks again for your time and clarification.
-
Hi there
Question 1 - Yes, this is how many pages Google indexed from your sitemap on that day.
Question 2 - XML sitemaps allow you to tell Google the change frequency of your URLs - you can learn more about that here. Also, according to Google:
Google's spiders regularly crawl the web to rebuild our index. Crawls are based on many factors such as PageRank, links to a page, and crawling constraints such as the number of parameters in a URL. Any number of factors can affect the crawl frequency of individual sites.
Our crawl process is algorithmic; computer programs determine which sites to crawl, how often, and how many pages to fetch from each site. We don't accept payment to crawl a site more frequently. For tips on maintaining a crawler-friendly website, please visit our Webmaster Guidelines.
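As a rough illustration of the change-frequency setting mentioned above, here is a minimal Python sketch that writes a sitemap entry with a `<changefreq>` element, as defined by the sitemaps.org protocol (allowed values are always, hourly, daily, weekly, monthly, yearly and never). The function name `build_sitemap` and the example.com URL are hypothetical, and the protocol describes this value as a hint to crawlers, not a command.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build a <urlset> from (url, changefreq) pairs and return it as a string."""
    # Register the namespace as the default so tags are written without a prefix.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc, changefreq in entries:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        ET.SubElement(url, f"{{{SITEMAP_NS}}}changefreq").text = changefreq
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical URL, used only for illustration.
    print(build_sitemap([("https://www.example.com/services/", "weekly")]))
```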
Please let me know if you have any further questions or comments.
Hope this helps! Good luck!