Clarification on indexation of XML sitemaps within Webmaster Tools
-
Hi Mozzers,
I have a large service based website, which seems to be losing pages within Google's index. Whilst working on the site, I noticed that there are a number of xml sitemaps for each of the services. So I submitted them to webmaster tools last Friday (14th) and when I left they were "pending".
On returning to the office today, they all appear to have been successfully processed on either the 15th or 17th and I can see the following data:
13/08 - Submitted=0 Indexed=0
14/08 - Submitted=606,733 Indexed=122,243
15/08 - Submitted=606,733 Indexed=494,651
16/08 - Submitted=606,733 Indexed=517,527
17/08 - Submitted=606,733 Indexed=517,498Question 1: The indexed pages on 14th of 122,243 - Is this how many pages were previously indexed? Before Google processed the sitemaps? As they were not marked processed until 15th and 17th?
Question 2: The indexed pages are already slipping, I'm working on fixing the site by reducing pages and improving internal structure and content, which I'm hoping will fix the crawling issue. But how often will Google crawl these XML sitemaps?
Thanks in advance for any help.
-
Hi again
This means that because you have multiple sitemaps, Google is going to crawl those at different times possibly and at different rates, hence some of your sitemaps taking a day longer. I really wouldn't look into it too much, and just be assured that Google is crawling your sitemaps fine and indexing.
If you notice major discrepancies in what you submitted and what's being indexed, then I would refer to this Google resource on how to fix issues or errors you find in your sitemap crawl.
Hope this helps! Good luck!
-
Hi there
You submitted on the 13th, there were 0 pages indexed. The next day there were 122,243, so in that time period, Google indexed 122,243 of your site's pages.
This is a day by day process. So whatever new number appears on each day, subtract the previous day's number from your present day number, and that's how many pages were freshly indexed.
Hope this helps! Good luck!
-
Just checked webmaster tools again, and now they (sitemaps) all say processed on 17th and some say 18th (today) does this mean the sitemaps are being processed by Google every couple of days?
-
Hi Patrick,
Thanks for elaborating on question 2.
Question 1, I asked if the number (122,243) was how many pages were in the index** before** google processed the sitemaps, as they don't appear to have been processed until the following day.
You answered, yes but then said its how many pages were processed that day?
Thanks again for your time and clarification.
-
Hi there
Question 1 - Yes, this is how many pages Google indexed from your sitemap on that day.
Question 2 - XML sitemaps allow you to tell Google the change frequency of your URLs - you can learn more about that here. Also, according to Google:
Google's spiders regularly crawl the web to rebuild our index. Crawls are based on many factors such as PageRank, links to a page, and crawling constraints such as the number of parameters in a URL. Any number of factors can affect the crawl frequency of individual sites.
Our crawl process is algorithmic; computer programs determine which sites to crawl, how often, and how many pages to fetch from each site. We don't accept payment to crawl a site more frequently. For tips on maintaining a crawler-friendly website, please visit our Webmaster Guidelines.
Please let me know if you have any further questions or comments.
Hope this helps! Good luck!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
URL Indexed But Not Submitted to Sitemap
Hi guys, In Google's webmaster tool it says that the URL has been indexed but not submitted to the sitemap. Is it necessary that the URL be submitted to the sitemap if it has already been indexed? Appreciate your help with this. Mark
Technical SEO | | marktheshark100 -
Hreflang in country specific XML Sitemaps?
Hello! I'm rolling out hreflang tags in my client's "main" XML Sitemap. My question is: do we need to implement these tags in the country level XML Sitemaps also? Thanks!
Technical SEO | | SimpleSearch1 -
Webmaster Tools - Clarification of what the top directory is in a calender url
Hi all, I had an issue where it turned out a calender was used on my site historically (a couple of years ago) but the pages were still present, crawled and indexed by google to this day. I want to remove them now from the index as it really clouds my analysis and as I have been trying to clean things up e.g. by turning modules off, webmaster tools is throwing up more and more errors due to these pages. Below is an example of the url of one of the pages: http://www.example.co.uk/index.php?mact=Calendar,m1a033,default,1&m1a033year=2084&m1a033month=3&m1a033returnid=59&page=59?phpMyAdmin=xxyyzz The closest question I have found on the topic in Seomoz is: http://www.seomoz.org/q/duplicate-content-issue-6 I want to remove all these pages from the index by targeting their top level folder. From the historic question above would I be right in saying that it is: http://www.example.co.uk/index.php?mact=Calendar I want to be certain before I do a directory level removal request in case it actually targets index.php instead and deindexes my whole site (or homepage at the very least). Thanks
Technical SEO | | Mitty0 -
AJAX and Bing Indexation
Hello. I've been going back and forth with Bing technical support regarding a crawling issue on our website (which I have to say is pretty helpful - you do get a personal, thoughtful response pretty quickly from Bing). Currently our website is set with a java redirect to send users/crawlers to an AJAX version of our website. For example, they come into - mysite.com/category..and get redirected to mysite.com/category#!category. This is to provide an AJAX search overlay which improves UEx. We are finding that Bing gets 'hung up' on these AJAX pages, despite AJAX protocol being in place. They say that if the AJAX redirect is removed, they would index and crawl the non-AJAX url correctly - at which point our indexation would (theoretically) improve. I'm wondering if it's possible (or advisable) to direct the robots to crawl the non-AJAX version, while users get the AJAX version. I'm assuming that it's the classic - the bots want to see exactly what the users see - but I wanted to post here for some feedback. The reality of the situation is the AJAX overlay is in place and our rankings in Bing have plummeted as a result.
Technical SEO | | Blenny0 -
Sitemap.xml showing up in Google Search
Hello when I do a Google search my sitemap.xml shows up for lots of queries. Does anyone have any advise on this? Should I remove url in Google Webmaster? Thanks,
Technical SEO | | Socialdude0 -
Webmaster Tools 404 Errors Pages Never Created
Recently, 196 404 errors appeared in my WMT account for pages that were never created on my site. Question: Any thoughts on how they got there (i.e. WMT bug, tactic by competitor)? Question: Thoughts on impact if any? Question: Thoughts on resolution?
Technical SEO | | Gyi0 -
Partial mobile sitemap
Hi, We have a main www website with a standard sitemap. We also have a m. site for mobile content (but m. is only for our top pages and doesn't include the entire site). If a mobile client accesses one of our www pages we redirect to the m. page. If we don't have a m. version we keep them on the www site. Currently we block robots from the mobile site. Since our m. site only contains the top pages, I'm trying to determine the boost we might get from creating a mobile sitemap. I don't want to create the "partial" mobile sitemap and somehow have it hurt our traffic. Here is my plan update m. pages to point rel canonical to appropriate www page (makes sure we don't dilute SEO across m. and www.) create mobile sitemap and allow all robots to access site. Our www pages already rank fairly highly so just want to verify if there are any concerns since m. is not a complete version of www?
Technical SEO | | NicB10 -
To host within my city or not?
I'm opening up a web development company in Vancouver and I'm stuck between getting hosting in Montreal or Vancouver. Montreal is a cheaper but, my company is in Vancouver. I have a .ca which will be my domain name, so I'm already on top of that aspect. From what I understand it would help for SEO purposes if my IP is a local Vancouver IP for my company website. So my question is... Should I go for the Vancouver hosting and pay more or stick with the Montreal hosting.
Technical SEO | | VebianWebandMobileDevelopment1