Sitemaps and Indexed Pages
-
Hi guys,
I created an XML sitemap and submitted it for my client last month.
Now the developer of the site has also been messing around with a few things.
I've noticed on my Moz site crawl that indexed pages have dropped significantly.
Before I put my foot in it, I need to figure out whether submitting the sitemap has caused this. Can a sitemap reduce the number of pages indexed?
Thanks
David.
-
Sorry - I missed the part about you looking specifically at the Moz crawler. While useful, it's a stand-in for what actually drives rankings: the crawls by the search engine crawlers themselves. If you're concerned there's an issue, I'd go straight to the source for that info rather than trusting Mozbot alone. You can find the search engines' crawl data in Google Search Console and Bing Webmaster Tools. Look for trends and patterns there, especially in the sitemap reports.
The challenge with a Screaming Frog-generated sitemap is that it can only include what's linked. If the site has orphaned pages or an ineffective internal linking scheme, a crawl could easily miss pages. It's certainly better than no sitemap, but a sitemap generated by the site's own technology (usually from the database) is safer.
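If you want a rough check on whether the crawl-built sitemap missed anything, something like the sketch below might help. It's only an illustration: it assumes you can export a list of live URLs from the site's database or CMS, and the function names and example.com URLs are made up for the example. It parses the sitemap XML and reports any known URL that the sitemap doesn't list.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the set of <loc> URLs found in a sitemap XML string."""
    root = ET.fromstring(xml_text)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def missing_from_sitemap(xml_text, known_urls):
    """Return known URLs (e.g. from a database export) absent from the sitemap."""
    return sorted(set(known_urls) - sitemap_urls(xml_text))


# Hypothetical data: a crawl-generated sitemap and a database export.
crawled_sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/products</loc></url>
</urlset>"""

database_urls = [
    "https://example.com/",
    "https://example.com/products",
    "https://example.com/old-landing-page",  # orphaned: nothing links to it
]

missing = missing_from_sitemap(crawled_sitemap, database_urls)
print(missing)
```

Anything it reports is a candidate orphaned page: either link to it internally and add it to the sitemap, or retire it deliberately.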
P.
-
Thanks Paul,
Yes, there has been a big cleanup of pages. There were over 80,000 to begin with. I managed to get that down to about 14k, but then last month Mozbot only crawled about 4,000 pages.
I was just a bit worried that the sitemap generated by Screaming Frog was incorrect and that this was the reason for the drop.
I was referring mainly to the Moz site crawl. I guess I was worried that Mozbot only followed the sitemap!
There were loads of filter URLs and all sorts going on, so it's a bit of a spider's web!
-
No - submitting a sitemap won't reduce the crawl of a site. The search engines will crawl the sitemap and add these pages to the index if they consider them worthy. But they'll still also crawl any other links/pages they can find in other ways and index those as well if they consider them worthy.
Note though - having the number of indexed pages drop is not necessarily a bad thing. If removing a large number of worthless/duplicate/canonicalised/no-indexed pages cleans up the site, that will also be reflected in fewer crawled pages - an indication that quality improvement work was effective.
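One way to sanity-check a drop like this is to compare the URL lists from the crawl before and after the cleanup. The sketch below is just that, a sketch: it assumes you can export both URL lists from your crawl tool, and it assumes filter URLs can be recognised by a query string, which may not hold on every site. The function name and example.com URLs are hypothetical. It splits the dropped URLs into those you'd expect to disappear and those worth a closer look.

```python
from urllib.parse import urlsplit


def split_dropped(before, after):
    """Split URLs that vanished between two crawls into two sorted lists.

    expected:    dropped URLs carrying a query string (assumed filter URLs)
    investigate: dropped URLs without one, which may be real pages
    """
    dropped = set(before) - set(after)
    expected = sorted(u for u in dropped if urlsplit(u).query)
    investigate = sorted(u for u in dropped if not urlsplit(u).query)
    return expected, investigate


# Hypothetical crawl exports from before and after the cleanup.
before = [
    "https://example.com/shoes",
    "https://example.com/shoes?colour=red",
    "https://example.com/shoes?size=9",
    "https://example.com/about",
]
after = ["https://example.com/shoes"]

expected, investigate = split_dropped(before, after)
print(expected)
print(investigate)
```

If nearly everything lands in the expected list, the drop is most likely the cleanup doing its job. A long investigate list points to pages that have become unreachable and need a look.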
That help?
Paul