Sitemaps and Indexed Pages
-
Hi guys,
I created an XML sitemap and submitted it for my client last month.
Now the developer of the site has also been messing around with a few things.
I've noticed on my Moz site crawl that indexed pages have dropped significantly.
Before I put my foot in it, I need to figure out whether submitting the sitemap caused this. Can a sitemap reduce the number of pages indexed?
Thanks
David.
-
Thanks Eli!
I guess I was wondering if the Moz bot only followed pages that were in the sitemap. The sitemap was generated by Screaming Frog, and I have trusted it to include all relevant pages!
I have put a more detailed description in the response below. Overall I need to investigate further, but I'm satisfied that the sitemap has not caused the drop!
-
Thanks Martijn!
I guess I was wondering if the Moz bot only followed pages that were in the sitemap. The sitemap was generated by Screaming Frog, and I have trusted it to include all relevant pages!
To elaborate:
There were about 80,000 pages, and I used canonical tags, noindex, and redirects to clean up a rather large mess of filter URLs and duplicate content.
That dropped the page count to about 14k. Then I submitted the sitemap last month, and now the crawl has only found 4k pages.
Further investigation is needed on my part, but I wanted to double check that this sudden drop was not because of the sitemap! Thanks for clarifying that!
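One way to start that investigation is to compare the URLs listed in the sitemap against the URLs the crawl actually found. The snippet below is only a rough sketch: the sample sitemap, the example.com URLs, and the function names `sitemap_urls` and `compare_sitemap_to_crawl` are made up for illustration, and it assumes a plain `<urlset>` sitemap rather than a sitemap index.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol for <urlset> documents.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml: str) -> set[str]:
    """Return every <loc> value found in a <urlset> sitemap."""
    root = ET.fromstring(sitemap_xml)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def compare_sitemap_to_crawl(sitemap_xml: str, crawled: list[str]) -> dict[str, list[str]]:
    """Show which URLs appear on only one side (hypothetical helper)."""
    listed = sitemap_urls(sitemap_xml)
    found = set(crawled)
    return {
        "in_sitemap_not_crawled": sorted(listed - found),
        "crawled_not_in_sitemap": sorted(found - listed),
    }


# Made-up sample data standing in for a real sitemap and a crawl export.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/products/widget</loc></url>
</urlset>"""

SAMPLE_CRAWL = ["https://example.com/", "https://example.com/about"]

if __name__ == "__main__":
    print(compare_sitemap_to_crawl(SAMPLE_SITEMAP, SAMPLE_CRAWL))
```

If the "in sitemap but not crawled" list is large, the drop is more likely down to something blocking the crawler (robots.txt, noindex, redirects) than to the sitemap itself.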
-
Hi David,
Messing up, changing, updating, or deleting a sitemap will not necessarily decrease the number of ranked or crawled pages. A sitemap is usually used as a signal to find new pages and to work out whether old ones have been deleted. I would find it unlikely that your sitemap had a significant impact on which pages dropped. You could even see the opposite: an increase in pages indexed, submitted, or crawled after you submit a sitemap.
Martijn.
-
Hey David!
Thanks for reaching out to us!
Unfortunately I am not an SEO consultant / Web Developer so I cannot offer specific advice, but I'm sure there are loads of members here who would love to help and have a lot more knowledge than I do! A few things I have picked up which may help are the following:
Try to determine when the drop started: did it happen when you submitted the XML sitemap, or when the developer changed certain things? This could help point to the reason for the drop in indexing. There are a variety of reasons why Google may choose not to index pages; some of the common things to check are:
-
Check your robots.txt to ensure those pages are still crawlable
-
Check for duplicate content. Were there any canonical changes?
-
One of the tools you could use to help keep track of ranking fluctuations is MozCast (http://mozcast.com/). Was there turbulence in the Google algorithm when the indexed pages dropped significantly?
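For the robots.txt check in the list above, a quick script can tell you whether a given set of URLs is still crawlable. This is a minimal sketch using Python's standard `urllib.robotparser`; the sample rules, the example.com URLs, and the helper name `blocked_urls` are made up, and using "rogerbot" as the crawler's user agent is an assumption you should confirm for whichever bot you care about.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt: str, urls: list[str], user_agent: str = "rogerbot") -> list[str]:
    """Return the URLs that the given robots.txt rules disallow for user_agent."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so nothing is fetched over the network.
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Made-up rules standing in for the site's real robots.txt.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /filter/
"""

SAMPLE_URLS = [
    "https://example.com/products/widget",
    "https://example.com/filter/colour-red",
]

if __name__ == "__main__":
    for url in blocked_urls(SAMPLE_ROBOTS, SAMPLE_URLS):
        print("blocked:", url)
```

Running the same check against the robots.txt from before and after the developer's changes would show whether a new Disallow rule lines up with the drop.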
If you want us to have a look at your specific campaign to investigate further, please pop an email over to help@moz.com.
Thanks!
Eli
-
Related Questions
-
Still not got any index update data.
Is anyone finding that they haven't got the results of the update yet? I have tried some competitors and they are not updated either.
API | | AHC_SEO0 -
First Mozscape index of the year is live
I'm happy to announce, the first index of the year is out. We did have a smaller count of subdomains, but correlations are generally up and coverage of what's in Google looks better, too. We're giving that one a high five! We've (hopefully) removed a lot of foreign and spam subdomains, which you might see reflected in your spam links section. (another woot!) Here are some details about this index release:
145,549,223,632 (145 billion) URLs
1,356,731,650 (1 billion) subdomains
200,255,095 (200 million) root domains
1,165,625,349,576 (1.1 Trillion) links
Followed vs nofollowed links: 3.17% of all links found were nofollowed; 63.49% of nofollowed links are internal, 36.51% are external
Rel canonical: 26.50% of all pages employ the rel=canonical tag
The average page has 89 links on it: 72 internal links on average, 17 external links on average
Thanks! PS - For any questions about DA/PA fluctuations (or non-fluctuations) check out this Q&A thread from Rand: https://moz.com/community/q/da-pa-fluctuations-how-to-interpret-apply-understand-these-ml-based-scores.
API | | jennita5 -
In lieu of the canceled Moz Index update
Hey Moz, Overall we love your product and use it daily to help us grow. Part of that has been relying on the Moz Index for DA and PA, as well as for places where we are doing positive linking through genuine partnerships and reviews of clients. We were really excited to see the results for this month, as we have been linked by partners from lots of high-reputation sites, and Google seems to agree, as our rankings are moving up weekly. The question from our marketing team is: since a significant part of Moz will not be available to us this month, will there be any compensation handed out to the paying community? PS: I am an engineer and I know you have probably lost a very large set of data which can't simply be re-crawled overnight, but Moz Pro is not a cheap product and we do expect it to work. Source: https://moz.com/products/api/updates Kind Regards.
API | | SundownerRV0 -
August 3rd Mozscape Index Update (our largest index, but nearly a month late)
Update 5:27pm 8/4 - the data in Open Site Explorer is up-to-date, as is the API and Mozbar. Moz Analytics campaigns are currently loading in the new data, and all campaigns should be fully up-to-date by 4-10pm tomorrow (8/5). However, your campaign may have the new data much earlier as it depends on where that campaign falls in the update ordering. Hey gang, I wanted to provide some transparency into the latest index update, as well as give some information about our plans going forward with future indices. The Good News: This index, now that it's delivered, is pretty impressive. Mozscape's August index is 407 Billion URLs in size, nearly 100 Billion (~25%) bigger than our last record index size. We indexed 2.18 trillion links for the first time ever (prior record was 1.54 trillion). Correlations for Page Authority have gone up from 0.319 to 0.333 in the latest index, suggesting that we're getting a slightly more accurate representation of Google's use of links in rankings from this data (DA correlations remain constant at 0.185) Our hit ratio for URLs in Google's SERPs has gone up considerably, from 69.97% in our previous index to 78.66% in the August update. This indicates we are crawling and indexing more of what Google shows in the search results (a good benchmark for us). Note that a large portion of what's missing will be things published in the last 30-60 days while we were processing the index (after crawling had stopped). The Bad News: August's index was late by ~25 days. We know that reliable, consistent, on-time Mozscape updates are critically important to everyone who uses Moz's products. We've been working hard for years to get these to a better place, but have struggled mightily. 
Our latest string of failures was completely new to the team - a bunch of problems and issues we've never seen before (some due to the index size, but many due to odd things like a massive group of what appear to be spam domains using the Palau TLD extension clogging up crawl/processing, large chunks of pages we crawled with 10s of thousands of links which slow down the MozRank calculations, etc). While there's no excuse for delays, and we don't want to pass these off as such, we do want to be transparent about why we were so late. Our future plans include scaling back the index sizes a bit, dealing with the issues around spam domains, large link-list pages, some of the odd patterns we see in .pl and .cn domains, and taking one extra person from the Big Data team off of work on the new index system (which will be much larger and real-time rather than updated every 30 days) to help with Mozscape indices. We believe these efforts, and the new monitoring systems we've got will help us get better at producing high quality, consistent indices. Question everyone always asks: Why did my PA/DA change?! There are tons of reasons why these can change, and they don't necessarily mean anything bad about your site, your SEO efforts, or whether your links are helping you rank. PA and DA are predictive, correlated metrics that say nothing about how you're actually performing. They merely map better than most metrics to Google's global rankings across large SERP sets (but not necessarily your SERPs, which is what you should care about). That said, here's some of the reasons PA/DA do shift: The domains/pages with the highest PA/DA scores gain even faster than most of the domains below them, making it harder each index to get higher scores (since PA/DA are on a logarithmic scale, this is smoothed out somewhat - it would be much worse on a conventional scale, e.g. Facebook.com 100, everyone else 0.0003). 
Google's ranking algorithm introduces new elements, changes, modifies what they care about, etc. Moz crawls a set of the web that does or doesn't include the pages that are more likely to point to a given domain than another. Although our crawl tends to be representative, if you've got lots of links from deep pages on less popular domains in a part of the web far from the mainstream, we may not consistently crawl those well (or, we could overcrawl your sector because it recently received powerful links from the center of the web). My advice, as always, is to use PA/DA as relative scores. If your scores are falling, but your competitors' are falling more, that's not a bad thing. If your scores are rising, but your competitors' are rising faster, they're probably gaining ground on you. And, if you're talking about score changes in the 1-4 points range, that's not necessarily anything but noise. PA/DA scores often shift 1-4 points up or down in a new index so don't sweat it! Let me know if you've got more questions and I'll do my best to answer. You can also refer to the API update page here: https://moz.com/products/api/updates
API | | randfish8 -
Does On-Page Grader have an API?
Hi, I would very much like to include the on-page grader output into my SEO tools. Is there an API for that? thanks James
API | | KMdayJob0 -
3 result limit to Top Pages API call
I am using the MOZ API to make calls for the top pages for a particular URL. However, when I pass in any limit value greater than 3 the API only returns 3 results. I have even tried to put in URLs like 'www.moz.com' and still only 3 results. Sample call to the API below: http://lsapi.seomoz.com/linkscape/top-pages/www.moz.com?AccessID=member-xxxxxxxxx&Expires=1419020831&Signature=xxxxxxxxx&Cols=2052&Offset=0&Limit=50
API | | solodev0 -
On-Page Reports showing old urls
While taking a look at our site's on-page reports I noticed some of our keywords are paired with very old URLs that haven't existed for close to a year. How do I make sure Moz's keyword ranking is finding the correct page, and that I'm not getting graded on keywords/URLs that don't exist any more or have been 301'd to new URLs? Is there a way to clean these out? My on-page reports say I have 62 reports for only a total of 34 keywords in rankings. As you can see from the image, most of the URLs for "tax folder" have now been 301'd to not include /product or /category, but Moz is still showing them with the old URL structure. BTW our site is minespress.com
API | | smines0 -
Huge increase in on-page errors
Hi guys, I've just checked my online campaign and I see that errors in my crawl diagnostics have almost doubled from the 21st of October to the 25th of October, going from 6,708 errors to 11,599. Can anyone tell me what may have caused this? I also notice we have a lot of issues with duplicate page titles, which seems strange as no new pages have been added. Can anyone explain why this might be? I look forward to hearing from you.
API | | Hardley1110