Wild fluctuation in number of pages crawled
-
I am seeing huge fluctuations in the number of pages discovered by the crawl each week. Some weeks the crawl discovers more than 10,000 pages, and other weeks I am seeing 400-500.
So this week, for example, I was hoping to see some changes reflected for warnings from last week's report (which discovered more than 10,000 pages). However, the entire crawl this week was 448 pages.
The number of pages discovered each week seems to go back and forth between these two extremes. The more accurate count would be nearer the 10,000 mark than the 400 range.
Thanks.
Mark
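One way to put numbers on the swings described above before contacting support is to flag the weeks whose page count falls well below the typical value. This is only a minimal Python sketch, not a Moz feature: the function name `flag_low_weeks`, the 50% threshold, and the sample counts are assumptions chosen for illustration, and it assumes at least half of the weeks are close to the full count.

```python
from statistics import median


def flag_low_weeks(counts, ratio=0.5):
    """Return (week_index, count) pairs whose count is below ratio * median.

    The median is used as the baseline so a few very small crawls
    do not drag the "typical" value down the way a mean would.
    """
    baseline = median(counts)
    return [(i, c) for i, c in enumerate(counts) if c < ratio * baseline]


# Hypothetical weekly page counts, loosely modeled on the figures in the post.
weekly_pages = [10250, 448, 10120, 470, 10300, 10180]
print(flag_low_weeks(weekly_pages))
```

Listing the flagged weeks alongside the crawl dates gives the Help Team something concrete to compare against their own logs.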
-
No problem!
Glad to see Cyrus' response!
-
Hi Mark,
I used to troubleshoot these types of problems (mysteries) when I worked on the SEOmoz help team.
The best thing to do would be to contact the Help Team (help@seomoz.org) and include information about your account, URL, and campaign. They can take this information and see whether there is anything odd about your website, whether there is a bug in the crawling software, or whether some strange incompatibility is causing this behavior.
If you would rather, you can PM me with the info and I can try to troubleshoot it myself, but the Help Team has a few more tools and access to engineers, so they might be the better choice. Either way, let us know if you have any trouble.
-
Thank you for the response. I should have been clearer: it is the weekly SEOmoz crawl that is showing such inconsistent behavior, not Google.
We have very few (if any) broken links, errors, etc.
Thanks.
Mark
-
Hi there Mark!
We used to have the same issue using Joomla here. It turns out that Google will reduce its crawling if your site has too many errors, broken links, and so on.
We used GWT (Google Webmaster Tools) to look into the 404s, then redirected the broken links. Afterwards, we resubmitted the site to be reindexed. A few weeks later -VOILA- all was back to normal, and our page freshness stays where it should.
I'd recommend looking at your GWT first and fixing broken links, followed by resubmission to the search engines...
Good Luck!
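For the "look into the 404s" step, a rough Python sketch of pulling the broken URLs out of an exported error report might look like this. The column names `url` and `status` and the sample rows are assumptions for illustration; a real export from GWT or any crawler will likely use different headers.

```python
import csv
import io


def find_404s(csv_text, url_col="url", status_col="status"):
    """Return the URLs whose status column is exactly 404."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [row[url_col] for row in rows if row[status_col].strip() == "404"]


# Hypothetical export: one row per crawled URL with its HTTP status code.
sample = "url,status\n/home,200\n/old-page,404\n/about,301\n/missing,404\n"
print(find_404s(sample))
```

The resulting list is the set of URLs to redirect before resubmitting the site.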