Why might Google be crawling via the old sitemap when the new one has been submitted and verified?
-
We have recently relaunched Scoutzie.com and re-submitted our new sitemap to Google. When I look in Webmaster Tools, our new sitemap has been submitted just fine, but at the same time, Google is finding a lot of 404s when crawling the site. My understanding is that it is still crawling the old links, which no longer exist. How can I tell Google to refresh its index and stop looking at all the old links?
-
Yes it should. However, as Alan mentioned below, if you still have links pointing to the 404 pages, Google will always attempt to crawl them, and will keep you informed that you have errors.
If you do have external links to those 404 pages, you can 301 redirect them to an appropriate page using .htaccess. This way you'll keep the link value and also get rid of the Webmaster Tools error.
If you don't have any links to them, then yes, Google will eventually stop trying to crawl them.
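As a rough sketch of the .htaccess approach above: if you can build a list of old URLs and the pages that should replace them, a small script can generate the redirect lines for you. This assumes Apache with mod_alias enabled; the example.com URLs and the `OLD_TO_NEW` mapping are hypothetical placeholders, not real Scoutzie paths.

```python
from urllib.parse import urlsplit


def redirect_rules(mapping):
    """Build mod_alias 'Redirect 301' lines from an old-URL -> new-URL mapping.

    Only the path of each old URL is used, because Redirect matches on the
    request path. Paths containing spaces would need quoting (not handled here).
    """
    rules = []
    for old_url, new_url in sorted(mapping.items()):
        old_path = urlsplit(old_url).path or "/"
        rules.append(f"Redirect 301 {old_path} {new_url}")
    return rules


# Hypothetical mapping; replace with the old URLs that have links pointing at them.
OLD_TO_NEW = {
    "http://example.com/old-page": "http://example.com/new-page",
    "http://example.com/old-section/item": "http://example.com/items/item",
}

if __name__ == "__main__":
    # Paste the printed lines into .htaccess.
    print("\n".join(redirect_rules(OLD_TO_NEW)))
```

Old URLs with no sensible replacement can simply be left out of the mapping so they keep returning 404.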
-
It's very likely that we do. Given that I cannot track down 1000+ links that now 404, will they eventually fall out by themselves, or do I have to tell Google that everything that now returns a 404 should be dropped from the crawl index? Thanks!
-
What if I simply pushed the new sitemap over the old one? In other words, scoutzie.com/sitemap is the same link, except now it contains the new map. That should be okay, right?
-
You may still have links pointing to those 404 pages, either on your site or externally. If not, then eventually they will fall out of the index.
-
Hey scoutzie,
This is actually covered pretty well in Joe Robison's blog post on fixing Webmaster Tools crawl errors: http://moz.com/blog/how-to-fix-crawl-errors-in-google-webmaster-tools
I'll quote the related info:
"One frustrating thing that Google does is it will continually crawl old sitemaps that you have since deleted to check that the sitemap and URLs are in fact dead. If you have an old sitemap that you have removed from Webmaster Tools, and you don’t want being crawled, make sure you let that sitemap 404 and that you are not redirecting the sitemap to your current sitemap."
Hope this helps, good luck!
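If you want to confirm what a removed sitemap URL actually returns, something like the following might do. It is only a sketch using the Python standard library: the function names are made up for illustration, and treating 410 the same as 404 is my assumption (the quoted post only mentions 404).

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so the 3xx status stays visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError carrying the 3xx code


def classify(status):
    """Label an HTTP status code for an old sitemap URL."""
    if status in (404, 410):
        return "gone"      # what the quoted advice asks for
    if 300 <= status < 400:
        return "redirect"  # the quoted advice says to avoid this
    if 200 <= status < 300:
        return "live"      # the old sitemap is still being served
    return "other"


def fetch_status(url):
    """Return the status code for a HEAD request to url, without following redirects."""
    opener = urllib.request.build_opener(NoRedirect)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
```

Calling `classify(fetch_status(old_sitemap_url))` with the URL of the sitemap you removed should come back as "gone" if the setup matches the advice in the quote.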