Why might Google be crawling via the old sitemap when the new one has been submitted and verified?
-
We recently relaunched Scoutzie.com and re-submitted our new sitemap to Google. In Webmaster Tools, the new sitemap shows as submitted just fine, but at the same time Google is finding a lot of 404s when crawling the site. My understanding is that it is still crawling the old links, which no longer exist. How can I tell Google to refresh its index and stop looking at all the old links?
-
Yes, it should. However, as Alan mentioned below, if you still have links pointing to the 404 pages, Google will keep attempting to crawl them and will keep reporting the errors to you.
If you do have external links to those 404 pages, you can 301 redirect them to an appropriate page using .htaccess. This way you'll keep the link value and also get rid of the Webmaster Tools error.
If you don't have any links to them, then yes, Google will eventually stop trying to crawl them.
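As a rough sketch of the 301 approach above, assuming the site runs on Apache with mod_alias enabled and .htaccess overrides allowed, a small Python helper could generate the redirect lines from a list of old paths. The function name `redirect_rules` and the example.com paths are hypothetical, and paths containing spaces would need quoting that this sketch does not handle.

```python
def redirect_rules(mapping):
    """Build mod_alias 'Redirect 301' lines for an .htaccess file.

    `mapping` maps an old URL path (starting with "/") to the full
    URL of the page that should replace it. Both are assumptions
    supplied by the site owner; nothing here is looked up.
    """
    rules = []
    for old_path, new_url in sorted(mapping.items()):
        if not old_path.startswith("/"):
            raise ValueError(f"old path must start with '/': {old_path!r}")
        # Redirect <status> <old URL path> <target URL>
        rules.append(f"Redirect 301 {old_path} {new_url}")
    return rules


if __name__ == "__main__":
    # Hypothetical old and new locations, for illustration only.
    example = {
        "/old-gallery": "https://www.example.com/gallery",
        "/old-about": "https://www.example.com/about",
    }
    print("\n".join(redirect_rules(example)))
```

The printed lines can be pasted into .htaccess. Only old URLs that have a sensible replacement page belong in the mapping; the rest can be left to 404.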
-
It's very likely that we do. Given that I cannot track down 1,000+ links that now 404, will they eventually fall out by themselves, or do I have to tell Google that everything that now 404s should be dropped from the crawl index? Thanks!
-
What if I simply pushed the new sitemap over the old one? In other words, scoutzie.com/sitemap is the same link, except now it contains the new map. That should be okay, right?
-
You may still have links pointing to those 404 pages, either on your site or externally. If not, they will eventually fall out of the index.
-
Hey scoutzie,
This is actually covered pretty well in Joe Robison's blog post on fixing Webmaster Tools crawl errors: http://moz.com/blog/how-to-fix-crawl-errors-in-google-webmaster-tools
I'll quote the related info:
"One frustrating thing that Google does is it will continually crawl old sitemaps that you have since deleted to check that the sitemap and URLs are in fact dead. If you have an old sitemap that you have removed from Webmaster Tools, and you don’t want being crawled, make sure you let that sitemap 404 and that you are not redirecting the sitemap to your current sitemap."
Hope this helps, good luck!
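To check the point in the quote, here is a minimal Python sketch that requests an old sitemap URL without following redirects and reports whether it returns a 404 or is being redirected. The names `classify` and `fetch_status` and the example.com URL are hypothetical, and this only shows what your server returns, not what Google has recorded.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Redirect handler that refuses to follow 3xx responses."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None makes urllib raise HTTPError for the 3xx
        # response instead of silently following it.
        return None


def classify(status):
    """Label a status code in terms of the advice quoted above."""
    if status == 404:
        return "404: the old sitemap is gone, as recommended"
    if 300 <= status < 400:
        return "redirected: the quote advises against redirecting it"
    if 200 <= status < 300:
        return "still served: the old sitemap is still live"
    return "other: check the server response manually"


def fetch_status(url):
    """Return the HTTP status for `url` without following redirects."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


if __name__ == "__main__":
    # Hypothetical old sitemap location; replace with your own.
    old_sitemap = "https://www.example.com/old-sitemap.xml"
    print(classify(fetch_status(old_sitemap)))
```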