Massive Increase in 404 Errors in GWT
-
Last June, we transitioned our site to the Magento platform. When we did so, we naturally got an increase in 404 errors for URLs that were not redirected (for a variety of reasons: we hadn't carried the product for years, Google no longer got the same string when it did a "search" on the site, etc.). We knew these would be there and were completely fine with them.
We also got many 404s due to the way Magento had implemented their site map (putting in products that were not visible to customers, including all the different file paths to get to a product even though we use a flat structure, etc.). These were frustrating, but we did custom work on the site map and let Google resolve those many, many 404s on its own.
Sure enough, a few months went by and GWT started to clear out the 404s. All the poor, nonexistent links from the site map and missing links from the old site - they started disappearing from the crawl notices and we slowly went from some 20k 404s to 4k 404s. Still a lot, but we were getting there.
Then, in the last 2 weeks, all of those links started showing up again in GWT and reporting as 404s. Now we have 38k 404s (way more than ever reported). I confirmed that these bad links are not showing up in our site map or anything and I'm really not sure how Google found these again.
I know, in general, these 404s don't hurt our site. But it just seems so odd. Is there any chance Googlebot just randomly crawled a big ol' list of outdated links it hadn't tried for a while? And does anyone have any advice for clearing them out?
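For anyone wanting to repeat the site map check mentioned above, here is a minimal sketch of one way to do it, not necessarily how it was done here. It assumes the site map follows the standard sitemaps.org XML format and that the 404 URLs reported in GWT are already available as a plain list. The function names `sitemap_urls` and `reported_urls_in_sitemap` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org site maps.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(sitemap_xml):
    """Return the set of URLs listed in <loc> elements of a site map."""
    root = ET.fromstring(sitemap_xml)
    return {
        loc.text.strip()
        for loc in root.iter(SITEMAP_NS + "loc")
        if loc.text and loc.text.strip()
    }


def reported_urls_in_sitemap(reported_404s, sitemap_xml):
    """Return the reported 404 URLs that are still listed in the site map."""
    reported = {url.strip() for url in reported_404s if url.strip()}
    return sorted(reported & sitemap_urls(sitemap_xml))
```

An empty result would support the idea that Google is finding these URLs somewhere other than the site map. The comparison is an exact string match, so differences such as trailing slashes or http versus https would need to be normalized first.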
-
I'm just cynical enough to suspect this may be a byproduct of Google Webmaster Tools' recent inbound link meltdown. Huge numbers of GWT users are reporting that their inbound link reports have basically lost most of their links.
What if, in dealing with the problem, Google has gone back to an older version of the links database, which might recover more of the recent links, but also pull back a whack of those links it already discounted?
This is pure speculation on my part, but there's been so much volatility on Google's link reporting recently that I can't say I trust the data as far as I can toss it at the moment.
Have you tried a similar comparison against the data shown in Bing Webmaster Tools?
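As a rough sketch of what that comparison could look like: the snippet below assumes both tools' crawl error lists have been saved as CSV text with one column holding the URL. The column names passed in are placeholders, not the tools' actual export headers, and `urls_from_csv` and `compare_reports` are hypothetical helper names.

```python
import csv
import io


def urls_from_csv(csv_text, url_column):
    """Collect the non-empty values of one column from CSV text."""
    urls = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        value = (row.get(url_column) or "").strip()
        if value:
            urls.add(value)
    return urls


def compare_reports(google_urls, bing_urls):
    """Split two sets of reported 404 URLs into shared and one-sided groups."""
    return {
        "both": sorted(google_urls & bing_urls),
        "google_only": sorted(google_urls - bing_urls),
        "bing_only": sorted(bing_urls - google_urls),
    }
```

If most of the resurfaced URLs land in `google_only`, that would be consistent with a change on Google's reporting side, though it would not prove it.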
I'm sure I read of others encountering what you're talking about recently. Will see if I can find the references in case they found anything.
Paul