Massive Increase in 404 Errors in GWT
-
Last June, we transitioned our site to the Magento platform. When we did so, we naturally got an increase in 404 errors for URLs that were not redirected (for a variety of reasons: we hadn't carried the product for years, Google no longer got the same string when it did a "search" on the site, etc.). We knew these would be there and were completely fine with them.
We also got many 404s due to the way Magento had implemented their site map (putting in products that were not visible to customers, including all the different file paths to get to a product even though we use a flat structure, etc.). These were frustrating but we did custom work on the site map and let Google resolve those many, many 404s on its own.
Sure enough, a few months went by and GWT started to clear out the 404s. All the poor, nonexistent links from the site map and missing links from the old site - they started disappearing from the crawl notices and we slowly went from some 20k 404s to 4k 404s. Still a lot, but we were getting there.
Then, in the last 2 weeks, all of those links started showing up again in GWT and reporting as 404s. Now we have 38k 404s (way more than ever reported). I confirmed that these bad links are not showing up in our site map or anything and I'm really not sure how Google found these again.
I know, in general, these 404s don't hurt our site. But it just seems so odd. Is there any chance Google bots just randomly crawled a big ol' list of outdated links it hadn't tried for a while? And does anyone have any advice for clearing them out?
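For anyone who wants to repeat the site map check mentioned above, here is a minimal Python sketch. It assumes you have exported the reported 404 URLs from GWT as a plain list and have the site map XML as a string; the function names and the example.com URLs are made up for illustration, and matching is exact, so trailing slashes or http/https differences would need normalizing first.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org site maps.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(sitemap_xml):
    """Return the set of <loc> URLs listed in a site map XML string."""
    root = ET.fromstring(sitemap_xml)
    return {
        loc.text.strip()
        for loc in root.iter(SITEMAP_NS + "loc")
        if loc.text
    }


def reported_urls_in_sitemap(reported_404s, sitemap_xml):
    """Return the reported 404 URLs that are still listed in the site map."""
    listed = sitemap_urls(sitemap_xml)
    return sorted(url.strip() for url in reported_404s if url.strip() in listed)


if __name__ == "__main__":
    # Hypothetical data: one reported URL is still in the site map, one is not.
    example_sitemap = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>https://example.com/product-a</loc></url>"
        "<url><loc>https://example.com/product-b</loc></url>"
        "</urlset>"
    )
    example_404s = [
        "https://example.com/product-a",
        "https://example.com/old-category/product-z",
    ]
    print(reported_urls_in_sitemap(example_404s, example_sitemap))
```

An empty result would support the observation that the site map is not the source of the rediscovered links, which points back to external links or Google's own historical crawl list.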
-
I'm just cynical enough to suspect this may be a byproduct of Google Webmaster Tools' recent inbound link meltdown. Huge numbers of GWT users are reporting that their inbound link reports have lost most of their links.
What if, in dealing with the problem, Google has gone back to an older version of the links database, which might recover more of the recent links, but also pull back a whack of those links it already discounted?
This is pure speculation on my part, but there's been so much volatility on Google's link reporting recently that I can't say I trust the data as far as I can toss it at the moment.
Have you tried comparing against the data shown in Bing Webmaster Tools?
I'm sure I read of others encountering what you're talking about recently. Will see if I can find the references in case they found anything.
Paul
Related Questions
-
My Website Page Speed is not increasing
Hey experts, my website's page speed is not improving. I used the WP Rocket plugin, but I am still seeing the warnings "Reduce unused CSS", "Properly size images", and "Avoid serving legacy JavaScript to modern browsers", as you can see in the image Screenshot (7).png. I have tried many speed optimization plugins and also optimized the images manually in Photoshop, but the image size warning is still reported. After the Google Core Web Vitals update, my keyword positions dropped because of the slow speed. Please guide me on how to increase the page speed of my website https://karmanwalayfabrics.pk Thanks
Technical SEO | | frazashfaq110 -
Can an increase in crawl errors (in GWT) be caused by input fields and jQuery?
Dear Mozzerz, we took over www.urgiganten.dk not long ago and last week we opened it up for indexation, after having taken the old website down for a couple of months. One week after opening for indexation we saw a huge increase in crawl errors. Google is discovering some weird links, e.g. http://www.urgiganten.dk/30-garmin-urremme/, which returns a 404. In GWT we are told that we are linking to this URL from http://www.urgiganten.dk/garmin-urremme. But nowhere on http://www.urgiganten.dk/garmin-urremme will you find this link. However, you will find the following script in the source code, which is the only part of the code that contains "/30-garmin-urremme/": Can it be true that Google takes the id and adds it to our tld to form a URL? We have seen quite a lot of these errors, not only on Urgiganten.dk but also on some of our other websites!
Technical SEO | | urgiganten0 -
Moz showing 404 error on one of my sites
I have a problem. Everything seems to be OK, but Moz shows an HTTP code of 404 for http://www.centralevapeurguide.com and I don't really know why. All my other websites return 200, but this one returns 404. And, obviously, this is the only website that doesn't rank in Google. Thanks for your help. Sebastian
Technical SEO | | sebagorka0 -
Drastic increase of indexed pages correlated to rankings loss?
Our ecommerce website has had a drastic increase in indexed pages and an equal loss of Google organic traffic. After 10/1 the number of indexed pages jumped from 240k to 5.7 million by the end of the year, according to GWT. By contrast, the sitemap tops out at 14,192 pages, with 13,324 indexed. Organic traffic on some top keyphrases began declining by half after 10/26, and rankings (previously in the top 5 spots) have dropped to the fifth page of results. This website does produce session IDs (/c=), so we have been blocking /c=/ in the robots.txt file. We also have a rel=canonical on all pages pointing at the correct URL. With all of this in place, traffic hasn't recovered. Is there a correlation between this spike in indexed pages and the lost keyword rankings? Any advice on how to investigate and correct this further would be greatly appreciated. Thanks.
Technical SEO | | marketing_zoovy.com0 -
How to fix duplicate page content error?
SEOmoz's Crawl Diagnostics is complaining about a duplicate page content error. Examples of links flagged with this error are http://www.equipnet.com/misc-spare-motors-and-pumps_listid_348855 and http://www.equipnet.com/misc-spare-motors-and-pumps_listid_348852. These are not duplicate pages. Some values differ between the two pages, such as the listing #, the EquipNet tag #, and the price. I am not sure how to highlight the things that differ between the two pages, like the "Equipment Tag #" and "listing #". Would the errors resolve if I used some style attribute to highlight such values on the page? Please help me with this, as I am not really sure why the crawler thinks that both pages have the same content. Thanks!
Technical SEO | | RGEQUIPNET0 -
Why would SEOMoz and GWT report 404 errors for pages that are not 404ing?
Recently, I've noticed that nearly all of the 404 errors (not soft 404) reported in GWT actually resolve to a legitimate page. This was weird, but I thought it might just be old info, so I would go through the process of checking and "mark as fixed" as necessary. However, I noticed that SEOMoz is picking up on these 404 errors in the diagnostics of the site as well, and now I'm concerned with what the problem could be. Anyone have any insight into this? Rich
Technical SEO | | secretstache0 -
404 crawl errors from "tel:" link?
I am seeing thousands of 404 errors. Each of the URLs looks like this: abc.com/abc123/tel:1231231234. Everything about the URL is normal except the "/tel:1231231234" suffix; the URLs are bad with the tel: part and good without it. The only place I can find this character string is in the following code, which appears on each page and is used for iPhones and similar devices. What are we doing wrong? Code: Phone: <a href="tel:1231231234">(123) 123-1234</a>
Technical SEO | | EugeneF0 -
404 help
Hello all, firstly let me apologize if this is the wrong place to ask this question. I have a site, www.promptresponseaccidentmanagement.com, which gets a 200 OK when checked for crawl issues; however, pages such as /whiplash-injury-compensation-claims.php, /road-traffic-accident-compensation-claims.php and quite a few more return a 404. That's usually fine, as I can quite happily fix that most of the time. However, if you actually go to those pages in your browser, or click through to them from any part of the site, you will see that they are in fact not returning a 404 and everything is fine!? Anybody got any ideas? Best H
Technical SEO | | haydyn0