Best strategy to handle over 100,000 404 errors.
-
I recently been given a site that has over one-hundred thousand 404 error codes listed in Google Webmasters.
It is really odd because according to Google Webmasters, the pages that are linking to these 404 pages are also pages that no longer exist (they are 404 pages themselves).
These errors were a result of site migration that had occurred.
Appreciate any input on how one might go about auditing and repairing large amounts of 404 errors.
Thank you.
-
This is a pretty thorough outline of what you need to do: http://moz.com/blog/web-site-migration-guide-tips-for-seos
My steps are usually:
- Identify pages that get significant organic traffic by pulling the Organic Traffic report in Google Analytics for the past year or so.
- Identify pages that have a significant number of links (or, have links from high traffic sources) in Open Site Explorer.
- Map where that content should be now, and 301 redirect to new pages.
- Completely remove all old pages from the index by 404ing them and making sure that no links on new pages point to old pages.
Sounds quick and simple, but this definitely takes time. Good luck!
-
Kristina - thanks for the feedback.
By any chance, would you have a site migration guideline that you recommend?
-
There really isn't a problem with having 100,000 404 "errors." Google's telling you that it thinks 100,000 pages exist, but when it tries to find them, it's getting a 404 code. That's fine: 404s tell Google that a page doesn't exist and to remove the page from Google's index. That's what we want.
The real problem is with your site migration, as FCBM pointed out. If you properly 301 redirect old pages to new, Google will be redirected to the new page, it won't just hit a 404. If you fix the problems with the site migration (not focusing on Google too much), the 404 errors will naturally subside.
The other option is to just take the hit from the migration, and Google will eventually remove all of these pages from its index and stop reporting on them, as long as there aren't live links pointing to the removed pages.
Good luck!
-
It is a problem with the site migration.
Never the less, I have a site right now with over 100,000 errors dealing with 404.
I'm looking for a game plan on how to deal with this many 404 errors in a time effective way.
Any ideas with type of tools or shortcuts? Has anyone else had to deal with a similar issue?
-
Here's one thought to start the quest. ID if the migration was done correctly.
eg If you had a site that was example.com/mens did the 301 look like newsite.com/mens? If not then you might be having tons of issues with a bad planned migration.
-
The WMT notion helps. Thank you.
The main concern is really timing. Are there any effective ways of going through thousands of 404 pages and finding valuable redirects?
-
404s are not founds which are fine if they are really not found and there isn't a different url to point the original page to. One big issue could be that during the migration the old pages weren't 301'd which would result in tons of 404s.
Go through the 404s and see if they are issues or just relics from old data. Then you can mark in fixed in WMTs.
Hope that helps
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
GWT giving me 404 errors based on old and deleted site map
I'm getting a bunch of 404 crawl errors in my Google Webmaster Tools because we just moved our site to a new platform with new URL structure. We 301 redirected all the relevant pages. We submitted a new site map and then deleted all the site maps to the old website url structure. However, google keeps crawling the OLD urls and reporting back the 404 errors. It says that the website is linking to these 404 pages via an old outdated sitemap (which if you goto shows a 404 as well, so it's not as if Google is reading these old site maps now). Instead it's as if Google has cached the old sitemap but continues to use it to crawl these non-existent pages. Any thoughts?
Technical SEO | | Santaur0 -
1,300,000 404s
Just moved a WordProcess site over to a new host and skinned it. Found out after the fact that the site had been hacked - the db is clean. I did notice at first there were a lot of 404s being generated, so I setup a script to capture and then return a 410 page gone - and then the plan was to submit them to have them removed from the index - thinking there was a manageable number But, when I looked at Google WebMaster Tools there was over 1,300,000 404 errors - see attachment. My puny attempt to solve this problem seems to need more of an industrial size solution. My question, is that what would be the best way to deal with this? Not all of the pages are indexed in google - only 637 index but you can only see about 150 in the index. Where bing is another story saying that over 2,700 pages index but only can see about 200. How is this affecting any future rankings - they do not rank well, as I found out because of very slow page load speed and of course the hacks? The link profile looking at Google is OK, and there are no messages in Google Webmaster tools. am5cMz2
Technical SEO | | Runner20090 -
Best URL format for pagination
We're currently changing the URL format of our website search, we have been discussing a lot and cannot decide the past way to pass the pagination parameter for SEO. We narrowed down to the options. www.website.com/apples/p2 - www.website.com/apples?page=2 - www.website.com/apples/page/2 What would give us best ranking returns? What do you think?
Technical SEO | | HelpSaude0 -
Pages Linking to Sites that Return 404 Error
We have just a few 404 errors on our site. Is there any way to figure out which pages are linking to the pages that create 404 errors? I would rather fix the links than create new 301 redirects. Thanks!
Technical SEO | | jsillay0 -
Client error 404
I have got a lot (100+) of 404´s. I got more the last time, so I rearranged the whole site. I even changed it from .php to .html. I have went to the web hotel to delete all of the .php files from the main server. Still, I got after yesterdays crawl 404´s on my (deleted) .php sites. There is also other links that has an error, but aren't there. Maybe those pages were there before the sites remodelling, but I don't think so because .html sites is also affected. How can this be happening?
Technical SEO | | mato0 -
301 Link Authority 100% Transfer
Our web designer wants to change our pages from the .htm to .html extension. Would 301 redirects transfer our link juice to rankings are not harmed?
Technical SEO | | TheDude0 -
150 Duplicate page error
I am told that I have 150 duplicate page content. It seems that it is the login link on each of my pages. Is this an error? Is it something I have to change? Thanks Login/Register at http://irishdancingdress.com/wp-login.php?redirect_to=http%3A%2F%2Firishdancingdress.com%2Fdress
Technical SEO | | ukkpower0 -
Crawl Errors In Webmaster Tools
Hi Guys, Searched the web in an answer to the importance of crawl errors in Webmaster tools but keep coming up with different answers. I have been working on a clients site for the last two months and (just completed one months of link bulding), however seems I have inherited issues I wasn't aware of from the previous guy that did the site. The site is currently at page 6 for the keyphrase 'boiler spares' with a keyword rich domain and a good onpage plan. Over the last couple of weeks he has been as high as page 4, only to be pushed back to page 8 and now settled at page 6. The only issue I can seem to find with the site in webmaster tools is crawl errors here are the stats:- In sitemaps : 123 Not Found : 2,079 Restricted by robots.txt 1 Unreachable: 2 I have read that ecommerce sites can often give off false negatives in terms of crawl errors from Google, however, these not found crawl errors are being linked from pages within the site. How have others solved the issue of crawl errors on ecommerce sites? could this be the reason for the bouncing round in the rankings or is it just a competitive niche and I need to be patient? Kind Regards Neil
Technical SEO | | optimiz10