Best strategy to handle over 100,000 404 errors.
-
I recently been given a site that has over one-hundred thousand 404 error codes listed in Google Webmasters.
It is really odd because according to Google Webmasters, the pages that are linking to these 404 pages are also pages that no longer exist (they are 404 pages themselves).
These errors were a result of site migration that had occurred.
Appreciate any input on how one might go about auditing and repairing large amounts of 404 errors.
Thank you.
-
This is a pretty thorough outline of what you need to do: http://moz.com/blog/web-site-migration-guide-tips-for-seos
My steps are usually:
- Identify pages that get significant organic traffic by pulling the Organic Traffic report in Google Analytics for the past year or so.
- Identify pages that have a significant number of links (or, have links from high traffic sources) in Open Site Explorer.
- Map where that content should be now, and 301 redirect to new pages.
- Completely remove all old pages from the index by 404ing them and making sure that no links on new pages point to old pages.
Sounds quick and simple, but this definitely takes time. Good luck!
-
Kristina - thanks for the feedback.
By any chance, would you have a site migration guideline that you recommend?
-
There really isn't a problem with having 100,000 404 "errors." Google's telling you that it thinks 100,000 pages exist, but when it tries to find them, it's getting a 404 code. That's fine: 404s tell Google that a page doesn't exist and to remove the page from Google's index. That's what we want.
The real problem is with your site migration, as FCBM pointed out. If you properly 301 redirect old pages to new, Google will be redirected to the new page, it won't just hit a 404. If you fix the problems with the site migration (not focusing on Google too much), the 404 errors will naturally subside.
The other option is to just take the hit from the migration, and Google will eventually remove all of these pages from its index and stop reporting on them, as long as there aren't live links pointing to the removed pages.
Good luck!
-
It is a problem with the site migration.
Never the less, I have a site right now with over 100,000 errors dealing with 404.
I'm looking for a game plan on how to deal with this many 404 errors in a time effective way.
Any ideas with type of tools or shortcuts? Has anyone else had to deal with a similar issue?
-
Here's one thought to start the quest. ID if the migration was done correctly.
eg If you had a site that was example.com/mens did the 301 look like newsite.com/mens? If not then you might be having tons of issues with a bad planned migration.
-
The WMT notion helps. Thank you.
The main concern is really timing. Are there any effective ways of going through thousands of 404 pages and finding valuable redirects?
-
404s are not founds which are fine if they are really not found and there isn't a different url to point the original page to. One big issue could be that during the migration the old pages weren't 301'd which would result in tons of 404s.
Go through the 404s and see if they are issues or just relics from old data. Then you can mark in fixed in WMTs.
Hope that helps
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
404 errors
Hi I am getting these show up in WMT crawl error any help would be very much appreciated | ?escaped_fragment=Meditation-find-peace-within/csso/55991bd90cf2efdf74ec3f60 | 404 | 12/5/15 |
Technical SEO | | ReSEOlve
| | 2 | mobile/?escaped_fragment= | 404 | 10/26/15 |
| | 3 | ?escaped_fragment=Tips-for-a-balanced-lifestyle/csso/1 | 404 | 12/1/15 |
| | 4 | ?escaped_fragment=My-favorite-yoga-spot/csso/5598e2130cf2585ebcde3b9a | 404 | 12/1/15 |
| | 5 | ?escaped_fragment=blog/c19s6 | 404 | 11/29/15 |
| | 6 | ?escaped_fragment=blog/c19s6/Tag/yoga | 404 | 11/30/15 |
| | 7 | ?escaped_fragment=Inhale-exhale-and-once-again/csso/2 | 404 | 11/27/15 |
| | 8 | ?escaped_fragment=classes/covl | 404 | 10/29/15 |
| | 9 | m/?escaped_fragment= | 404 | 10/26/15 |
| | 10 | ?escaped_fragment=blog/c19s6/Page/1 | 404 | 11/30/15 | | |0 -
Homepage De-Indexed - No Errors, No Warnings
Hi, I am currently working on this project. Sometime between March 7th & 8th homepage was de-indexed. The rest of the pages are there. Found it out through decreased traffic on GA. No notifications of any kind of penalty/errors recieved. Tried to manually re-index through "Fetch as Google" in WMT to no avail. Site is redirected to https. Any suggestions would be highly appreciated. Thank you in advance.
Technical SEO | | gpapatheodorou0 -
Wordpress 404 Errors
Hi Guys, One of my clients is scratching his head after a site migration. He has moved to wordpress and now GWT is creating weird and wonderful strange 404 errors. For example http://www.allsee-tech.com/digital-signage-blog/category/clients.html There are loads like the above which seem to be made up out of his blog and navigation http://www.allsee-tech.com/clients.html works! Any ideas? Is it a rogue plugin? How do we fix? Kind Regards Neil
Technical SEO | | nezona0 -
SEO best practice : HTTP to HTTPS
What's the best practice to switch from an all HTTP site to an all HTTPS site ?
Technical SEO | | Crocodesign
No changes to the site structure, just a full site switch to SSL.
Right now, the site is reachable with HTTP and with HTTPS. http://crocodesign.be --> https://crocodesign.be
http://www.crocodesign.be --> https://crocodesign.be
https://www.crocodesign.be --> https://crocodesign.be CMS : Wordpress 3.9
Server type : Apache
Preferred method : .htaccess0 -
Sitemap issue - Tons of 404 errors
We've recreated a client site in a subdirectory (mysite.com/newsite) of his domain and when it was ready to go live, added code to the htaccess file in order to display the revamped website on the main url. These are the directions that were followed to do this: http://codex.wordpress.org/Giving_WordPress_Its_Own_Directory and http://codex.wordpress.org/Moving_WordPress#When_Your_Domain_Name_or_URLs_Change. This has worked perfectly except that we are now receiving a lot of 404 errors am I'm wondering if this isn't the root of our evil. This is a WordPress self-hosted website and we are actively using the WordPress SEO plugin that creates multiple folders with only 50 links in each. The sitemap_index.xml file tests well in Google Analytics but is pulling a number of links from the subdirectory folder. I'm wondering if it really is the manner in which we made the site live that is our issue or if there is another problem that I cannot see yet. What is the best way to attack this issue? Any clues? The site in question is www.atozqualityfencing.com https://wordpress.org/plugins/wordpress-seo/
Technical SEO | | JanetJ0 -
301ing 404's
Hey guys, I am currently in the process of redirecting some of my 404 pages to pages like my home page. Before I do that, I am assessing the link value of the 404 pages. My question is what do you do with the 404 pages which appear to have low quality links, do you really want to redirect them to an important page on your site? What should I do with these 404 pages? CheersAdam
Technical SEO | | Adamshowbiz0 -
Which are the best website reporting tools for speed and errors
Hi, i have just made some changes on my site by adding some redirects and i want to know what affect this has had on my site www.in2town.co.uk I have been using tools such as Pingdom tools, http://gtmetrix.com but these always give me different time reports so i do not know the true speed of my site and give me different advice. So i am wanting to know how to check the true speed for my site in the UK and how to check for the errors to make it better any advice would be great on which tools to use
Technical SEO | | ClaireH-1848860 -
REL Canonical Error
In my crawl diagnostics it showing a Rel=Canonical error on almost every page. I'm using wordpress. Is there a default wordpress problem that would cause this?
Technical SEO | | mmaes0