Best strategy to handle over 100,000 404 errors.
-
I recently been given a site that has over one-hundred thousand 404 error codes listed in Google Webmasters.
It is really odd because according to Google Webmasters, the pages that are linking to these 404 pages are also pages that no longer exist (they are 404 pages themselves).
These errors were a result of site migration that had occurred.
Appreciate any input on how one might go about auditing and repairing large amounts of 404 errors.
Thank you.
-
This is a pretty thorough outline of what you need to do: http://moz.com/blog/web-site-migration-guide-tips-for-seos
My steps are usually:
- Identify pages that get significant organic traffic by pulling the Organic Traffic report in Google Analytics for the past year or so.
- Identify pages that have a significant number of links (or, have links from high traffic sources) in Open Site Explorer.
- Map where that content should be now, and 301 redirect to new pages.
- Completely remove all old pages from the index by 404ing them and making sure that no links on new pages point to old pages.
Sounds quick and simple, but this definitely takes time. Good luck!
-
Kristina - thanks for the feedback.
By any chance, would you have a site migration guideline that you recommend?
-
There really isn't a problem with having 100,000 404 "errors." Google's telling you that it thinks 100,000 pages exist, but when it tries to find them, it's getting a 404 code. That's fine: 404s tell Google that a page doesn't exist and to remove the page from Google's index. That's what we want.
The real problem is with your site migration, as FCBM pointed out. If you properly 301 redirect old pages to new, Google will be redirected to the new page, it won't just hit a 404. If you fix the problems with the site migration (not focusing on Google too much), the 404 errors will naturally subside.
The other option is to just take the hit from the migration, and Google will eventually remove all of these pages from its index and stop reporting on them, as long as there aren't live links pointing to the removed pages.
Good luck!
-
It is a problem with the site migration.
Never the less, I have a site right now with over 100,000 errors dealing with 404.
I'm looking for a game plan on how to deal with this many 404 errors in a time effective way.
Any ideas with type of tools or shortcuts? Has anyone else had to deal with a similar issue?
-
Here's one thought to start the quest. ID if the migration was done correctly.
eg If you had a site that was example.com/mens did the 301 look like newsite.com/mens? If not then you might be having tons of issues with a bad planned migration.
-
The WMT notion helps. Thank you.
The main concern is really timing. Are there any effective ways of going through thousands of 404 pages and finding valuable redirects?
-
404s are not founds which are fine if they are really not found and there isn't a different url to point the original page to. One big issue could be that during the migration the old pages weren't 301'd which would result in tons of 404s.
Go through the 404s and see if they are issues or just relics from old data. Then you can mark in fixed in WMTs.
Hope that helps
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
The best way to do Interstitial (ads)
Hello, I want to ask you guys what's the best way do to Interstitial without penalty?
Technical SEO | | JohnPalmer
and feel free to give me samples from another major websites. Thanks!0 -
When Should I Ignore the Error Crawl Report
I have a handful of pages listed in the Error Crawl Report, but the report isn't actually showing anything wrong with these pages. I am double checking the code on the site and also can't find anything. Should I just move on and ignore the Error Crawl Report for these few pages?
Technical SEO | | ChristinaRadisic0 -
Why Canonical error?
I just got my SEOMOZ run and it says I have a CANONICAL ERROR: Scorpio Earrings - 7mm Stud - Sterling Silver http://www.astrojewelry.com/jewelry/scorpio-the-scorpion-earrings-30502.htm I'm not sure why--I only changed the <title>tag--not the URL.</span></p> <p><span class="truncated sub-url" title="http://www.astrojewelry.com/jewelry/scorpio-the-scorpion-earrings-30502.htm">Why would this generate a canonical error?</span></p> <p><span class="truncated sub-url" title="http://www.astrojewelry.com/jewelry/scorpio-the-scorpion-earrings-30502.htm">Kathleen</span></p> <p><span class="truncated sub-url" title="http://www.astrojewelry.com/jewelry/scorpio-the-scorpion-earrings-30502.htm">astrojewelry.com</span></p> <p> </p> <p> </p></title>
Technical SEO | | spkcp1110 -
Massive Increase in 404 Errors in GWT
Last June, we transitioned our site to the Magento platform. When we did so, we naturally got an increase in 404 errors for URLs that were not redirected (for a variety of reasons: we hadn't carried the product for years, Google no longer got the same string when it did a "search" on the site, etc.). We knew these would be there and were completely fine with them. We also got many 404s due to the way Magento had implemented their site map (putting in products that were not visible to customers, including all the different file paths to get to a product even though we use a flat structure, etc.). These were frustrating but we did custom work on the site map and let Google resolve those many, many 440s on its own. Sure enough, a few months went by and GWT started to clear out the 404s. All the poor, nonexistent links from the site map and missing links from the old site - they started disappearing from the crawl notices and we slowly went from some 20k 404s to 4k 404s. Still a lot, but we were getting there. Then, in the last 2 weeks, all of those links started showing up again in GWT and reporting as 404s. Now we have 38k 404s (way more than ever reported). I confirmed that these bad links are not showing up in our site map or anything and I'm really not sure how Google found these again. I know, in general, these 404s don't hurt our site. But it just seems so odd. Is there any chance Google bots just randomly crawled a big ol' list of outdated links it hadn't tried for awhile? And does anyone have any advice for clearing them out?
Technical SEO | | Marketing.SCG0 -
Remove 404 errors
I've got a site (www.dikelli.com.au) that has some 404 errors. I'm using Dreamweaver to manage the site which was built for me by I can't seem to figure out how to remove the 404 pages as it's not showing up in the directory? How would I fix this up?
Technical SEO | | sterls0 -
How to set Home page for the best effect
My head is spinning with all the confusing possibilities. Does anybody have an easy answer for setting up the home page and its canonical-ishness ie Which gives the best SEO Mojo ? \ \default.aspx \keyword\ \keyword\default.aspx Thanking you in advance for reducing the number of business migranes around the globe.
Technical SEO | | blinkybill0 -
Best XML Sitemap Generator for Mac?
Hi all, Recently moved from PC to Mac when starting a new job. One of the things I'm missing from my PC is G Site Crawler, and I haven't yet found a decent equivalent for the Mac. Can anybody recommend something as good as G Site Crawler for the Mac? I.e. I need the flexibility to exclude by URL parameter etc etc. Cheers everyone, Mark
Technical SEO | | markadoi840 -
Best practices for temporary articles
Hello, I would like to have expert inputs about the best way to manage temporary content? In my case, I've a page (ex : mydomain.com/agenda) where I have listing of temporary article, with a lifetime of 1 month to 6 months for some of them. My articles also have a specific url like for ex : mydomain.com/agenda/12-02-2011/thenameofmyarticle/ As you can guess, I got hundreds of 404 😞 I'm already using canonical tag, should I use a in the listing page? I'm a bit lost here..
Technical SEO | | Alexandre_0