Best strategy to handle over 100,000 404 errors.
-
I recently been given a site that has over one-hundred thousand 404 error codes listed in Google Webmasters.
It is really odd because according to Google Webmasters, the pages that are linking to these 404 pages are also pages that no longer exist (they are 404 pages themselves).
These errors were a result of site migration that had occurred.
Appreciate any input on how one might go about auditing and repairing large amounts of 404 errors.
Thank you.
-
This is a pretty thorough outline of what you need to do: http://moz.com/blog/web-site-migration-guide-tips-for-seos
My steps are usually:
- Identify pages that get significant organic traffic by pulling the Organic Traffic report in Google Analytics for the past year or so.
- Identify pages that have a significant number of links (or, have links from high traffic sources) in Open Site Explorer.
- Map where that content should be now, and 301 redirect to new pages.
- Completely remove all old pages from the index by 404ing them and making sure that no links on new pages point to old pages.
Sounds quick and simple, but this definitely takes time. Good luck!
-
Kristina - thanks for the feedback.
By any chance, would you have a site migration guideline that you recommend?
-
There really isn't a problem with having 100,000 404 "errors." Google's telling you that it thinks 100,000 pages exist, but when it tries to find them, it's getting a 404 code. That's fine: 404s tell Google that a page doesn't exist and to remove the page from Google's index. That's what we want.
The real problem is with your site migration, as FCBM pointed out. If you properly 301 redirect old pages to new, Google will be redirected to the new page, it won't just hit a 404. If you fix the problems with the site migration (not focusing on Google too much), the 404 errors will naturally subside.
The other option is to just take the hit from the migration, and Google will eventually remove all of these pages from its index and stop reporting on them, as long as there aren't live links pointing to the removed pages.
Good luck!
-
It is a problem with the site migration.
Never the less, I have a site right now with over 100,000 errors dealing with 404.
I'm looking for a game plan on how to deal with this many 404 errors in a time effective way.
Any ideas with type of tools or shortcuts? Has anyone else had to deal with a similar issue?
-
Here's one thought to start the quest. ID if the migration was done correctly.
eg If you had a site that was example.com/mens did the 301 look like newsite.com/mens? If not then you might be having tons of issues with a bad planned migration.
-
The WMT notion helps. Thank you.
The main concern is really timing. Are there any effective ways of going through thousands of 404 pages and finding valuable redirects?
-
404s are not founds which are fine if they are really not found and there isn't a different url to point the original page to. One big issue could be that during the migration the old pages weren't 301'd which would result in tons of 404s.
Go through the 404s and see if they are issues or just relics from old data. Then you can mark in fixed in WMTs.
Hope that helps
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Hi anyone please help I use this code but now getting 404 error. please help.
#index redirect
Technical SEO | | roynguyen
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index.html\ HTTP/
RewriteRule ^index.html$ http://domain.com/ [R=301,L]
RewriteCond %{THE_REQUEST} .html
RewriteRule ^(.*).html$ /$1 [R=301,L] hi anyone please help I use this code but now getting 404 error. please help. homepage and service.html page is working, but the rest pages like about.html, servicearea.html, and contact.html is not working showing 404 error. and also when you type this URL. generalapplianceserice.ca/about.html generalapplianceserice.ca/contact.html generalapplianceserice.ca/servicearea.html it automatically remove the .HTML extension and shows 404 error, the pages name in root directory is same. these pages work like generalapplianceservice.ca and generalapplianceservice.ca/services why? i also remove this code again but still same issue.0 -
Where and how much; Schema best practices.
Couple of schema questions: Should I 'only' mark up the contact page, as this has the most information? What about the header and footer, should I tag everything there also? If I do mark up the header, footer, and contact page, I end up with 3 "LocalBusiness" entries in Google testing tool, is that bad?
Technical SEO | | MichaelGregory0 -
How I should fix a lot of 404 issues?
Hi guys, Recently I migrated a site from Drupal to Wordpress. I didn't contemplated the URL's will be different on Wordpress. Actually I have moren than 500 (the site has around 5000 pages) page not found issues on WMT and Moz. It because articles that has accents (spanish site) and symbols had wrong URLs in Drupal and right now in Wordpress that pages are not working.
Technical SEO | | alejandrogm
i.e: http://www.lapelotona.com/noticias/-tata-y-messi%2C-el-primer-saludo-de-los-rosarinos-/ So I want to know if I should fix the URLs and make the 301 redirect to each of them or only fix the issues and thats it. Thanks in advance if someone can help me.
Sorry if I have a mistake with my english, I still learning.0 -
Should we handle this redirect differently?
So our question is should we handle page redirection/rewriting in php or in .htaccess (with a specific problem we are running into outlined below). We have an ecommerce store in a subfolder of our site (example.com/store/). In the next folder down we have a group of widgets(www.example.com/store/widget-group1). Recently we put a .htaccess redirect in the top level folder (example.com/store/.htaccess), in order to re-write some URL’s and also 301 a page to another page. This seems to be negatively affecting our /widgets-group1/ subfolder however (organic traffic to example.com/store/widget-group1) took a nose dive 3 days after putting the .htaccess redirect in place on the /store/ folder and it has not recovered 8 days later). *Nothing appears outwardly wrong with the current setup to the eye when viewing the pages or requesting as googlebot (the only issue being the nose dive in organic traffic lol) *both subfolders are setup in apache config file to allow local overrides of .htaccess as follows: <directory store="" widget-group1="">Options -Indexes FollowSymLinks -MultiViews
Technical SEO | | altecdesign
AllowOverride All
Order allow,deny
allow from all</directory> <directory store="">Options -Indexes FollowSymLinks -MultiViews
AllowOverride All
Order allow,deny
allow from all</directory>0 -
Get rid of a large amount of 404 errors
Hi all, The problem:Google pointed out to me that I have a large increase of 404 errors. In short I had software before that created pages (automated) for long tale search terms and feeded them to google. Recently I quit this service and all those pages (about 500000) were deleted. Now google GWM points out about 800000 404 errors. What I noticed: I had a large amount of 404's before when I changed my website. I fixed it (proper 302) and as soon as all the 404's in GWM were gone I had around 200 visitors a day more. It seems that a clean site is better positioned. Anybody any suggestion on how to tell google that all urls starting with www.domain/webdir/ should be deleted from cache?
Technical SEO | | hometextileshop0 -
WP Blog Errors
My WP blog is adding my email during the crawl, and I am getting 200+ errors for similar to the following; http://www.cisaz.com/blog/2010/10-reasons-why-microsofts-internet-explorer-dominance-is-ending/tony@cisaz.net "tony@cisaz.net" is added to Every post. Any ideas how I fix it? I am using Yoast Plug in. Thanks Guys!
Technical SEO | | smstv0 -
Best cross linking strategy for micro sites?
Hi Guys. I created a micro site (A design showcase gallery) away from the main website to attract a lot of links in my space from competitors. It works so well it has become a valuable resource in my industry and I believe I will keep it running and adding content to it. Is the best SEO strategy for the main site simply to link from each page to the main site? Or should I be looking at something else? Thanks, Alan
Technical SEO | | spoiltchild0 -
Best way to Handle Pagination?
At the moment I my blog is paginated like so: /blogs > /blogs/page/2 > /blogs/page/3 etc What are the benefits of paginating with dynamic URLs like here on SEOmoz with /blog?page=3
Technical SEO | | NickPateman810