Best strategy to handle over 100,000 404 errors.
-
I recently been given a site that has over one-hundred thousand 404 error codes listed in Google Webmasters.
It is really odd because according to Google Webmasters, the pages that are linking to these 404 pages are also pages that no longer exist (they are 404 pages themselves).
These errors were a result of site migration that had occurred.
Appreciate any input on how one might go about auditing and repairing large amounts of 404 errors.
Thank you.
-
This is a pretty thorough outline of what you need to do: http://moz.com/blog/web-site-migration-guide-tips-for-seos
My steps are usually:
- Identify pages that get significant organic traffic by pulling the Organic Traffic report in Google Analytics for the past year or so.
- Identify pages that have a significant number of links (or, have links from high traffic sources) in Open Site Explorer.
- Map where that content should be now, and 301 redirect to new pages.
- Completely remove all old pages from the index by 404ing them and making sure that no links on new pages point to old pages.
Sounds quick and simple, but this definitely takes time. Good luck!
-
Kristina - thanks for the feedback.
By any chance, would you have a site migration guideline that you recommend?
-
There really isn't a problem with having 100,000 404 "errors." Google's telling you that it thinks 100,000 pages exist, but when it tries to find them, it's getting a 404 code. That's fine: 404s tell Google that a page doesn't exist and to remove the page from Google's index. That's what we want.
The real problem is with your site migration, as FCBM pointed out. If you properly 301 redirect old pages to new, Google will be redirected to the new page, it won't just hit a 404. If you fix the problems with the site migration (not focusing on Google too much), the 404 errors will naturally subside.
The other option is to just take the hit from the migration, and Google will eventually remove all of these pages from its index and stop reporting on them, as long as there aren't live links pointing to the removed pages.
Good luck!
-
It is a problem with the site migration.
Never the less, I have a site right now with over 100,000 errors dealing with 404.
I'm looking for a game plan on how to deal with this many 404 errors in a time effective way.
Any ideas with type of tools or shortcuts? Has anyone else had to deal with a similar issue?
-
Here's one thought to start the quest. ID if the migration was done correctly.
eg If you had a site that was example.com/mens did the 301 look like newsite.com/mens? If not then you might be having tons of issues with a bad planned migration.
-
The WMT notion helps. Thank you.
The main concern is really timing. Are there any effective ways of going through thousands of 404 pages and finding valuable redirects?
-
404s are not founds which are fine if they are really not found and there isn't a different url to point the original page to. One big issue could be that during the migration the old pages weren't 301'd which would result in tons of 404s.
Go through the 404s and see if they are issues or just relics from old data. Then you can mark in fixed in WMTs.
Hope that helps
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
What is the best format for animated content
We want to use some movement in our designs, charts etc. what format is the most SEO friendly?
Technical SEO | | remkoallertz1 -
On our site by mistake some wrong links were entered and google crawled them. We have fixed those links. But they still show up in Not Found Errors. Should we just mark them as fixed? Or what is the best way to deal with them?
Some parameter was not sent. So the link was read as : null/city, null/country instead cityname/city
Technical SEO | | Lybrate06060 -
Schema Markup Errors - Priority or Not?
Greetings All... I've been digging through the search console on a few of my sites and I've been noticing quite a few structured data errors. Most of the errors are related to: hcard, hentry and hatom. Most of them are missing author & entry-title, while the other one is missing: fn. I recently saw an article on SEL about Google's focus on spammy mark-up. The sites I use are built and managed by vendors, so I would have to impress upon them the impact of these errors and have them prioritize, then fix them. My question is whether or not this should be prioritized? Should I have them correct these errors sooner than later or can I take a phased approach? I haven't noticed any loss in traffic or anything like that, I'm more focused on what negative impact a "phased approach" could have. Any thoughts?
Technical SEO | | AfroSEO0 -
Https and 404 code that goes into htaccess
The 404 error code we put into htaccess files for our websites does not work correctly for our https site. We recently changed one of our http sites to https. When we went to create a 404.html page for it by creating an htaccess folder with the 404 error code in it, once we uploaded the file all of our webpages were displaying incorrectly, as if the css was not attached. The 404 code we used works successfully for our other 404.html pages for our other sites (www.telfordinc.com/404.html). However, it does not work for the https site. Below is the 404 error code we are using for our https site (currently not uploaded until pages display correctly) ErrorDocument 404 /404-error.html RewriteEngine on RewriteCond %{HTTP_REFERER} !^$ RewriteCond %{HTTP_REFERER} !^http://(www.)?privatemoneyhardmoneyloan.com/.*$ [NC] RewriteRule .(gif|jpg|js|css)$ - [F] Options +FollowSymLinks RewriteEngine On RewriteBase / RewriteCond %{HTTP_HOST} !^www.privatemoneyhardmoneyloan.com$ [NC] RewriteRule ^(.*)$ http://www.privatemoneyhardmoneyloan.com/$1 [R=301,L] So we want to know if there is a different 404 error code that goes into the htaccess file for an https vs. http? Appreciate your feedback on this issue
Technical SEO | | Manifestation0 -
Help! Same page with multiple urls. How is this handled?
I'm using DotNetNuke for many of our sites. DotNetNuke's home page can have multiple VALID URLs that go to the same home page. Example: http://aviation-sms.com http://www.aviation-sms.com http://aviation-sms.com/default.aspx http://www.aviation-sms.com/default.aspx and http://aviation-sms.com/aviationSMS.aspx http://www.aviation-sms.com/aviationSMS.aspx All the above URLs have the same content. In the page header tag, I have: Should I be doing something else? such as removing the "default.aspx"??? I have a blog also that has a boatload of pages. I tried this canonical approach, but I'm not sure SEO Moz likes it and the tool offers me little guidance on this issue.
Technical SEO | | manintights280 -
Help with bing redirection error
Can somebody help me figure out this bing redirect error. The link to "http://w******/flea-control" has resulted in HTTP redirection to "http://w******/feas/flea-control/".Search engines can only pass page rankings and other relevant data through a single redirection hop. Using unnecessary redirects can have a negative impact on page ranking. I am using wordpress. I am actually linking to the /feas/flea-control/ version. I have looked every where for help. I got this error using bings seo toftware
Technical SEO | | OxzenMedia0 -
404's and duplicate content.
I have real estate based websites that add new pages when new listings are added to the market and then deletes pages when the property is sold. My concern is that there are a significant amount of 404's created and the listing pages that are added are going to be the same as others in my market who use the same IDX provider. I can go with a different IDX provider that uses IFrame which doesn't create new pages but I used a IFrame before and my time on site was 3min w/ 2.5 pgs per visit and now it's 7.5 pg/visit with 6+min on the site. The new pages create new content daily so is fresh content and better on site metrics (with the 404's) better or less 404's, no dup content and shorter onsite metrics better? Any thoughts on this issue? Any advice would be appreciated
Technical SEO | | AnthonyLasVegas0 -
Duplicate content handling.
Hi all, I have a site that has a great deal of duplicate content because my clients list the same content on a few of my competitors sites. You can see an example of the page here: http://tinyurl.com/62wghs5 As you can see the search results are on the right. A majority of these results will also appear on my competitors sites. My homepage does not seem to want to pass link juice to these pages. Is it because of the high level of Dup Content or is it because of the large amount of links on the page? Would it be better to hide the content from the results in a nofollowed iframe to reduce duplicate contents visibilty while at the same time increasing unique content with articles, guides etc? or can the two exist together on a page and still allow link juice to be passed to the site. My PR is 3 but I can't seem to get any of my internal pages(except a couple of pages that appear in my navigation menu) to budge of the PR0 mark even if they are only one click from the homepage.
Technical SEO | | Mulith0