Best strategy to handle over 100,000 404 errors.
-
I was recently given a site that has over one hundred thousand 404 errors listed in Google Webmaster Tools.
Oddly, according to Google Webmaster Tools, the pages linking to these 404 pages also no longer exist (they are 404 pages themselves).
These errors are the result of a site migration.
I'd appreciate any input on how to audit and repair such a large number of 404 errors.
Thank you.
-
This is a pretty thorough outline of what you need to do: http://moz.com/blog/web-site-migration-guide-tips-for-seos
My steps are usually:
- Identify pages that get significant organic traffic by pulling the Organic Traffic report in Google Analytics for the past year or so.
- Identify pages that have a significant number of links (or links from high-traffic sources) in Open Site Explorer.
- Map where that content should be now, and 301 redirect to new pages.
- Completely remove all old pages from the index by 404ing them and making sure that no links on new pages point to old pages.
Sounds quick and simple, but this definitely takes time. Good luck!
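To make the first two steps less manual, here is a minimal Python sketch of how the triage could work once the traffic and link data are exported. The `PageStats` fields, the thresholds, and the sample URLs are all assumptions for illustration; they are not taken from Google Analytics or Open Site Explorer exports, so adjust them to whatever columns your reports actually contain.

```python
from dataclasses import dataclass


@dataclass
class PageStats:
    url: str
    organic_sessions: int  # hypothetical field: organic visits over the past year
    linking_domains: int   # hypothetical field: number of domains linking to the URL


def prioritize(pages, min_sessions=100, min_links=5):
    """Split dead URLs into those worth a 301 and those that can stay 404."""
    redirect, leave = [], []
    for page in pages:
        if page.organic_sessions >= min_sessions or page.linking_domains >= min_links:
            redirect.append(page)
        else:
            leave.append(page)
    # Highest-value pages first, so the mapping work starts where it matters most.
    redirect.sort(key=lambda p: (p.organic_sessions, p.linking_domains), reverse=True)
    return redirect, leave


# Made-up sample data, only to show the shape of the input.
pages = [
    PageStats("/mens/shoes", 1200, 14),
    PageStats("/old-promo-2012", 3, 0),
    PageStats("/womens", 40, 9),
]
to_redirect, to_leave = prioritize(pages)
print([p.url for p in to_redirect])
print([p.url for p in to_leave])
```

The thresholds are arbitrary starting points. With 100,000 URLs, the idea is to shrink the list that needs a hand-mapped 301 and let the rest return 404.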
-
Kristina - thanks for the feedback.
By any chance, would you have a site migration guideline that you recommend?
-
There really isn't a problem with having 100,000 404 "errors." Google's telling you that it thinks 100,000 pages exist, but when it tries to find them, it's getting a 404 code. That's fine: 404s tell Google that a page doesn't exist and to remove the page from Google's index. That's what we want.
The real problem is with your site migration, as FCBM pointed out. If you properly 301 redirect old pages to new ones, Google will be sent to the new page instead of hitting a 404. If you fix the problems with the site migration (without focusing on Google too much), the 404 errors will naturally subside.
The other option is to just take the hit from the migration, and Google will eventually remove all of these pages from its index and stop reporting on them, as long as there aren't live links pointing to the removed pages.
Good luck!
-
It is a problem with the site migration.
Nevertheless, I have a site right now with over 100,000 404 errors.
I'm looking for a game plan for dealing with this many 404 errors in a time-efficient way.
Any ideas on tools or shortcuts? Has anyone else had to deal with a similar issue?
-
Here's one thought to start the quest: identify whether the migration was done correctly.
For example, if the old site had a page at example.com/mens, did it 301 to newsite.com/mens? If not, you may be facing a lot of issues from a badly planned migration.
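One way to run that check in bulk is sketched below in Python. It assumes you already have a crawl of the old URLs with the status code and `Location` header for each one; the `check_redirect` helper and the sample rows are hypothetical. A "path changed" verdict is only a flag for review, since the new site may have intentionally changed its URL structure.

```python
from urllib.parse import urlsplit


def check_redirect(old_url, status, location, new_host="newsite.com"):
    """Return a short verdict for one crawled old URL."""
    if status != 301:
        return "not a 301"
    target = urlsplit(location)
    if target.hostname != new_host:
        return "wrong host"
    # Compare paths while ignoring a trailing slash.
    old_path = urlsplit(old_url).path.rstrip("/")
    new_path = target.path.rstrip("/")
    if old_path != new_path:
        return "path changed"
    return "ok"


# Made-up crawl rows: (old URL, status code, Location header).
crawl = [
    ("http://example.com/mens", 301, "http://newsite.com/mens"),
    ("http://example.com/womens", 404, ""),
    ("http://example.com/kids", 301, "http://newsite.com/"),
]
verdicts = {old: check_redirect(old, status, loc) for old, status, loc in crawl}
for old, verdict in verdicts.items():
    print(old, "->", verdict)
```

Old URLs that all 301 to the homepage will show up as "path changed", which is usually a sign that the redirects were set up as a blanket rule rather than page by page.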
-
The WMT notion helps. Thank you.
The main concern is really timing. Are there any effective ways of going through thousands of 404 pages and finding valuable redirects?
-
404s are "not found" responses, which are fine if the page really is gone and there isn't a different URL to point the original page to. One big issue could be that the old pages weren't 301'd during the migration, which would result in tons of 404s.
Go through the 404s and see whether they are real issues or just relics of old data. Then you can mark them as fixed in WMT.
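For going through thousands of 404s quickly, a rough first pass could be fuzzy-matching each dead path against the list of live paths. This is only a sketch using Python's standard `difflib` module; the `suggest_redirects` helper, the cutoff, and the sample paths are assumptions, and every suggestion should be reviewed by a person before it becomes a 301.

```python
import difflib


def suggest_redirects(dead_paths, live_paths, cutoff=0.6):
    """Map each dead path to its most similar live path, or None if nothing is close."""
    suggestions = {}
    for dead in dead_paths:
        matches = difflib.get_close_matches(dead, live_paths, n=1, cutoff=cutoff)
        suggestions[dead] = matches[0] if matches else None
    return suggestions


# Made-up paths, only to show the idea.
live = ["/mens/shoes", "/womens/dresses", "/contact"]
dead = ["/mens/shoes.html", "/xyz-123-promo"]
suggestions = suggest_redirects(dead, live)
for path, target in suggestions.items():
    print(path, "->", target)
```

Paths with no suggestion are likely the relics from old data and can simply stay 404.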
Hope that helps