Thousands of 404 Pages Indexed - Recommendations?
-
Background: I have a newly acquired client who has had a lot of issues over the past few months.
What happened is that he had a major issue with broken dynamic URLs that would start infinite loops due to redirects and relative links. His previous SEO didn't pay attention to the sitemaps created by a backend generator, which caused hundreds of thousands of useless pages to be indexed.
These useless pages all served a 404 page that didn't return a 404 server response (it returned a 200), which created a ton of duplicate content and bad links (relative linking).
Now here I am, cleaning up this mess. I've fixed the 404 page so it returns a 404 server response. Google Webmaster Tools is now reporting thousands of "not found" errors, which is a great start. I fixed all the site errors that caused infinite redirects, then cleaned up the sitemap and submitted it.
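For anyone wanting to double-check the same kind of fix, here is a minimal sketch of how the status codes could be spot-checked. It assumes Python's standard library only; the names `NoRedirect`, `fetch_status`, and `classify` are made up for illustration, and some servers may not answer HEAD requests the same way they answer GET.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so 301/302 codes stay visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError carrying the 3xx code


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the raw HTTP status code for url without following redirects."""
    opener = urllib.request.build_opener(NoRedirect)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 3xx, 4xx and 5xx responses arrive here; the code is what matters.
        return err.code


def classify(status: int) -> str:
    """Label the status code of a URL that is supposed to be dead."""
    if status in (404, 410):
        return "gone"
    if 300 <= status < 400:
        return "redirect"
    if 200 <= status < 300:
        return "soft 404 suspect"  # page claims success but should not exist
    return "other"
```

Running `classify(fetch_status(url))` over a sample of the dead URLs should print "gone" for each one; anything labelled "soft 404 suspect" is still answering with a 200.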
When I search site:www.(domainname).com I am still getting an insane number of pages that no longer exist.
My question: How does Google handle all of these 404s? My client wants all the bad pages removed now, but I don't have much control over that. Getting Google to remove pages that return a 404 is a slow process, and he is still dropping in rankings.
Is there a way of speeding up the process? It's not reasonable to enter tens of thousands of pages into the URL Removal Tool.
I want to clean house and have Google just index the pages in the sitemap.
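To show what I mean by "just the pages in the sitemap", here is a rough sketch that compares a list of indexed URLs against the sitemap. It assumes the indexed URLs have already been exported somehow (for example from a crawl error report or server logs) and that the sitemap follows the standard sitemaps.org format; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by the standard sitemaps.org <urlset> format.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(xml_text: str) -> set:
    """Return every <loc> URL listed in a sitemap document."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc") if loc.text}


def not_in_sitemap(indexed_urls, xml_text: str) -> list:
    """Return the indexed URLs that the sitemap does not list, sorted."""
    wanted = sitemap_urls(xml_text)
    return sorted(url for url in set(indexed_urls) if url not in wanted)
```

Whatever `not_in_sitemap` returns is the set of URLs that should end up answering with a 404 or a 301.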
-
Yeah, all of the 301s are done, but I am trying to get around submitting tens of thousands of URLs to the URL removal tool.
-
Make sure you pay special attention to implementing the correct rel canonical. When rel canonical was first introduced, we wanted to be a little careful. We didn’t want to open it up for potential abuse so you could only use rel canonical within one domain. The only exception to that was you could do between IP addresses and domains.
But over time we didn’t see people abusing it a lot and if you think about it, if some evil malicious hacker has hacked your website and he’s going to do something to you he’s probably going to put some malware on the page or do a 301 redirect. He’s probably not patient enough to add a rel canonical and then wait for it to be re-crawled and re-indexed and all that sort of stuff.
So we sort of saw that there didn’t seem to be a lot of abuse. Most webmasters use rel canonical in really smart ways. We didn’t see a lot of people accidentally shooting themselves in the foot, which is something we do have to worry about and so a little while after rel canonical was introduced we added the ability to do cross domain rel canonical.
It basically works essentially like a 301 redirect. If you can do a 301 redirect that is still preferred because every search engine knows how to handle those and new search engines will know how to process 301s and permanent redirects.
But we do take a rel canonical and if it’s on one domain and points to another domain we will typically honor that. We always reserve the right to sort of hold back if we think that the webmaster is doing something wrong or making a mistake but in general we will almost always abide by that.
Hope that helps.
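If you want to confirm which canonical a page actually declares, something like the following could work. It is only a sketch using Python's built-in `html.parser`; `CanonicalFinder` and `find_canonical` are names invented for this example, and it reads the first `<link rel="canonical">` it finds without judging whether the target is the right one.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated values, in any letter case.
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr.get("href"):
            self.canonical = attr["href"]


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical
```

Comparing the returned value against the URL you expected makes it easy to spot pages whose canonical points at the wrong page or at another domain by accident.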
I have a client who unfortunately had a dispute with her prior IT person, and that person made a mess of the site. It is not the quickest thing, and I do agree 301 redirects are by far the quickest way to go about it. If you're getting 404 errors on pages that are still passing link juice, you're going to want to redirect those URLs scattered about the website to the most relevant page.
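One possible way to pick "the most relevant page" in bulk is to fall back to the closest parent path that still exists. This is just a heuristic I am assuming for illustration, not an established rule, and `redirect_target` is a hypothetical helper; it expects the live paths to be written without trailing slashes.

```python
from urllib.parse import urlsplit


def redirect_target(dead_url: str, live_paths: set) -> str:
    """Pick the closest live parent path for a dead URL, else the home page."""
    segments = [part for part in urlsplit(dead_url).path.split("/") if part]
    while segments:
        segments.pop()  # drop the last segment and try the parent path
        candidate = "/" + "/".join(segments)
        if candidate in live_paths:
            return candidate
    return "/"  # nothing closer exists, so fall back to the home page
```

The resulting pairs of dead URL and target can then be turned into 301 rules by hand, after a quick review to make sure each target really is relevant.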
http://jamesmartell.com/matt-cutts/how-does-google-handle-not-found-pages-that-do-not-return-a-404/
http://www.seroundtable.com/404-links-google-15427.html
http://support.google.com/customsearch/bin/topic.py?hl=en&topic=11493&parent=1723950&ctx=topic
https://developers.google.com/custom-search/docs/indexing
https://developers.google.com/custom-search/docs/api
I hope I was of help to you,
Thomas
-
Have you redirected (301) to appropriate landing pages? After redirecting, use the URL removal tool. It works great for me: it shows results within 24 hours and removes all the URLs I have submitted from the Google index.