Thousands of 404 Pages Indexed - Recommendations?
-
Background: I have a newly acquired client who has had a lot of issues over the past few months.
He had a major issue with broken dynamic URLs that fell into infinite loops because of redirects and relative links. His previous SEO didn't pay attention to the sitemaps created by a backend generator, which caused hundreds of thousands of useless pages to be indexed.
These useless pages all displayed a 404 page that returned a 200 server response instead of a 404, which created a ton of duplicate content and bad links (relative linking).
Now here I am, cleaning up this mess. I've fixed the 404 page so it returns a 404 server response. Google Webmaster Tools is now reporting thousands of "not found" errors, which is a great start. I fixed all the site errors that caused infinite redirects, and I cleaned up the sitemap and submitted it.
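For anyone wanting to spot-check that kind of fix, here is a minimal sketch in Python of how the status code for a dead URL could be verified. The function names `fetch_status` and `classify` are made up for illustration, and it assumes the server answers HEAD requests the same way it answers GET, which is not guaranteed.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code the server sends for `url`."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the code is still the answer we want.
        return err.code


def classify(status):
    """Label a status code for a URL that is known to no longer exist."""
    if status in (404, 410):
        return "ok: reported as gone"
    if 200 <= status < 300:
        return "soft 404: missing page served as success"
    if 300 <= status < 400:
        # urlopen follows redirects itself, so a 3xx here usually means
        # it gave up on the chain (for example, a redirect loop).
        return "redirect: check the target"
    return "other: investigate"


# Usage (hypothetical URL, needs network access):
#   print(classify(fetch_status("https://www.example.com/no-such-page")))
```

Running this over a sample of the bad URLs should show whether any of them still come back as a 200.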
When I search site:www.(domainname).com I am still getting an insane number of pages that no longer exist.
My question: How does Google handle all of these 404s? My client wants all the bad pages removed now, but I don't have much control over that. Getting Google to remove pages that return a 404 is a slow process, and he is still dropping in rankings.
Is there a way of speeding up the process? It's not reasonable to enter tens of thousands of pages into the URL Removal Tool.
I want to clean house and have Google just index the pages in the sitemap.
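To make the "only the pages in the sitemap" goal concrete, here is a rough Python sketch that lists known URLs missing from the sitemap. It assumes you already have a list of URLs from a crawl export or server logs (the `crawled` list below is invented), and `sitemap_urls` and `stale_urls` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the set of <loc> values from a sitemap document."""
    root = ET.fromstring(xml_text)
    return {
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    }


def stale_urls(known_urls, xml_text):
    """URLs that were seen (logs, crawl export) but are not in the sitemap."""
    return sorted(set(known_urls) - sitemap_urls(xml_text))


EXAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>https://www.example.com/products</loc></url>
</urlset>"""

# Invented sample data standing in for a crawl export.
crawled = [
    "https://www.example.com/",
    "https://www.example.com/products",
    "https://www.example.com/products/products/products",
]

print(stale_urls(crawled, EXAMPLE_SITEMAP))
```

The resulting list is the set of URLs that should end up returning a 404 or a 301.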
-
Yeah, all of the 301s are done, but I am trying to get around submitting tens of thousands of URLs to the URL removal tool.
-
Make sure you pay special attention to implementing the correct rel canonical.
When rel canonical was first introduced we wanted to be a little careful. We didn’t want to open it up for potential abuse so you could only use rel canonical within one domain. The only exception to that was you could do between IP addresses and domains.
But over time we didn’t see people abusing it a lot and if you think about it, if some evil malicious hacker has hacked your website and he’s going to do something to you he’s probably going to put some malware on the page or do a 301 redirect. He’s probably not patient enough to add a rel canonical and then wait for it to be re-crawled and re-indexed and all that sort of stuff.
So we sort of saw that there didn’t seem to be a lot of abuse. Most webmasters use rel canonical in really smart ways. We didn’t see a lot of people accidentally shooting themselves in the foot, which is something we do have to worry about and so a little while after rel canonical was introduced we added the ability to do cross domain rel canonical.
It basically works essentially like a 301 redirect. If you can do a 301 redirect that is still preferred because every search engine knows how to handle those and new search engines will know how to process 301s and permanent redirects.
But we do take a rel canonical and if it’s on one domain and points to another domain we will typically honor that. We always reserve the right to sort of hold back if we think that the webmaster is doing something wrong or making a mistake but in general we will almost always abide by that.
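If you want to audit which pages carry a rel canonical and whether it points at another domain, something along these lines could work. It is only a sketch using Python's standard library; `CanonicalFinder`, `canonical_of`, and `is_cross_domain` are names invented here, and the sample HTML and URLs are placeholders.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> in a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel can hold several space-separated values.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def canonical_of(html):
    """Return the canonical URL declared in `html`, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


def is_cross_domain(page_url, canonical_url):
    """True when the canonical points at a different hostname."""
    return urlsplit(page_url).hostname != urlsplit(canonical_url).hostname


SAMPLE_HTML = (
    '<html><head>'
    '<link rel="canonical" href="https://www.example.org/page">'
    '</head><body></body></html>'
)

target = canonical_of(SAMPLE_HTML)
print(target, is_cross_domain("https://www.example.com/page", target))
```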
Hope that helps.
I have a client who unfortunately had a dispute with her prior IT person, and that person made a mess of the site. Cleanup is not the quickest thing, and I do agree 301 redirects are by far the quickest way to go about it. If you're getting 404 errors and the site is passing link juice, you're going to want to redirect those 404s scattered about the website to the most relevant page.
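One way to pick a redirect target at scale is sketched below in Python. It treats the closest surviving parent path as the "most relevant page", which is purely an assumption for illustration; real relevance usually needs a manual mapping. The helper name `nearest_live_parent` is made up, and `live_paths` is assumed to hold normalized paths without trailing slashes.

```python
from urllib.parse import urlsplit


def nearest_live_parent(dead_url, live_paths):
    """Pick a 301 target by walking up the dead URL's path until a live page matches.

    Falls back to the homepage ("/") when no parent path is live.
    """
    segments = [s for s in urlsplit(dead_url).path.split("/") if s]
    while segments:
        segments.pop()  # drop the deepest segment and try the parent
        candidate = "/" + "/".join(segments)
        if candidate in live_paths:
            return candidate
    return "/"


# Invented example data.
live = {"/", "/shoes", "/hats"}
print(nearest_live_parent("https://www.example.com/shoes/red/old-item", live))
```

The output of a pass like this is best treated as a draft redirect map to review by hand before any 301s go live.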
http://jamesmartell.com/matt-cutts/how-does-google-handle-not-found-pages-that-do-not-return-a-404/
http://www.seroundtable.com/404-links-google-15427.html
http://support.google.com/customsearch/bin/topic.py?hl=en&topic=11493&parent=1723950&ctx=topic
https://developers.google.com/custom-search/docs/indexing
https://developers.google.com/custom-search/docs/api
I hope I was of help to you,
Thomas
-
Have you 301 redirected the URLs to appropriate landing pages? After redirecting, use the URL removal tool. It works great for me: it shows results within 24 hours and removes all the URLs I submit from Google's index.