How Best to Handle Inherited 404s on Purchased Domain
-
We purchased a domain from another company and migrated our site over to it successfully. However, we have one artifact of the original domain: a page that was exploited by other sites on the web. This page acted as an open redirect, meaning you could pass it any URL and it would redirect visitors there (e.g. http://example.com/go/to/offsite_link.asp?GoURL=http://badactor.com/explicit_content).
This page does not exist on our site, so those requests always return a 404. However, we find that crawlers are still attempting to access these invalid pages.
We have disavowed as many of the explicit sites as we can, but some crawlers still come looking for those links. We are considering blocking the redirect page in our robots.txt, but we are concerned that the links will remain indexed while being uncrawlable.
What's the best way to pull these pages from search engines and never have them crawled again?
UPDATE: Clarifying that what we're trying to do is get search engines to simply never request these pages again. The fact that they're wasting their time just to get a 404 is what we're trying to avoid. Is there any reason we shouldn't just block these in our robots.txt?
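For concreteness, the robots.txt rule we're weighing would be something along these lines (path taken from the example URL above; the exact pattern would depend on how the old page was requested):

```
User-agent: *
Disallow: /go/to/offsite_link.asp
```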
-
@gastonriera calm down, mate. We have actually tested this and not seen any negative effect on any site we have done it on. It is the "easiest" option, but it won't cause the death and destruction your comment implies. Good day, sir.
-
Hi there,
Crawl efficiency only becomes worth worrying about at around 500k+ URLs. If you have fewer than that, please don't worry.
Having 404s is completely fine, and Google will eventually lower its crawl frequency for those pages.
Blocking them in robots.txt will make Google stop crawling them, but it will never remove them from the index.
My advice here: don't block them in robots.txt. As Rajesh pointed out, you could turn those 404s into 410s to tell Google that they are gone forever. That said, Google has said that they treat 404s and 410s the same.
John Mueller said over a year ago that 4xx status codes don't waste crawl budget. You can check it out in Deepcrawl's Webmasters Hangout notes.
Hope it helps,
Best of luck.
Gaston
-
FOR THE LOVE OF GOD, DON'T REDIRECT 404s TO THE HOMEPAGE!
This is terrible advice. Doing that, you'll turn those 404s into soft 404s, making them more problematic than ever.
-
I would actually recommend redirecting them to the homepage. If you have a WordPress website and a bunch of 404 pages, you can install a free plugin called "All 404 to Homepage" and this will solve the problem. If you have replacement pages or pages covering similar content, however, I would recommend redirecting each 404 to the corresponding replacement page instead.
-
You need to do one thing with those 404 pages: serve them with a 410 status code instead. Redirecting them is not good practice here.
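A minimal sketch of what that looks like on Apache, assuming the legacy path from the question (on nginx, the equivalent would be a `location` block with `return 410;`):

```apache
# Serve "410 Gone" for the legacy open-redirect page (mod_alias).
# The path match ignores any query string (e.g. ?GoURL=...).
Redirect gone /go/to/offsite_link.asp
```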