404s in WMT are old pages, and the referrer links no longer link to them.
-
Within the last 6 days, Google Webmaster Tools has shown a jump in 404s, around 7,000. The 404 pages are old browse pages from a previous platform; we no longer use them or link to them.
I don't know how Google is finding these pages. When I check the referrer links, they are either 404s themselves, or the page exists but the link to the 404 in question is not on the page or in its source code. The sitemap is also often listed as a referrer, but these links are definitely not in our sitemap and haven't been for some time. So it looks to me like the referrer data is outdated. Is that possible?
But somehow these pages are still being found. Any ideas on how I can diagnose the problem and find out how Google is finding them?
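One way to rule the sitemap in or out is to compare the URLs WMT reports against what the live sitemap actually contains. Below is a minimal sketch, assuming the sitemap follows the standard sitemaps.org XML format and that you have already fetched it as a string and exported the 404 URLs from WMT into a list. The function names and example URLs are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org sitemap files.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def urls_in_sitemap(sitemap_xml):
    """Return the set of <loc> values found in a sitemap XML string."""
    root = ET.fromstring(sitemap_xml)
    return {loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text}


def reported_urls_still_listed(sitemap_xml, reported_404s):
    """Return the reported 404 URLs that the current sitemap still lists."""
    listed = urls_in_sitemap(sitemap_xml)
    return sorted(url for url in reported_404s if url in listed)


# Hypothetical sitemap content and WMT export, for illustration only.
sample_sitemap = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>http://www.example.com/new/a</loc></url>"
    "<url><loc>http://www.example.com/old/browse/1</loc></url>"
    "</urlset>"
)
sample_404s = [
    "http://www.example.com/old/browse/1",
    "http://www.example.com/old/browse/2",
]
print(reported_urls_still_listed(sample_sitemap, sample_404s))
```

If this comes back empty for the live sitemap, the sitemap referrer shown in WMT is most likely stale data rather than a current link.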
-
How long ago did you switch platforms? It can take months for Google to come back around to a page that linked to your site, and pages on your site can stay in the cache for a few crawl passes.
When you switched, did you set up any 301 redirects? Examine the backlinks to your domain: any that come from good pages should be redirected to the new URL. If not, they will be scooped up by active SEOs (finding 404 links is a popular link building technique).
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=93633
If you know the links will be dead forever, try using a 410 response as it is supposed to make search engines drop the page faster.
http://www.seroundtable.com/404-410-google-15225.html (bottom)
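The 301-versus-410 choice described above can be sketched as a small lookup. This is only an illustration, assuming you keep a map of old paths that have a new equivalent and a list of path prefixes that are gone for good; the function name, paths, and prefixes are hypothetical, and the real rule would live in your web server or application config.

```python
def legacy_response(path, redirect_map, gone_prefixes):
    """Return (status, location) for a request to an old-platform path.

    301: the old path has a known replacement (keeps backlink value).
    410: the path belongs to a section that was removed permanently.
    404: anything else.
    """
    if path in redirect_map:
        return 301, redirect_map[path]
    if any(path.startswith(prefix) for prefix in gone_prefixes):
        return 410, None
    return 404, None


# Hypothetical data, for illustration only.
redirects = {"/browse/widgets": "/category/widgets"}
gone = ["/browse/"]

print(legacy_response("/browse/widgets", redirects, gone))  # has a replacement
print(legacy_response("/browse/discontinued", redirects, gone))  # removed for good
print(legacy_response("/something-else", redirects, gone))  # unknown path
```

The redirect map is checked first so that an old page with good backlinks is never answered with a 410 just because it sits under a retired prefix.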
Have you requested Google remove old directories/pages? If the content is gone and has no back links, try a removal request.
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=1663427
-
I'm having a similar problem with a new site that was created by copying an old site in its entirety. We went through the trouble of cleaning everything up, removing pages that were no longer relevant, fixing the sitemaps, etc., and now, months later, WMT showed me a spike of 404s for the old pages with the referrers listed as the XML sitemap and the sitemap page... but they are definitely not linked from there. I'm assuming there was some sort of hiccup with Google using an older, cached version of the sitemap to find these links.
I wound up just clearing the errors out of WMT and waiting to see whether Google recrawls the error pages. If it continues to crawl them even though they aren't being linked to, our next course of action will be to 301 them all just in case.
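To see whether the old URLs are still being crawled after clearing the errors, the server access log is a more direct source than WMT. Here is a rough sketch, assuming logs in the common Apache/Nginx "combined" format; the regular expression, function name, and sample lines are illustrative, and matching on the user-agent string alone does not prove a request really came from Google.

```python
import re
from collections import Counter

# Matches the request, status, size, referrer, and user-agent fields
# of a "combined" format access log line (assumed format).
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def googlebot_404_hits(log_lines):
    """Count 404 responses per path for requests whose user agent mentions Googlebot."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match:
            continue
        if match.group("status") == "404" and "Googlebot" in match.group("agent"):
            hits[match.group("path")] += 1
    return hits


# Hypothetical log lines, for illustration only.
sample_lines = [
    '66.249.66.1 - - [10/Oct/2012:13:55:36 +0000] "GET /old/browse/1 HTTP/1.1" '
    '404 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.9 - - [10/Oct/2012:13:56:02 +0000] "GET /old/browse/1 HTTP/1.1" '
    '404 512 "-" "Mozilla/5.0 (Windows NT 6.1) Firefox/15.0"',
]
for path, count in googlebot_404_hits(sample_lines).most_common():
    print(path, count)
```

If the same old paths keep showing up here week after week, that would be the signal to go ahead with the 301s.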