Thousands of 404s
-
Hi there,
I'm working on a site that has a ridiculous number of 404s being reported in Webmaster Tools. We believe this was caused by an on-page error that was amending the URLs and adding folders that shouldn't have been there, in a big spiral, i.e. /salons/uk/teeth became something like /salons/uk/teeth/salons/edinburgh/hair/teeth...
Anyway, we think the issue is now sorted, but it seems these pages were indexed, so it looks like Google is still looking for them when it crawls the site. What's my best move? It's the sheer volume (over 13,000) that has me concerned, so I thought it best to seek some expert advice before continuing.
Thanks in advance!
-
As it's all sorted now, I really wouldn't worry about them too much. You can use the URL removal functionality in WMT, but this is a manual process, so I wouldn't do this. If I were in your position, I'd probably just let the pages keep returning 404s. After a while, Google will usually stop trying to recrawl the 404 pages. Right now it is probably recrawling them in case the 404 was an accident.
If it's causing a bandwidth problem, you can solve it with a robots.txt rule, as suggested earlier.
-
Hi Philip!
If these URLs are already indexed, you should 301 redirect them to the right URLs (in case any of them happen to have inbound links). You could also try the URL removal tool from Google (see https://support.google.com/webmasters/answer/1663416) if all you want is to get rid of them.
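As a rough sketch of how the redirect targets could be worked out in bulk: the snippet below assumes (this is my assumption, not something confirmed in the thread) that a valid path contains the root folder `salons` only once, so everything from its second appearance onward is spiral debris. The names `ROOT_SEGMENT` and `canonical_path` are made up for illustration.

```python
from typing import Optional

# Assumption: valid URLs contain this folder exactly once.
ROOT_SEGMENT = "salons"


def canonical_path(path: str, root: str = ROOT_SEGMENT) -> Optional[str]:
    """Return the likely 301 target for a spiraled path, or None if the path looks fine."""
    segments = [s for s in path.split("/") if s]
    # Positions at which the root folder shows up in the path.
    repeats = [i for i, s in enumerate(segments) if s == root]
    if len(repeats) < 2:
        return None  # root folder appears at most once: nothing to redirect
    # Keep everything before the second appearance of the root folder.
    return "/" + "/".join(segments[: repeats[1]])


print(canonical_path("/salons/uk/teeth/salons/edinburgh/hair/teeth"))
print(canonical_path("/salons/uk/teeth"))
```

Run over the list of 404s exported from Webmaster Tools, this would give a mapping from each broken URL to a candidate target, which could then be turned into redirect rules for whatever server the site runs on.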
Good luck, hope this helps.
//Anders
-
Hi Philip,
If all the URLs share the same pattern, I would try adding that pattern to robots.txt to prevent Google from crawling the pages. Even better would be to add noindex tags to the pages, if you can.
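Before putting a pattern live, it may be worth checking it against a few known good and bad paths. Here is a minimal sketch, assuming a hypothetical rule `Disallow: /salons/*/salons/` and Google-style matching, where `*` matches any run of characters and a trailing `$` anchors the end of the path. The helper names are invented for this example, and it is a simplified matcher rather than a full robots.txt parser.

```python
import re
from typing import Iterable, Pattern


def robots_pattern_to_regex(pattern: str) -> Pattern[str]:
    """Translate a robots.txt path pattern ('*' wildcard, optional trailing '$') to a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal parts and let each '*' match any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(path: str, disallow_patterns: Iterable[str]) -> bool:
    """True if any Disallow pattern matches the path from its start."""
    return any(robots_pattern_to_regex(p).match(path) for p in disallow_patterns)


# Hypothetical rule: block any path where the 'salons' folder appears twice.
DISALLOW = ["/salons/*/salons/"]

print(is_blocked("/salons/uk/teeth/salons/edinburgh/hair/teeth", DISALLOW))
print(is_blocked("/salons/uk/teeth", DISALLOW))
```

One thing to keep in mind: Google has to crawl a page to see a noindex tag on it, so a URL that is blocked in robots.txt cannot also be noindexed that way. It is one or the other for any given URL.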
Hope this helps!