Find all external 404 errors/links?
-
Hi All,
We recently discovered that a site was linking to ours with an incorrect URL, resulting in a 404 error. We only found this by pure chance, and wondered whether there is a tool out there that will tell us when a site is linking to an incorrect URL on our site?
Thanks
-
If you don't have access to the logs, that could be an issue. There aren't really any automated tools out there, as one would need to crawl every website and find 404 errors.
I haven't tried this, so it's just an idea. Go into GSC and download all the links pointing to your site (and do the same from places like Moz, Ahrefs and Majestic), then chuck that list of URLs into Screaming Frog or URL Profiler, look at the external links, and see if any are returning a 404. Not sure if this would work; it's just an idea.
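If you'd rather script that check than run it through a crawler, a rough sketch in Python might look like the following. It assumes the backlink export has been loaded as rows with `source_url` and `target_url` keys; those column names, and the function names, are made up for illustration and will differ between GSC, Moz, Ahrefs and Majestic exports.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the request fails outright."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "broken-backlink-check"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses are raised as exceptions; the code is what we want.
        return err.code
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failure, timeout, malformed URL and so on.
        return None


def find_broken_targets(rows, status_of=fetch_status):
    """Return (source_url, target_url) pairs whose target answers with a 404.

    rows: iterable of dicts with 'source_url' and 'target_url' keys
    (hypothetical column names; rename to match your export).
    """
    status_cache = {}  # many backlinks point at the same page; check each once
    broken = []
    for row in rows:
        target = row["target_url"]
        if target not in status_cache:
            status_cache[target] = status_of(target)
        if status_cache[target] == 404:
            broken.append((row["source_url"], target))
    return broken


# Example usage with a CSV export (file name is an assumption):
#   import csv
#   with open("backlinks.csv", newline="") as handle:
#       for source, target in find_broken_targets(csv.DictReader(handle)):
#           print(source, "->", target)
```

Some servers answer HEAD requests differently from GET, so it is worth spot-checking a few results in a browser before setting up redirects.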
Thanks
Andy
-
Great, will take a look. Maybe I'll run a trial to see if it does exactly what I need.
Thanks for the info!
-
Good idea!
Some of the clients we do SEO for don't host their websites on our server, though, so we don't have access to their server logs.
I was hoping for an automated dashboard in something like Moz, Screaming Frog or Ahrefs, as mentioned above. Given the number of clients we have, opening up and running through all their log files could be time-consuming.
Cheers for the info though, it may come in useful in the future, or to someone else reading this.
-
Hi
The best way I have found is to look in your server logs. It's the only true place to find out what Google is doing on your site.
Download the logs and look at all the 404 errors. It's quite simple and, depending on the size of your logs, can take around 5 minutes' worth of work. The longer the time period you can analyse in your logs, the better.
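As a rough sketch of that log check in Python: this assumes the server writes the common Apache/Nginx "combined" log format, and it goes one step beyond the advice above by grouping the 404s by referrer so you can see which external page sent the visitor. The function name and the `own_host` parameter are made up for illustration. Referrers are typically only present when a person clicks a link, so bot hits without one are skipped here.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Combined log format: host ident user [time] "request" status bytes "referrer" "agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)"'
)


def external_404s(lines, own_host):
    """Count 404 hits per (requested path, referrer) where the referrer is another site.

    own_host: your bare domain, e.g. "mysite.example"; subdomains of it
    (such as www) are treated as internal.
    """
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or match["status"] != "404":
            continue
        referrer_host = urlsplit(match["referrer"]).hostname
        if not referrer_host:
            continue  # "-" or empty referrer: no linking page recorded
        if referrer_host == own_host or referrer_host.endswith("." + own_host):
            continue  # broken internal link, not an external one
        counts[(match["path"], match["referrer"])] += 1
    return counts


# Example usage (log path is an assumption):
#   with open("access.log", encoding="utf-8", errors="replace") as handle:
#       for (path, referrer), hits in external_404s(handle, "mysite.example").most_common():
#           print(hits, path, "<-", referrer)
```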
Thanks
Andy
-
Hi David.
Ahrefs.com offers that service: broken links.
Another way to do that search: download the historic backlinks list and, with a mass checker, check where those links point nowadays. I've used GScraper and its option to crawl outbound links.
Best of luck.
GR.