Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
How to find links to 404 pages?
-
I know that I used to be able to do this, but I can't seem to remember.
One of the sites I am working on has had a lot of pages moving around lately. I am sure some links got lost in the fray that I would like to recover, what is the easiest way to see links going to a domain that are pointing to 404 pages?
-
where is that little button next to my crawl warnings that lets me open urls, or explore links to that url using OSE?
-
Specifically in Open Site Explorer, check out the "Top Pages" tab to see if any of your top linked to pages are returning a 404. This tab is actually the first one I look at when running analysis of a site.
-
Sorry for the delay in my answer.
When you have detected all the 404 of your website, you can use the "Explore URL" search in Siteexplorer. If are still existing backlinks to those pages, Yahoo Siteexplorer will show them.
To be sure I just did a try with an 404 of a new client of mine, and just discovered that one 404 page was linked by a Yale University page... obviously I've just made an 301
-
As familiar as I am with Yahoo SiteExplorer I have never used it to find external links that go to pages that are no longer there. How can I do this with that tool?
-
Hello Spencer,
I recommend two tools
1. Xenu link sleuth (http://home.snafu.de/tilman/xenulink.html#Download)
2. Gsitecrawler ( http://gsitecrawler.com/en/download/)
Both will report all the linked pages throwing a 404 error and other status codes including "forbidden request", "no connection", "no such host" and more.
Hope this helps.
Sameer
-
Did you look into the Google Webmaster Tools already? There you can see them as well - of course not all. But you have to check from time to time - they don't show up all together. If you fix some perpaps some more will come up ...
-
Hi Spencer:
I don't know if this qualifies as the easiest way , but it ranks right up there:
-
You can use Open Site Explorer, but i suggest you to widen the discovery using also Yahoo! SiteExplorer
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Best way to link to multiple location pages
I am a Magician and have multiple location pages for each county I cover. I currently have them linked off the menu under locations/ <county>and also in the footer</county> However I have heard that a link from the page is much stronger, so I am experimenting with removing the Menu & Footer link and just linking to these pages from within the content. It's not really a navigation item and most people come in through search to the right page. Am I diluting the link by having it in the Menu/Page and Footer? I read a long time ago that Google only considers the first link to a page and ignores the rest - is that the case? Thanks Roger https://www.rogerlapin.co.uk/
Technical SEO | | Rogerperk0 -
Can you use Screaming Frog to find all instances of relative or absolute linking?
My client wants to pull every instance of an absolute URL on their site so that they can update them for an upcoming migration to HTTPS (the majority of the site uses relative linking). Is there a way to use the extraction tool in Screaming Frog to crawl one page at a time and extract every occurrence of _href="http://" _? I have gone back and forth between using an x-path extractor as well as a regex and have had no luck with either. Ex. X-path: //*[starts-with(@href, “http://”)][1] Ex. Regex: href=\”//
Technical SEO | | Merkle-Impaqt0 -
Getting high priority issue for our xxx.com and xxx.com/home as duplicate pages and duplicate page titles can't seem to find anything that needs to be corrected, what might I be missing?
I am getting high priority issue for our xxx.com and xxx.com/home as reporting both duplicate pages and duplicate page titles on crawl results, I can't seem to find anything that needs to be corrected, what am I be missing? Has anyone else had a similar issue, how was it corrected?
Technical SEO | | tgwebmaster0 -
Is it good to redirect million of pages on a single page?
My site has 10 lakh approx. genuine urls. But due to some unidentified bugs site has created irrelevant urls 10 million approx. Since we don’t know the origin of these non-relevant links, we want to redirect or remove all these urls. Please suggest is it good to redirect such a high number urls to home page or to throw 404 for these pages. Or any other suggestions to solve this issue.
Technical SEO | | vivekrathore0 -
Should I disavow links from pages that don't exist any more
Hi. Im doing a backlinks audit to two sites, one with 48k and the other with 2M backlinks. Both are very old sites and both have tons of backlinks from old pages and websites that don't exist any more, but these backlinks still exist in the Majestic Historic index. I cleaned up the obvious useless links and passed the rest through Screaming Frog to check if those old pages/sites even exist. There are tons of link sending pages that return a 0, 301, 302, 307, 404 etc errors. Should I consider all of these pages as being bad backlinks and add them to the disavow file? Just a clarification, Im not talking about l301-ing a backlink to a new target page. Im talking about the origin page generating an error at ping eg: originpage.com/page-gone sends me a link to mysite.com/product1. Screamingfrog pings originpage.com/page-gone, and returns a Status error. Do I add the originpage.com/page-gone in the disavow file or not? Hope Im making sense 🙂
Technical SEO | | IgorMateski0 -
Correct linking to the /index of a site and subfolders: what's the best practice? link to: domain.com/ or domain.com/index.html ?
Dear all, starting with my .htaccess file: RewriteEngine On
Technical SEO | | inlinear
RewriteCond %{HTTP_HOST} ^www.inlinear.com$ [NC]
RewriteRule ^(.*)$ http://inlinear.com/$1 [R=301,L] RewriteCond %{THE_REQUEST} ^./index.html
RewriteRule ^(.)index.html$ http://inlinear.com/ [R=301,L] 1. I redirect all URL-requests with www. to the non www-version...
2. all requests with "index.html" will be redirected to "domain.com/" My questions are: A) When linking from a page to my frontpage (home) the best practice is?: "http://domain.com/" the best and NOT: "http://domain.com/index.php" B) When linking to the index of a subfolder "http://domain.com/products/index.php" I should link also to: "http://domain.com/products/" and not put also the index.php..., right? C) When I define the canonical ULR, should I also define it just: "http://domain.com/products/" or in this case I should link to the definite file: "http://domain.com/products**/index.php**" Is A) B) the best practice? and C) ? Thanks for all replies! 🙂
Holger0 -
How Does Google's "index" find the location of pages in the "page directory" to return?
This is my understanding of how Google's search works, and I am unsure about one thing in specific: Google continuously crawls websites and stores each page it finds (let's call it "page directory") Google's "page directory" is a cache so it isn't the "live" version of the page Google has separate storage called "the index" which contains all the keywords searched. These keywords in "the index" point to the pages in the "page directory" that contain the same keywords. When someone searches a keyword, that keyword is accessed in the "index" and returns all relevant pages in the "page directory" These returned pages are given ranks based on the algorithm The one part I'm unsure of is how Google's "index" knows the location of relevant pages in the "page directory". The keyword entries in the "index" point to the "page directory" somehow. I'm thinking each page has a url in the "page directory", and the entries in the "index" contain these urls. Since Google's "page directory" is a cache, would the urls be the same as the live website (and would the keywords in the "index" point to these urls)? For example if webpage is found at wwww.website.com/page1, would the "page directory" store this page under that url in Google's cache? The reason I want to discuss this is to know the effects of changing a pages url by understanding how the search process works better.
Technical SEO | | reidsteven750 -
Miss meta description on 404 page
Hi, My 404 page did not have meta description. Is it an error? Because I run report and seomoz said that a problem. Thanks!
Technical SEO | | JohnHuynh0