Broken Links from Open Site Explorer
-
I am trying to find broken internal links within my site. I found a page that was non-existent but had a bunch of internal links pointing to that page, so I ran an Open Site Explorer report for that URL, but it's limited to 25 URLs.
Is there a way to get a report of all of my internal pages that link to this invalid URL? I tried using the link: search modifier in Google, but that returns no results.
-
Whew! Big thread.
Sometimes, when you can't find all the broken links to a page, it's easier simply to 301 redirect the page to a destination of your choice. This helps preserve link equity, both for external links and for those broken internal links you can't find on large sites.
Not sure if this would help in your situation, but I hope you're getting things sorted out!
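For anyone wondering what such a redirect looks like in practice, here is a minimal sketch written as a Python WSGI app. The thread doesn't say what the site runs on, so the WSGI setup, the `REDIRECTS` table, and the `/about-us/media` destination are all assumptions for illustration; many sites would set up the same rule in their web server configuration instead.

```python
# Hypothetical mapping of dead paths to live destinations (illustrative only).
REDIRECTS = {"/about-us/media-birchwood": "/about-us/media"}


def redirect_app(environ, start_response):
    """Answer known dead paths with a permanent (301) redirect."""
    path = environ.get("PATH_INFO", "")
    target = REDIRECTS.get(path)
    if target is None:
        # Paths not in the table get a real 404 status.
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not found"]
    # The Location header tells browsers and crawlers where the page moved.
    start_response("301 Moved Permanently", [("Location", target)])
    return [b""]
```

The point is only that the dead URL answers with a 301 status and a `Location` header, so every old link, found or not, ends up at a working page.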
-
Jesse,
That's where I started my search, but GWMT wasn't showing this link. I can only presume that because the page isn't coming back as a 404 (it shows that "We're Sorry" message instead), they're treating that message as content.
Thanks!
-
Lynn, that was a BIG help. I had been running that report, but was restricted to 25 responses. When I saw your suggestion to filter for only internal links, I was able to see all 127.
Big props. Thanks!
-
One more thing to add - GWMT should report all 404 links and their location/referrer.
-
Oops! I did not know this. Thanks, Irving.
-
Use the word FREE with an asterisk, because Screaming Frog now limits the free version to 500 pages. Xenu is better; even brokenlinkcheck.com lets you spider 3000 pages.
500 pages makes the tool practically worthless for any site of decent size.
-
Indeed if it is not showing a 404, that makes things a bit difficult!
You could try another way, use OSE!
Use the exact page, filter for only internal links, boom 127 pages that link to it. There might be more, but this should get you going!
-
Jesse:
I appreciate your feedback, but am surprised that the ScreamingFrog report found no 404s. SEOmoz found 15 in Roger's last crawl, but those aren't the ones that I'm currently trying to solve.
The problem page is actually showing up as duplicate content, which is kinda screwy. When visiting the page, our normal 404 error doesn't appear (which our developers are still trying to figure out), but instead, an error message appears:
http://www.gallerydirect.com/about-us/media-birchwood
If this were a normal 404 page, we'd probably be able to find the links faster.
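To illustrate the difference, here is a rough Python sketch that separates a real 404 from a page that returns 200 but only shows an error message (often called a "soft 404"). The marker text is an assumption based on the "We're Sorry" message mentioned in this thread, and `classify` and `check_url` are names made up for this example.

```python
import urllib.error
import urllib.request

# Assumed marker text; adjust to whatever the error page actually says.
ERROR_MARKERS = ("we're sorry",)


def classify(status, body, markers=ERROR_MARKERS):
    """Label a response as 'hard_404', 'soft_404', 'ok', or 'other'."""
    if status == 404:
        return "hard_404"
    if status == 200:
        text = body.lower()
        # A 200 status with error wording is what crawlers see as content.
        if any(marker in text for marker in markers):
            return "soft_404"
        return "ok"
    return "other"


def check_url(url):
    """Fetch url and classify the response; needs network access."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            status = resp.status
            body = resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx, so the status code comes from the error.
        status, body = err.code, ""
    return classify(status, body)
```

A page that comes back as `soft_404` would explain why tools that report 404s never list it.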
-
I got tired of the confusion and went ahead and proved it. Not sure if this is the site you wanted results for, but I used the site linked in your profile (www.gallerydirect.com)
took me about 90 seconds and I had a full list... no 404s though
anyway here's a screenshot to prove it:
http://gyazo.com/67b5763e30722a334f3970643798ca62.png
so what's the problem? want me to crawl the fbi site next?
-
I understand. Thing is, there is a way, and the spider doesn't affect anything. Like I said, I have Screaming Frog installed on my computer and I could run a report for your website right now, and you or your IT department would never know it happened. I just don't understand the part where the software doesn't work for you, but to each their own, I suppose.
-
Jesse:
That movie was creepy, but John Goodman was awesome in it.
I started this thread because I was frustrated that OSE restricts my results to 25 links, and I simply wanted to find the rest for that particular URL. I was assuming that there was either:
a. A method for getting the rest of the links that Roger found
b. Another way of pulling these reports from someone who already spiders them (since I can't get any using the link:[URL] in Google and Webmaster Tools isn't showing them).
Thanks to all for your suggestions.
-
run the spider based app from outside their "precious network" then. hell, i could run it right now for you from my computer at work if I wanted. Use your laptop or home computer. It's a simple spider you don't have to be within any network to run it. You could run one for CNN.com if you'd like as well...
-
How else do you expect to trace these broken links without using a "spider?"
Obviously it's the solution. And the programs take up all of like 8 megs... so what's the problem/concern?
I second the screaming frog solution. It will tell you exactly what you need to know and has ZERO risk involved (or whatever it is that's hanging you up). The bazooka comparison is ridiculous, because a bazooka destroys your house. Do you really think a spider crawl will affect your website?
Spiders crawl your site and report findings. This happens often whether you download a simple piece of software or not. What do you think OSE is? Or Google?
I guess what we're saying is if you don't like the answer, then so be it. But that's the answer.
PS - OSE uses a spider to crawl your site...
PPS - Do you suffer from arachnophobia? That movie was friggin awesome now I want to watch old Jeff Daniels films.
PPSS - Do you guys remember John Goodman being in that movie? Wow the late 80s early 90s were really somethin' special.
-
John, I certainly see your point, but our IT guys would not take too kindly to me running a spider-based app from inside their precious network, which is why I was looking for a less intrusive solution.
I'm planning on a campaign to revive "flummoxed" to the everyday lexicon next.
-
Hi Darin,
Both of these programs are made for exactly this kind of job, and they are not huge system-killing programs or anything. Seriously, I use one or both almost every day. I suggest downloading them and seeing how you go; I think you will be happy enough with the results.
-
The way I see it, it's much like missing the last flight home: you can get the bus, which might take a little longer but will get you home that night, or you can wait for the next flight, which happens to be tomorrow evening.
I get the bus each and every time. I get home later than expected, I grant you, but a lot quicker than by waiting for the plane tomorrow.
Bewildered: I didn't realise it had fallen out of use. It's a common word (I think) in Ireland. Oh, and I am still young (ish).
-
John:
Bewildered. There's a good word, and I'm happy to see someone keeping it alive for the younger generations.
I'm not ungrateful for your suggestions, but both involve downloading and installing a spider, which seems like overkill, much like using a bazooka to kill a housefly.
-
I am bewildered by this. I have told you about one piece of free software that will do this for you, and Lynn has told you about another.
Anyway, good luck with however you resolve your issues.
-
Lynn, part of the problem is definitely template-based, and one of our developers is working on that fix now. However, I also found a number of non-template created links to this page simply due to UBD error (an old Cobol programming term meaning User Brain Dead).
I need to find all of the non-template based, UBD links that may have been created and fix them.
-
Xenu will also do a similar job and doesn't have the page limit that I recall the free version of Screaming Frog having: http://home.snafu.de/tilman/xenulink.html
If you have loads of links to this missing page, it sounds like you may have a template problem, with the links getting inserted on every page or on lots of pages. In that case, if you find the point in the template, you will have fixed them all at once (if indeed it is like this).
-
Darin
It's a standalone piece of software you run. It crawls your website, finds broken inbound, outbound, or internal links, and tells you where they are; you go and fix them.
Enter your URL, be it a page or directory, run it, and it will give you all the bad links. And it won't limit you to 25.
You don't need to implement anything ... run the software once, use it, and, well, bin it afterwards if you wish.
But by all means, you can do as you suggest with SE ...
Regards
John
-
John,
While I could look at implementing such a spider to run the check sitewide on a regular basis, I am not looking to go that far at the moment. For right now, I'm only looking for all of the pages on my site that link to a single incorrect URL. I would have to think that there's a solution available for such a limited search.
If I have to, I suppose I can fix the 25 that Open Site Explorer displays, wait a few days for the crawler to run again, then run the report again, fix the next 25, then so on and so on, but that's going to spread the fix out potentially over a number of weeks.
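A limited search like this can be sketched in a few dozen lines of Python using only the standard library. This is a minimal sketch, not a tested tool: `LinkCollector`, `outgoing_links`, and `find_referrers` are names made up for this example, and `fetch` is a function you supply that takes a URL and returns the page HTML (or `None` on failure), so the crawl logic can be tried without touching a live site.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def outgoing_links(page_url, html):
    """Return absolute, fragment-free URLs linked from one page."""
    parser = LinkCollector()
    parser.feed(html)
    return {urldefrag(urljoin(page_url, href)).url for href in parser.hrefs}


def find_referrers(start_url, target_url, fetch, max_pages=1000):
    """Breadth-first crawl of one host; return pages that link to target_url.

    fetch is caller-supplied: URL -> HTML string, or None on failure.
    At most max_pages fetch attempts are made.
    """
    host = urlparse(start_url).netloc
    queue, seen, referrers = deque([start_url]), {start_url}, []
    fetched = 0
    while queue and fetched < max_pages:
        page = queue.popleft()
        fetched += 1
        html = fetch(page)
        if html is None:
            continue
        links = outgoing_links(page, html)
        if target_url in links:
            referrers.append(page)
        for link in links:
            # Stay on the same host and never crawl the dead page itself.
            same_host = urlparse(link).netloc == host
            if same_host and link != target_url and link not in seen:
                seen.add(link)
                queue.append(link)
    return referrers
```

For a real run, `fetch` could wrap `urllib.request.urlopen`. The match on `target_url` is exact, so variants such as a trailing slash or a different scheme would need to be normalised first.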
-
Free tool, non SEO Moz related
http://www.screamingfrog.co.uk/seo-spider/ Run that; it will find all broken links, where they are coming from, etc.
Hope I ain't breaking any rules by posting it.