Any tools for scraping blogroll URLs from sites?
-
This question is entirely in the whitehat realm...
Let's say you've encountered a great blog - with a strong blogroll of 40 sites.
The 40-site blogroll is interesting to you for any number of reasons, from link building targets to simply subscribing in your feedreader. Right now, it's tedious to extract the URLs from the site. There are some "save all links" tools, but they are also messy.
Are there any good tools that will
a) allow you to grab the blogroll (only) of any site into a list of URLs (yeah, ok, it might not be perfect since some sites call it "sites I like" etc.)
b) same, but export as OPML so you can subscribe.
Thanks!
Scott
-
Not at all. I guess my feeling here is that there is a sort of untapped social graph defined by blogrolls. If it were simple to harvest them upon visiting a blog (e.g. this blogger recommends...) one could do a stumble-on-steroids approach to a niche.
-
I thought you might be able to use the outbound link scraper to grab the outbound links on the page. Pop in the URLs of the pages you want to scrape and it will spit out a list of those domains and URLs. You can take those URLs and put them into the contact finder, and it will return the contact details for those sites. Combine the two spreadsheets for an epic list of blogs to contact for your outreach.
This is obviously for link building rather than subscribing. Sorry if I have misunderstood what you were trying to do.
-
Hi Keri,
That is a very cool tool, but it is overkill for this. It takes far too many steps to accomplish only part of the desired goal: grabbing all blogroll URLs (within the blogroll DIV tag) and exporting the list to a valid OPML file or URL list.
Thanks!
-
Nothing I saw there would do this. It looks like it could manage to list all external links, and I suppose you could manually pick the blogroll out of that list.
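For what it's worth, picking the blogroll out by hand could be scripted if the blog wraps it in a recognizable container. Below is a minimal sketch, not a finished tool, using only the Python standard library. It assumes the container's id or class contains the word "blogroll" (many themes differ, hence the `marker` parameter) and that the markup is reasonably well formed. The names `BlogrollParser` and `extract_blogroll` are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Void elements never get a closing tag, so they must not affect depth tracking.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class BlogrollParser(HTMLParser):
    """Collect external links inside a container whose id or class
    contains a marker such as "blogroll" (the marker is an assumption)."""

    def __init__(self, base_url, marker="blogroll"):
        super().__init__()
        self.base_url = base_url
        self.base_host = urlparse(base_url).netloc
        self.marker = marker
        self.depth = 0  # greater than 0 while inside the blogroll container
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self.depth:
            if tag not in VOID_TAGS:
                self.depth += 1
        else:
            label = f"{attrs.get('id') or ''} {attrs.get('class') or ''}".lower()
            if self.marker in label and tag not in VOID_TAGS:
                self.depth = 1
        if self.depth and tag == "a" and attrs.get("href"):
            url = urljoin(self.base_url, attrs["href"])
            parts = urlparse(url)
            is_external = parts.netloc != self.base_host
            if parts.scheme in ("http", "https") and is_external and url not in self.urls:
                self.urls.append(url)

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1


def extract_blogroll(html, base_url, marker="blogroll"):
    """Return external URLs found inside the marked container, in page order."""
    parser = BlogrollParser(base_url, marker)
    parser.feed(html)
    return parser.urls
```

You would fetch the page yourself and pass the HTML in, for example `extract_blogroll(page_html, "http://example.com/")`, and for a site that calls it "sites I like" you would pass a different `marker`. Pages with unclosed tags inside the blogroll can throw off the depth counting, so treat the output as a first pass to review.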
-
Hi there,
Well, Keri's response reminded me of this question and of the fact that I found a tool for scraping these kinds of lists.
Here it is (with some other cool tools), have fun:
-
Hi Scott,
I'm going through older questions. Did you ever find a tool to do what you wanted to do here?
-
One thing to look at is Outwit Hub for Firefox. It might be able to help with that. It can scrape data from a page and do a lot with it. http://www.outwit.com/products/hub/. Don't know that it meets all of your needs, but I also haven't seen a response with anything better at the moment.
-
Hey Scott,
What a great question and, sigh, I don't have the answer. I am going to check back to find out what people come up with here. Surely there is someone who lurks in these parts who can throw something together?
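In that spirit, the OPML half of the original question is not much code. Here is a rough sketch, assuming you already have a list of blog URLs from somewhere; the function name `urls_to_opml` is hypothetical. One caveat: feed readers subscribe through an `xmlUrl` attribute that points at the actual RSS or Atom feed, and this sketch only records each blog's `htmlUrl`. Finding the feed for each blog is a separate step that is not covered here.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def urls_to_opml(urls, title="Blogroll"):
    """Build an OPML 2.0 document (as a string) with one outline per URL.

    Only htmlUrl is set; a feed reader would also need xmlUrl,
    which this sketch does not attempt to discover.
    """
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for url in urls:
        # Use the host name as the display text, falling back to the raw URL.
        label = urlparse(url).netloc or url
        ET.SubElement(body, "outline", text=label, htmlUrl=url)
    return ET.tostring(opml, encoding="unicode")
```

Writing the returned string to a file such as `blogroll.opml` gives you something to extend with `xmlUrl` values once the feeds are known. ElementTree takes care of escaping characters like `&` in the URLs.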