Any tools for scraping blogroll URLs from sites?
-
This question is entirely in the whitehat realm...
Let's say you've encountered a great blog - with a strong blogroll of 40 sites.
The 40-site blogroll is interesting to you for any number of reasons, from link building targets to simply subscribing in your feedreader. Right now, it's tedious to extract the URLs from the site. There are some "save all links" tools, but they are also messy.
Are there any good tools that will
a) allow you to grab the blogroll (only) of any site into a list of URLs (yeah, ok, it might not be perfect since some sites call it "sites I like" etc.)
b) same, but export as OPML so you can subscribe.
Thanks!
Scott
-
Not at all. I guess my feeling here is that there is a sort of untapped social graph defined by blogrolls. If it were simple to harvest them upon visiting a blog (e.g. this blogger recommends...) one could do a stumble-on-steroids approach to a niche.
-
I thought you might be able to use the outbound link scraper to grab the outbound links on the page. Pop in the URLs of the pages you want to scrape and it will spit out a list of those domains and URLs. You can take those URLs and put them into the contact finder, and it will return the contact details for those sites. Combine the two spreadsheets for an epic list of blogs to contact for your outreach.
This is obviously for link building rather than subscribing. Sorry if I have misunderstood what you were trying to do.
-
Hi Keri,
That is a very cool tool, but it is overkill for this. It takes far too many steps to accomplish only part of the desired goal: grabbing all blogroll URLs (within the blogroll DIV tag) and exporting the list to a valid OPML file or URL list.
thanks!
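For anyone who wants to throw something together, here is a minimal sketch of that goal in Python, using only the standard library. It assumes the blogroll lives in a container whose `class` or `id` contains the word "blogroll", which will not hold for every site. The names `BlogrollParser`, `extract_blogroll`, and `to_opml` are made up for this example, and the sample HTML and URLs are placeholders. The OPML it writes lists each site as a `type="link"` outline; a feed reader will generally want an `xmlUrl` pointing at each site's feed, and discovering those feeds is not shown here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
import xml.etree.ElementTree as ET


class BlogrollParser(HTMLParser):
    """Collect (title, url) pairs from links inside a blogroll container."""

    def __init__(self, base_url, marker="blogroll"):
        super().__init__()
        self.base_url = base_url
        self.marker = marker        # assumed substring of the class or id
        self.container_tag = None   # tag name of the container we are inside
        self.depth = 0              # nesting depth of that same tag name
        self.current = None         # [url, text parts] for the open <a>
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self.container_tag is None:
            label = f"{attrs.get('class') or ''} {attrs.get('id') or ''}"
            if self.marker in label.lower():
                self.container_tag, self.depth = tag, 1
            return
        if tag == self.container_tag:
            self.depth += 1
        elif tag == "a" and attrs.get("href"):
            # Resolve relative links against the page URL.
            self.current = [urljoin(self.base_url, attrs["href"]), []]

    def handle_data(self, data):
        if self.current is not None:
            self.current[1].append(data)

    def handle_endtag(self, tag):
        if self.container_tag is None:
            return
        if tag == "a" and self.current is not None:
            url, parts = self.current
            # Fall back to the URL when the link has no visible text.
            self.links.append(("".join(parts).strip() or url, url))
            self.current = None
        elif tag == self.container_tag:
            self.depth -= 1
            if self.depth == 0:
                self.container_tag = None


def extract_blogroll(html, base_url):
    parser = BlogrollParser(base_url)
    parser.feed(html)
    return parser.links


def to_opml(links, title="Blogroll"):
    """Serialize (title, url) pairs as an OPML 2.0 list of link outlines."""
    root = ET.Element("opml", version="2.0")
    head = ET.SubElement(root, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(root, "body")
    for text, url in links:
        ET.SubElement(body, "outline", text=text, type="link", url=url)
    return ET.tostring(root, encoding="unicode")


# Placeholder page: one navigation link plus a two-site blogroll.
sample = """
<div id="sidebar">
  <a href="/about">About</a>
  <ul class="blogroll">
    <li><a href="http://example.com/">Example Blog</a></li>
    <li><a href="http://example.org/blog">Another Blog</a></li>
  </ul>
</div>
"""
links = extract_blogroll(sample, "http://myblog.example/")
for title, url in links:
    print(url)
print(to_opml(links))
```

Sites that label the section "sites I like" or similar would need a different `marker`, or a manual pick of the right container.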
-
Nothing I saw there would do this. It looks like it could manage to list all external links, and I suppose you could manually pick the blogroll out of that list.
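As a rough illustration of the "list all external links" fallback, here is a small Python sketch using only the standard library. It treats a link as external when its host differs from the page's host (ignoring a leading "www."), which is an assumption and not how any particular tool defines it. `LinkCollector` and `external_links` are hypothetical names, and the URLs are placeholders.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Gather the href of every <a> tag on the page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def _host(url):
    # Compare hosts without a leading "www." (a simplifying assumption).
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def external_links(html, page_url):
    """Return unique http(s) links that point away from page_url's host."""
    parser = LinkCollector()
    parser.feed(html)
    seen, result = set(), []
    for href in parser.hrefs:
        url = urljoin(page_url, href)
        if urlparse(url).scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, and similar
        if _host(url) != _host(page_url) and url not in seen:
            seen.add(url)
            result.append(url)
    return result


page = """
<a href="/about">About</a>
<a href="http://www.myblog.example/post">A post</a>
<a href="mailto:me@myblog.example">Email</a>
<a href="http://example.com/">Example Blog</a>
<a href="http://example.com/">Example Blog again</a>
"""
outbound = external_links(page, "http://myblog.example/")
print(outbound)
```

The result still mixes blogroll entries with every other outbound link on the page, so the manual pick described above remains necessary.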
-
Hi there,
Well, Keri's response reminded me of this question and of the fact that I found a tool for scraping these kinds of lists.
Here it is (with some other cool tools), have fun:
-
Hi Scott,
I'm going through older questions. Did you ever find a tool to do what you wanted to do here?
-
One thing to look at is OutWit Hub for Firefox. It might be able to help with that. It can scrape data from a page and do a lot with it: http://www.outwit.com/products/hub/. I don't know that it meets all of your needs, but I also haven't seen a response with anything better so far.
-
Hey Scott,
What a great question and, sigh, I don't have the answer. I am going to check back to find out what people come up with here. Surely there is someone who lurks in these parts who can throw something together?