A suggestion to help with Linkscape crawling and data processing
-
Since you guys are understandably struggling with crawling and processing the sheer number of URLs and links, I came up with this idea:
In a similar way to how SETI@Home works (is that still a thing? Google says yes: http://setiathome.ssl.berkeley.edu/), could SEOmoz use distributed computing among SEOmoz users to help with the data processing? Would people be happy to offer up their idle processor time and (optionally) internet connections to get more accurate, broader data?
Are there enough users of the data to make distributed computing worthwhile?
Perhaps those who crunched the most data each month could receive Moz points or a free month of Pro.
I have submitted this as a suggestion here:
http://seomoz.zendesk.com/entries/20458998-crowd-source-linkscape-data-processing-and-crawling-in-a-similar-way-to-seti-home
-
Sean - I share Rand's sentiments, thanks so much for the suggestion!
We have considered distributed crawling in the past (or even distributed rank checking, because then it would be in that user's locale), but there is a whole different set of challenges. For example, you have to handle all the edge cases: What if a user's computer isn't on or loses connectivity? What if we crawl too fast and the user gets blocked from a site? How do you write all that data securely?
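To make two of those edge cases concrete, here is a minimal sketch, not anything SEOmoz has built, of how a coordinator might hand out URLs to volunteer machines. It assumes a time-limited "lease" per URL (so work held by a machine that goes offline is eventually handed to someone else) and a minimum delay between fetches to the same host (so no site gets hit too fast). The class name `LeaseQueue`, its parameters, and the example URLs are all hypothetical, and the sketch does not address writing results back securely.

```python
import time
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, Optional, Tuple
from urllib.parse import urlparse


@dataclass
class LeaseQueue:
    """Hypothetical coordinator: leases URLs to workers and re-queues expired leases."""

    lease_seconds: float = 300.0   # assumed: how long a worker may hold a URL
    min_host_delay: float = 10.0   # assumed: minimum gap between fetches to one host
    pending: Deque[str] = field(default_factory=deque)
    # url -> (worker id, lease deadline)
    leased: Dict[str, Tuple[str, float]] = field(default_factory=dict)
    # host -> earliest time the next fetch to that host may be handed out
    host_ready_at: Dict[str, float] = field(default_factory=dict)

    def add(self, url: str) -> None:
        self.pending.append(url)

    def _reclaim_expired(self, now: float) -> None:
        # A worker that went offline never reports back; once its lease
        # runs out, the URL goes back into the pending queue.
        for url, (_, deadline) in list(self.leased.items()):
            if deadline <= now:
                del self.leased[url]
                self.pending.append(url)

    def checkout(self, worker_id: str, now: Optional[float] = None) -> Optional[str]:
        """Return the next URL this worker may fetch, or None if nothing is ready."""
        now = time.monotonic() if now is None else now
        self._reclaim_expired(now)
        for _ in range(len(self.pending)):
            url = self.pending.popleft()
            host = urlparse(url).netloc
            if self.host_ready_at.get(host, float("-inf")) <= now:
                self.leased[url] = (worker_id, now + self.lease_seconds)
                self.host_ready_at[host] = now + self.min_host_delay
                return url
            self.pending.append(url)  # host is still cooling down; try it later
        return None

    def complete(self, worker_id: str, url: str) -> bool:
        """Accept a result only from the worker that currently holds the lease."""
        holder = self.leased.get(url)
        if holder is None or holder[0] != worker_id:
            return False
        del self.leased[url]
        return True
```

The lease is what covers the "computer isn't on" case: nothing has to detect that a volunteer disappeared, because the URL simply becomes available again after `lease_seconds`. A late result from the original worker is rejected by `complete`, since the lease has moved on by then.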
Of course, all of these concerns can be overcome, but right now we feel like we have a good handle on the problems, and it will be much faster for us to just fix what we have.
That said, I know all of us are so appreciative of the ideas and support, and we will have something really great soon!
-
Thanks a ton, Sean! We have considered distributed computing as a way to help crawl, index, process, etc. It's so flattering and humbling to hear that you'd be willing to help out and that the community would, too.
For now, we believe we can get to the index size/quality/freshness we're aiming for using our hosted system, but the engineering team will certainly be encouraged to hear that folks in our community might contribute to this. Distributed systems present their own challenges, and we'd have to write that code from scratch, but if we find that we can't do what we want with our existing network, we might reach out.
BTW - I wanted to let folks know that the team here does feel very confident that come December/January, we're going to be producing indices that reach exceptional quality bars. The problems we face are largely known, and we now have the team and the solutions to tackle them, so we're pretty excited.