hadoopy backend? #26
Comments
I would help with this if there is interest. The purpose of Hadoopy isn't to recreate this functionality; it is to provide a thin core Python interface for streaming. I use whirr and oozie for cluster and job management, respectively (Hadoopy is designed to be compatible with these tools). I can see more casual users not wanting to use these more powerful but complex tools and opting for a more integrated approach instead. There are a few things we need to take into account.
I'd definitely be interested, and I'd be happy to review code or help figure out how to hook things up. As I'm pretty busy these days, I probably won't be able to help with the actual coding, but it looks like we might already have enough manpower to get something done. So bring on the code -- I look forward to having a look at it and trying it out. :)
Okay, this sounds like something worth pursuing. (At least, I would really like it. I had to switch back to dumbo for some last-minute tests in a paper recently because I needed some of the libegg/libjar/etc. features.) One question: Would you need to dual license it if dumbo just used it as a black-box backend? (I am not up to speed on how Python's "import" acts with respect to licenses.) I agree that this is the cleanest approach.
Not sure about the licensing either, but surely we could figure something out...
Hey, I really love the job management stuff in dumbo. However, it seems like the inner core of hadoopy is more highly optimized. (I get a factor of 2 better performance in my tests.) So it seems to me that the right way of combining the two is to write a hadoopy backend for dumbo. Would this be something you'd be interested in adding to dumbo? I'm happy to work on it in some capacity if there is interest.