Add replicateonly support #13
base: master
Conversation
This patch adds what the above said. A storage node tracked as replicate-only won't be able to receive read/get requests from the tracker.

* Add writeonly device status
* Add writeonly status to database
* Replicate from writeonly devices
* Rename writeonly to replicateonly
* Revert "Replicate from writeonly devices"
  * This reverts commit 030f9a6.
  * Conflicts:
    * lib/MogileFS/DeviceState.pm
    * lib/MogileFS/Worker/Replicate.pm
* Don't upload files to replicate-only devices.
Closed? Don't want us to merge this? :P
Closed because I wasn't confident enough. :D Reopening. Please critique my code.
I'm going to have to come back to this... So, the problem is a little more complex than this. If you're rebalancing files toward a "replicateonly" device, you end up reducing the availability of those files until the device is marked as alive again (or making them completely unavailable, in the case of mindevcount=1). So while that should be fine for a lot of people, it isn't something I can safely cut in the general release.

We could do something more complex, like reorder the get_paths result so replicate-only devices are at the back, or just warn heavily about the availability problem. Or we can somewhat split the difference and have a mode which only allows rebalance targets: a normal replication of a new or adjusted file wouldn't hit said device, but if you tried to target it via rebalance it would go in. Then we leave reads enabled as well.

I'll leave this request open for now as a reminder.
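The "reorder the get_paths result" idea above could look something like the following sketch. This is purely conceptual (MogileFS itself is Perl, and the device structure and `"replicateonly"` label here are hypothetical, not the tracker's actual API): a stable sort pushes replicate-only devices to the back so clients still receive their paths, just as a last resort.

```python
def order_paths(devices):
    """Return device paths with replicate-only devices sorted last.

    Each device is a dict like {"path": ..., "state": ...}; the shape
    is illustrative only, not MogileFS's real internal representation.
    """
    # sorted() is stable, so the tracker's original ordering is kept
    # within each group; False sorts before True, so "alive" devices
    # come first and replicate-only devices fall to the back.
    ranked = sorted(devices, key=lambda d: d["state"] == "replicateonly")
    return [d["path"] for d in ranked]

devices = [
    {"path": "http://host1/fid1", "state": "replicateonly"},
    {"path": "http://host2/fid1", "state": "alive"},
    {"path": "http://host3/fid1", "state": "alive"},
]
print(order_paths(devices))
# alive devices first, the replicate-only device last
```

Because the paths are only reordered rather than dropped, availability is preserved even when mindevcount=1 and the sole copy lives on a replicate-only device.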
We had a somewhat similar issue. When we added new storage nodes, those would start receiving all the new files, and after a couple of days those hosts would max out their bandwidth (96-drive nodes, with only 1Gbit NICs). We ended up writing a plugin that disallows writing to a defined set of hosts. It still allows replicated files to end up on those hosts, so we can run rebalance to fill the hosts with old, infrequently fetched files. The plugin also de-prioritizes the same set of hosts so files usually will not get fetched from them; this way we can control how much bandwidth the hosts use while still maintaining availability of the files.

The plugin can be found here: https://gist.github.com/2783746

I haven't tested it with the most recent release (we are still using 2.59), but I don't think there have been any changes that would break it. By modifying the plugin it should be possible to disallow direct reads from the hosts, if that's needed.
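The plugin's two behaviors described above can be sketched as follows. This is a hypothetical illustration of the idea, not the gist's actual code (the plugin is Perl, and the host names, dict shape, and function names here are made up): new-file writes are filtered against a configured host set, while reads merely de-prioritize those hosts instead of excluding them.

```python
# Example denylist; in the real plugin this would come from configuration.
DENY_NEW_WRITES = {"newhost1", "newhost2"}

def writable_devices(devices):
    """Drop denylisted hosts from new-write candidates.

    Replication and rebalance would use the unfiltered list, so old
    files can still be moved onto these hosts to fill them gradually.
    """
    return [d for d in devices if d["host"] not in DENY_NEW_WRITES]

def read_order(devices):
    """Stable-sort read candidates so denylisted hosts are tried last,
    keeping the files available without loading the new hosts."""
    return sorted(devices, key=lambda d: d["host"] in DENY_NEW_WRITES)

devices = [
    {"host": "newhost1", "path": "http://newhost1/fid9"},
    {"host": "oldhost1", "path": "http://oldhost1/fid9"},
]
print([d["host"] for d in writable_devices(devices)])
print([d["host"] for d in read_order(devices)])
```

The key design point is the asymmetry: writes are hard-filtered (new uploads never land on the busy hosts), while reads are only soft-ordered, so a file whose only copy sits on a denylisted host remains reachable.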