Add replicateonly support #13
base: master
Conversation
This patch adds what the above said. A storage node tracked as replicate-only won't be able to receive read/get requests from the tracker.

* Add writeonly device status
* Add writeonly status to database
* Replicate from writeonly devices
* Rename writeonly to replicateonly
* Revert "Replicate from writeonly devices"
  * This reverts commit 030f9a6.
  * Conflicts:
    * lib/MogileFS/DeviceState.pm
    * lib/MogileFS/Worker/Replicate.pm
* Don't upload files to replicate-only devices.
Closed? Don't want us to merge this? :P
Closed because I wasn't confident enough. :D Reopening. Please critique my code.
I'm going to have to come back to this... So, the problem is a little more complex than this. If you're rebalancing files toward a "replicateonly" device, you end up reducing the availability of those files until the device is marked as alive again (or making them completely unavailable, in the case of mindevcount=1). So while that should be fine for a lot of people, it isn't something I can safely cut in the general release.

We could do something more complex, like reorder the get_paths result so replicate-only devices are at the back, or just warn heavily about the availability problem. Or we can somewhat split the difference and have a mode which only allows rebalance targets: a normal replication of a new or adjusted file wouldn't hit said device, but if you tried to target it via rebalance it would go in. Then we leave reads enabled as well.

I'll leave this request open for now as a reminder.
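The "reorder the get_paths result" idea above could look something like the following sketch. This is purely conceptual (MogileFS itself is Perl, and the device structure and `"replicateonly"` label here are hypothetical, not the tracker's actual API): a stable sort pushes replicate-only devices to the back so clients still receive their paths, just as a last resort.

```python
def order_paths(devices):
    """Return device paths with replicate-only devices sorted last.

    Each device is a dict like {"path": ..., "state": ...}; the shape
    is illustrative only, not MogileFS's real internal representation.
    """
    # sorted() is stable, so the tracker's original ordering is kept
    # within each group; False sorts before True, so "alive" devices
    # come first and replicate-only devices fall to the back.
    ranked = sorted(devices, key=lambda d: d["state"] == "replicateonly")
    return [d["path"] for d in ranked]

devices = [
    {"path": "http://host1/fid1", "state": "replicateonly"},
    {"path": "http://host2/fid1", "state": "alive"},
    {"path": "http://host3/fid1", "state": "alive"},
]
print(order_paths(devices))
# alive devices first, the replicate-only device last
```

Because the paths are only reordered rather than dropped, availability is preserved even when mindevcount=1 and the sole copy lives on a replicate-only device.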
We had a somewhat similar issue. When we added new storage nodes, those would start receiving all the new files, and after a couple of days those hosts would max out their bandwidth (96-drive nodes, with only 1Gbit NICs). We ended up writing a plugin that disallows writing to a defined set of hosts. It still allows replicated files to end up on those hosts, so we can run rebalance to fill the hosts with old, infrequently fetched files. The plugin also de-prioritizes the same set of hosts so files usually will not get fetched from them; this way we can control how much bandwidth the hosts use while still maintaining availability of the files.

The plugin can be found here: https://gist.github.com/2783746

I haven't tested it with the most recent release (we are still using 2.59), but I don't think there have been any changes that would break it. By modifying the plugin it should be possible to disallow direct reads from the hosts, if that's needed.
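The plugin's two behaviors described above can be sketched as follows. This is a hypothetical illustration of the idea, not the gist's actual code (the plugin is Perl, and the host names, dict shape, and function names here are made up): new-file writes are filtered against a configured host set, while reads merely de-prioritize those hosts instead of excluding them.

```python
# Example denylist; in the real plugin this would come from configuration.
DENY_NEW_WRITES = {"newhost1", "newhost2"}

def writable_devices(devices):
    """Drop denylisted hosts from new-write candidates.

    Replication and rebalance would use the unfiltered list, so old
    files can still be moved onto these hosts to fill them gradually.
    """
    return [d for d in devices if d["host"] not in DENY_NEW_WRITES]

def read_order(devices):
    """Stable-sort read candidates so denylisted hosts are tried last,
    keeping the files available without loading the new hosts."""
    return sorted(devices, key=lambda d: d["host"] in DENY_NEW_WRITES)

devices = [
    {"host": "newhost1", "path": "http://newhost1/fid9"},
    {"host": "oldhost1", "path": "http://oldhost1/fid9"},
]
print([d["host"] for d in writable_devices(devices)])
print([d["host"] for d in read_order(devices)])
```

The key design point is the asymmetry: writes are hard-filtered (new uploads never land on the busy hosts), while reads are only soft-ordered, so a file whose only copy sits on a denylisted host remains reachable.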