Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Frequently written files may get debounced indefinitely #146

Closed
ThomWright opened this issue Jan 7, 2018 · 17 comments
Closed

Frequently written files may get debounced indefinitely #146

ThomWright opened this issue Jan 7, 2018 · 17 comments
Labels
A-enhancement Z-needs implementation Needs an implementation, will accept PRs
Milestone

Comments

@ThomWright
Copy link

If I have a file which consistently gets written to more frequently than the debounce timeout, then the debounced watcher seems to never emit a Write event.

What do you think about the idea of a max timeout option, where a Write will always be emitted after a maximum time following a NoticeWrite?

Or is this something I should implement myself as a client of the library? In which case, maybe it's worth noting it in the docs?

I'm happy either way, it was just something which caught me out 😀

@passcod
Copy link
Member

passcod commented Jan 7, 2018

This may be supported under vNext, but won't be added to v4, so if you want it soonish you'll have to implement it yourself for now.

I'd accept a PR implementing this for the current branch.

@passcod passcod added this to the 5.0.0 milestone Jan 7, 2018
@passcod passcod added A-enhancement Z-needs implementation Needs an implementation, will accept PRs labels Feb 9, 2019
@passcod passcod removed this from the Next milestone Feb 9, 2019
@sjoshid
Copy link
Contributor

sjoshid commented Feb 18, 2019

I would love to have this feature. Notify is used by Xi Editor and Im working on Tail Toggle issue. In my use case, tailing happen so fast, I get NoticeWrite event at the beginning and one Write event at the very end. Im using a very simple script (unending for loop with counter) that appends to a file im tailing. Solution by @ThomWright will be a good start.
But I have few questions (I apologize if this is not the correct forum to ask)

  • When running in raw mode, I get, a lot if not all Write events. So can I not change debounced mode to raw mode just when Im tailing a file and revert back?
  • About the issue Im working on, currently the debounced watcher has delay setting of 100 ms which does not emit Write event. But when I bring delay down to 5 ms, I do get Write event. So my question is: can i not change delay just for time being and revert? Is there a recommended delay setting for each platform? If not, how do we determine the delay?

@dfaust
Copy link
Member

dfaust commented Feb 18, 2019

@sjoshid:

  • The deboucer is built on top of the raw watcher, it's meant to be a higher level interface. The idea is to have a high level interface that's useful to most users and a low level interface for those with more specific needs.
  • Maybe we could add an additional message like OngoingWrite that is emitted periodically for a file until the write is finished. Maybe we should even add a separate delay for that.

What do you think, @passcod?

@passcod
Copy link
Member

passcod commented Feb 18, 2019 via email

@sjoshid
Copy link
Contributor

sjoshid commented Feb 19, 2019

Thank you, guys. It is clear to me know. I like the idea of adding OngoingWrite. If @ThomWright is ok, I would like to give this a shot.
@ThomWright let me know if you have made some progress here or if need help with anything.

@ThomWright
Copy link
Author

I haven't looked at this at all, or even thought about it since raising this issue. Feel free to explore some solutions!

@sjoshid
Copy link
Contributor

sjoshid commented Feb 19, 2019

Ok.
@dfaust and @passcod any pointers to get this started? I THINK I can do it but my knowledge with Notify is very less.

@sjoshid
Copy link
Contributor

sjoshid commented Feb 20, 2019

@passcod
I should have asked you since you are the owner of the lib. Which option do you think we need to go with?

  1. The OngoingWrite one or
  2. The complementary one you were suggesting. If we decide to go with this can you provide a little more details?

@passcod
Copy link
Member

passcod commented Feb 20, 2019

Oh, both of us is good. I was just busy yesterday. The OngoingWrite one. The second one can be done later (what I meant with "complementary"), if someone/I feels like it.

@sjoshid
Copy link
Contributor

sjoshid commented Feb 23, 2019

@passcod
I played around with notify and understood some of the important details on its inner workings. Except for one: who is inserting records in operations_buffer? I should have figured it out by now or I never will. :(

Correct me if Im wrong but here's my understanding so far:

  • ops_buf has unique tuple with timer_id. As the name suggests this is a buffer of events occurred on watched path. Side question: Can ops_buf have user defined events like NoticeWrite?
  • ops_buf events are tied to Scheduled event by the timer_id. This timer_id is used to fire or postpone the event by delay.

Approach
ops_buf currently doesnt have user defined event like NoticeWrite. But it doesnt mean it can't. Im planning to utilize this ops_buf for our new event OngoingWrite which will be timestamped to have its own delay.

There are other details I haven't hashed out yet (like, when should OngoingWrite be stopped). I will do that once we decide on approach.

@dfaust
Copy link
Member

dfaust commented Feb 23, 2019

@sjoshid

who is inserting records in operations_buffer?

It's quite subtle, there are a couple op_buf.entry(...).or_insert(...) calls in Debounce::event.

The operations_buffer could really use a better documentation. I'll try:

pub type OperationsBuffer = Arc<Mutex<HashMap<PathBuf, (Option<op::Op>, Option<PathBuf>, Option<u64>)>>>;

path => (op, from_path, timer_id)

operations_buffer holds the paths of all files for which events are currently debounced, together with their most significant event (op) and for rename events also their original path (from_path) and their timer id (timer_id).

Most significant event means for example that a newly created file that is written to within the debounce time frame will still only hold the Create event. While a file that was written to and then deleted will hold the Remove event.

When a rename event is detected, the original path is stored in from_path.

The timer_id is used to restart or ignore a scheduled event in case a more significant event is received for a file.

@sjoshid
Copy link
Contributor

sjoshid commented Feb 23, 2019

Thanks for explanation, @dfaust.
op_buf.entry(...).or_insert(...) I saw these calls but the or_insert inserts only (None, None, None). Isn't there something else that inserts the actual data? Otherwise *operation will always hit the None case.

Most significant event means for example that a newly created file that is written to within the debounce time frame will still only hold the Create event. While a file that was written to and then deleted will hold the Remove event.

This logic is done in Debounce.event() I believe.

Btw what do you think of the approach? Do you have something else in mind?

@dfaust
Copy link
Member

dfaust commented Feb 23, 2019

Isn't there something else that inserts the actual data? Otherwise *operation will always hit the None case.

If the None case is hit, the operation is updated with *operation =.

This logic is done in Debounce.event() I believe.

Yes.

Btw what do you think of the approach? Do you have something else in mind?

I would have to think about that myself for a while and I don't have much time at the moment. But here is an idea:
Add a buffer for NoticeWrite events (notice_write_buffer) to WatchTimer and add an entry every time a Write event is scheduled, then schedule the NoticeWrite. If the Write event is fired, remove the NoticeWrite event, if the NoticeWrite is fired, re-schedule it.
For that to work you might need to move the restart_timer method to WatchTimer so the timer id for the file doesn't change.

Another question is, should NoticeWrite events be fired for files that have been newly created and are being written to continuously? If so, then that has to be handled as well.

@sjoshid
Copy link
Contributor

sjoshid commented Feb 23, 2019

If the None case is hit, the operation is updated with *operation =.

Only Lord knows how I missed that. And thank you. Will think about approach in more details.

@sjoshid
Copy link
Contributor

sjoshid commented Mar 2, 2019

@dfaust @passcod
Im storing new config OnGoingWriteDelay in WatchTimer struct as it currently holds the delay. Is this the right location?
Editing my comment. I was wrong.
We can shove down the new delay down watcher's channel as a new EventLoopMsg. From there on it is can be taken all the way down to WatchTimer like so:
EventLoop -> event_tx -> Debounce -> timer.

@sjoshid
Copy link
Contributor

sjoshid commented Mar 31, 2019

@passcod
I see that this will be included in 5.0. Is there a release date yet?

@passcod
Copy link
Member

passcod commented Mar 31, 2019

Mid-april

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-enhancement Z-needs implementation Needs an implementation, will accept PRs
Projects
None yet
Development

No branches or pull requests

4 participants