-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle cases where signals are missed. #12
base: master
Are you sure you want to change the base?
Handle cases where signals are missed. #12
Conversation
@stephenworsley It's not really easy to determine whether this is an improvement or not. Without testing can you offer any evidence that this is a good PR? |
@bjlittle This came up because I was testing Kapture with a file that had a long running call where for about 20 seconds it was not logging. On the resulting graph, this call only took up a small amount of space. This seems to me to be a tad misleading. With my fix, this call appeared on the graph as appropriately wide. |
To avoid having to change the sampling pattern, have you considered adding the time-since-last-signal to the |
@rhattersley I'm not sure that would get around the need to change the sampling pattern. If signals are being handled immediately, I don't think there should be any difference in the weight given to one which was handled ~0 seconds since the last handled signal and one which was handled at ~2 seconds since. I only want to add weight when the time since the last handled signal is sufficiently large that it is necessary that another signal was sent, but was ignored. The thought behind changing the sampling pattern is that, with my pattern, the difference in time between two consecutive signals being sent is always less than the difference in time between, say, the j-th signal and the j+2th signal. As it is currently, the first value can be as large as 2 and the second value as small as 1. I suppose it would still be possible to approximate the number of missed signals, but with a tendency to underestimate by 1 (maybe something like half of the time). Whether this weighting took place before or after logging, I feel like the issue is still tied to the sampling pattern. |
Completely agree - hence the original implementation. 👍 And I understand why inferring the number of lost signals from the previous & current signal times, requires that the possible signal intervals can't overlap. My only concern is that changing the sampling interval to [0.5, 1] might exchange one sampling bias for another. The original jitter pattern is deliberately designed to avoid bias in exchange for increased noise, so if you run for longer it will converge to an accurate answer. On balance I suspect the new trade-off in this PR is worth it. |
For future reference, a couple of (totally not thought through!) ideas that probably aren't worth exploring right now:
|
As described here: https://docs.python.org/3/library/signal.html
Because of this, it is possible that many signals will be recieved but a log will be written only once. This pull attempts to fix this by estimating how many signals should have been recieved since the last one was handled. In order to get a more accurate calculation, the way signals are jittered has also been changed.