Improve resiliency of performance calculation requests #125
base: master
Conversation
hmm, i think this (& the previous code) lacks consideration that the osu client retries score submission, which is a part of the resiliency system.. i think setting pp=0 without a strategy to calculate these is worse than hard failing tbh
extra={"status": response.status_code}, | ||
) | ||
return [(0.0, 0.0)] * len(scores) | ||
response.raise_for_status() |
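For context, a minimal sketch of the before/after being discussed here, assuming an httpx-style async client; the function name, endpoint, and response shape are assumptions for illustration, not the repository's actual code:

```python
import httpx


async def calculate_performances(
    client: httpx.AsyncClient, scores: list[dict]
) -> list[tuple[float, float]]:
    # Hypothetical call to the performance service.
    response = await client.post(
        "http://performance-service/api/v1/calculate",
        json=scores,
    )

    # Previous behaviour: log and hand back zeroed (pp, stars) pairs.
    # if response.status_code != 200:
    #     logging.error("performance request failed",
    #                   extra={"status": response.status_code})
    #     return [(0.0, 0.0)] * len(scores)

    # Behaviour under discussion: hard-fail so the caller sees the error.
    response.raise_for_status()

    return [(r["pp"], r["stars"]) for r in response.json()]
```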
i think i agree with moving to response.raise_for_status() to hard fail requests rather than setting 0
my concern with this is that it has the implication that if …
Yeah, ultimately a deadletter queue for performance calculations would be a nice solution here. Does the processor on performance service already support us building such a queue?
i would hope that logging the score id on failure would make it easier to create a strategy to recalculate scores in this case but having an asynchronous score processing model like i've discussed before would solve all of these concerns i think, since we take separate ownership of calculating performance outside of the hot path (the actual score submission) so we can retry as much as we like (without relying on the client to retry) so long as we have the initial score data. lazer does this and i think it works well
not really, performance-service is intentionally detached from a concept of "this score has xpp" and rather taking in statistics and spitting out a value. i don't really know if i want to change that relationship either
Yea I think I agree that keeping it on score-service would be nice.
Yea imo this would likely look like a main perf calculation queue + a deadletter queue for failed requests (so that the main queue is not blocked by a single broken score); both on score-service, depending on perf-service and the db
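A rough sketch of what that could look like, assuming a Redis-backed queue on score-service; the key names, attempt encoding, and the recalculate_performance usecase are all hypothetical:

```python
import redis.asyncio as redis

MAIN_QUEUE = "score-service:performance-queue"             # hypothetical key
DEADLETTER_QUEUE = "score-service:performance-deadletter"  # hypothetical key
MAX_ATTEMPTS = 5


async def recalculate_performance(score_id: int) -> None:
    """Hypothetical usecase: call perf-service and persist the result in the db."""
    ...


async def process_performance_queue(r: redis.Redis) -> None:
    while True:
        # Entries are "score_id:attempts"; brpop blocks until one is available.
        _, raw = await r.brpop(MAIN_QUEUE)
        score_id, attempts = (int(part) for part in raw.split(b":"))
        try:
            await recalculate_performance(score_id)
        except Exception:
            if attempts + 1 >= MAX_ATTEMPTS:
                # Park it in the deadletter queue for later recalculation,
                # so a single broken score doesn't block the main queue.
                await r.lpush(DEADLETTER_QUEUE, f"{score_id}:{attempts + 1}")
            else:
                # Requeue at the head; brpop consumes from the tail, so
                # other scores already waiting are processed first.
                await r.lpush(MAIN_QUEUE, f"{score_id}:{attempts + 1}")
```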
yep i agree, i think this is a "long term consideration" though
Server has had some instability recently and made me realise that this isn't even really trying to be resilient. I've added tenacity retry logic onto the performance usecases and opted to use raise_for_status instead of returning zeros on a failure - returning zeros seemed like odd and unintuitive behavior to me, even with the logging.
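For reference, a minimal sketch of that combination (tenacity retries around the performance call plus raise_for_status); the endpoint, retry parameters, and response shape are assumptions rather than the PR's exact code:

```python
import httpx
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential


@retry(
    retry=retry_if_exception_type(httpx.HTTPError),
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=0.5, max=5),
    reraise=True,
)
async def fetch_performance(client: httpx.AsyncClient, score: dict) -> float:
    response = await client.post(
        "http://performance-service/api/v1/calculate",  # hypothetical endpoint
        json=score,
    )
    # Hard-fail on non-2xx: transient errors get retried by tenacity,
    # and anything persistent propagates to the caller instead of
    # silently becoming 0pp.
    response.raise_for_status()
    return response.json()["pp"]
```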