Improve resiliency of performance calculation requests #125

tsunyoku · 2024-07-16T19:33:22Z

Server has had some instability recently, which made me realise that this isn't even really trying to be resilient. I've added tenacity retry logic to the performance usecases and opted to use `raise_for_status` instead of returning zeros on a failure - returning zeros seemed like odd and unintuitive behavior to me, even with the logging

cmyui · 2024-07-16T19:57:05Z

hmm, i think this (& the previous code) lacks consideration that the osu client retries score submission, which is a part of the resiliency system.. i think setting pp=0 without a strategy to calculate these is worse than hard failing tbh

cmyui · 2024-07-16T19:58:19Z

app/usecases/performance.py

-            extra={"status": response.status_code},
-        )
-        return [(0.0, 0.0)] * len(scores)
+    response.raise_for_status()


i think i agree with moving to response.raise_for_status() to hard fail requests rather than setting 0

tsunyoku · 2024-07-16T20:00:01Z

> hmm, i think this (& the previous code) lacks consideration that the osu client retries score submission, which is a part of the resiliency system.. i think setting pp=0 without a strategy to calculate these is worse than hard failing tbh

my concern with this is that it has the implication that if performance-service never comes back up that the score will never submit. i think it would be better to have a score submitted with zero pp than no score

cmyui · 2024-07-16T20:01:09Z

> > hmm, i think this (& the previous code) lacks consideration that the osu client retries score submission, which is a part of the resiliency system.. i think setting pp=0 without a strategy to calculate these is worse than hard failing tbh
>
> my concern with this is that it has the implication that if performance-service never comes back up that the score will never submit. i think it would be better to have a score submitted with zero pp than no score

Yeah, ultimately a deadletter queue for performance calculations would be a nice solution here. Does the processor on performance service already support us building such a queue?

tsunyoku · 2024-07-16T20:01:13Z

i would hope that logging the score id on failure would make it easier to create a strategy to recalculate scores in this case

but having an asynchronous score processing model like i've discussed before would solve all of these concerns i think, since we take separate ownership of calculating performance outside of the hot path (the actual score submission) so we can retry as much as we like (without relying on the client to retry) so long as we have the initial score data. lazer does this and i think it works well
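
For illustration only, the "submit with zero pp, but log the score id" idea could look roughly like this. `performance_or_zero`, the logger name and the `failed_score_ids` list are hypothetical and not part of this PR:

```python
import logging

logger = logging.getLogger("score-service")  # hypothetical logger name

# Hypothetical record of scores that still need a real calculation later.
failed_score_ids = []


def performance_or_zero(score_id, calculate):
    """Return calculate(), or (0.0, 0.0) on failure while recording the score id."""
    try:
        return calculate()
    except Exception:
        # Log with the score id so a recalculation job can find it later.
        logger.exception(
            "performance calculation failed",
            extra={"score_id": score_id},
        )
        failed_score_ids.append(score_id)
        return (0.0, 0.0)


def broken_calculation():
    raise RuntimeError("performance-service is down")
```

The score still gets submitted, and the recorded ids give a later recalculation pass something to work from.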

tsunyoku · 2024-07-16T20:01:58Z

> Yeah, ultimately a deadletter queue for performance calculations would be a nice solution here. Does the processor on performance service already support us building such a queue?

not really, performance-service is intentionally detached from a concept of "this score has xpp" and rather taking in statistics and spitting out a value. i don't really know if i want to change that relationship either

cmyui · 2024-07-16T20:06:47Z

Yea I think I agree that keeping it on score-service would be nice.

> having an asynchronous score processing model like i've discussed before would solve all of these concerns i think, since we take separate ownership of calculating performance outside of the hot path (the actual score submission) so we can retry as much as we like

Yea imo this would likely look like a main perf calculation queue + a deadletter queue for failed requests (so that the main queue is not blocked by a single broken score); both on score service that depend on perf-service and the db
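
A minimal in-memory sketch of that shape, assuming a fixed retry budget. `drain`, `MAX_ATTEMPTS` and the `(score_id, attempts)` tuple layout are made up for illustration; a real version would presumably sit on a proper queue backed by the db:

```python
from collections import deque

MAX_ATTEMPTS = 3  # assumed retry budget per score


def drain(main_queue, dead_letters, calculate):
    """Process (score_id, attempts) items; park repeatedly failing scores."""
    results = {}
    while main_queue:
        score_id, attempts = main_queue.popleft()
        try:
            results[score_id] = calculate(score_id)
        except Exception:
            if attempts + 1 >= MAX_ATTEMPTS:
                # Out of retries: move to the dead-letter queue so the
                # main queue is not blocked by a single broken score.
                dead_letters.append(score_id)
            else:
                # Requeue at the back so other scores are processed first.
                main_queue.append((score_id, attempts + 1))
    return results


def flaky_calculate(score_id):
    # Simulated perf-service call (assumption): score 2 always fails.
    if score_id == 2:
        raise RuntimeError("broken score")
    return score_id * 10.0
```

Scores 1 and 3 get their values on the first pass, while score 2 is retried behind them and ends up in the dead-letter queue once its attempts run out.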

tsunyoku · 2024-07-16T20:08:10Z

yep i agree, i think this is a "long term consideration" though

Use tenacity for performance usecases, replace 0 returns with a `raise_for_status`

cd4d7c9

tsunyoku requested review from cmyui and infernalfire72 as code owners July 16, 2024 19:33

cmyui reviewed Jul 16, 2024
