Upload did not work for Finnish last night #62

Open
ice1e0 opened this issue Nov 15, 2024 · 12 comments

Comments

@ice1e0

ice1e0 commented Nov 15, 2024

Finnish was uploaded yesterday evening:

image

But it was not fully processed this morning, so it looks like the processing broke. I uploaded it again today and it worked this time.

image

The messages are taken from the Telegram channel "BCCM - BMM Exporters".

@ice1e0
Author

ice1e0 commented Nov 15, 2024

The first upload of Finnish worked:
image

My current guess is that no write checks are used and one write operation overwrote the data from the other. According to the Telegram channel, Khasi was uploaded to BMM at the same time:

image

@ice1e0
Author

ice1e0 commented Nov 15, 2024

@ice1e0
Author

ice1e0 commented Nov 15, 2024

P.S. It could be that the BMM update/processing happened at the same time as titles were updated in the old BMM site.

@kkuepper
Member

@KillerX can you give an update about the findings of your research?

@ice1e0
Author

ice1e0 commented Nov 24, 2024

I have a new case from today:

image

Hungarian was uploaded around 18:20, and the user confirmed that he got an email saying it was uploaded. But when I sent the status message at 21:17, it was not uploaded.

Note that I updated the titles via the BMM Admin website tonight. I cannot guarantee that this happened exactly during that time. My current guess is that the status got somehow overwritten.

My current suggestion for a potential fix is to use a concurrency control tactic and let at least one of the conflicting update commands fail when writing to the database. Then nothing gets overwritten, and the user can react to the error message.
@kkuepper @KillerX

@KillerX
Member

KillerX commented Nov 25, 2024

Original (fin - 115007): This was a timeout on my side. We are working on restructuring the local infrastructure, which should fix these cases; it is planned for the 2nd weekend of January 2025.

Everything else: I am unable to see any errors on my side. I did a closer investigation of the HU 115013 from yesterday, but I am unable to see any issues on MB side. All the flows completed w/o errors and the output I can see seems correct.

@ice1e0
Author

ice1e0 commented Nov 26, 2024

@KillerX I do not think it is a major issue when a system runs into a timeout. The major issue I see here is that the user does not get feedback that something went wrong. The Telegram chat shows that everything went fine, and the user was informed by email that the workflow worked. That should not happen.

@KillerX
Member

KillerX commented Nov 26, 2024

> @KillerX I do not think it is a major issue when a system runs into a timeout. The major issue I see here is that the user does not get feedback that something went wrong. The Telegram chat shows that everything went fine, and the user was informed by email that the workflow worked. That should not happen.

In the first issue reported, the system never sent any message saying that the process had completed! It also makes sense that no error message was sent in that case: the backoff had become too large, and the system was still waiting for the next try (a few days later). This latter part (the too-large backoff) has been tweaked since then.
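To illustrate the backoff point, here is a minimal sketch of a capped exponential backoff that also reports a failure once the retries are used up. It is not the actual exporter code: `backoff_delays`, `run_with_retries`, the `notify` callback and all default values are hypothetical names and numbers chosen for the example.

```python
import time
from typing import Callable, List


def backoff_delays(base: float, cap: float, attempts: int) -> List[float]:
    """Wait before each retry: base * 2**n, but never longer than cap."""
    return [min(cap, base * 2 ** n) for n in range(attempts)]


def run_with_retries(
    task: Callable[[], None],
    notify: Callable[[str], None],
    base: float = 1.0,
    cap: float = 60.0,
    attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run task once, then retry on TimeoutError with capped backoff.

    If every try times out, call notify() so the failure is not silent.
    """
    last_error = None
    # The first try happens immediately; each retry waits for its delay first.
    for delay in [0.0] + backoff_delays(base, cap, attempts):
        if delay:
            sleep(delay)
        try:
            task()
            return True
        except TimeoutError as exc:
            last_error = exc
    notify(f"Upload failed after {attempts} retries: {last_error}")
    return False
```

With these hypothetical values, `backoff_delays(1.0, 60.0, 8)` gives `[1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]`: the cap keeps the wait from growing to days, and the final `notify` call is where a Telegram or email error message could be sent.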

Issues 2 and 3 are different, as stated in the previous comment.

@kkuepper
Member

I looked at the logs for the issue with Hungarian (track_115013):
mediabanken last updated it at 18:20:24, and Leo updated it 6 seconds later at 18:20:30, accidentally deleting Hungarian again.
I guess one solution would be to show Leo an error message: "There have been changes in the meantime and we couldn't save your changes."
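The timeline above can be sketched as a version check on save. This is only an illustration of the idea, not how BMM stores tracks: `TrackStore`, `TrackDoc`, `ConflictError`, the `version` field and the language code `"hu"` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


class ConflictError(Exception):
    """Raised when a document changed after the caller loaded it."""


@dataclass
class TrackDoc:
    title: str
    languages: Set[str] = field(default_factory=set)
    version: int = 0  # bumped on every successful save


class TrackStore:
    """In-memory stand-in for the real database (assumption for the sketch)."""

    def __init__(self) -> None:
        self._docs: Dict[int, TrackDoc] = {}

    def create(self, track_id: int, title: str) -> None:
        self._docs[track_id] = TrackDoc(title=title)

    def load(self, track_id: int) -> TrackDoc:
        # Hand out a copy, like an edit form that holds its own snapshot.
        doc = self._docs[track_id]
        return TrackDoc(doc.title, set(doc.languages), doc.version)

    def save(self, track_id: int, doc: TrackDoc) -> None:
        current = self._docs[track_id]
        if doc.version != current.version:
            # Someone else saved in the meantime: refuse instead of overwriting.
            raise ConflictError(
                "There have been changes in the meantime and we couldn't save your changes."
            )
        self._docs[track_id] = TrackDoc(doc.title, set(doc.languages), doc.version + 1)


store = TrackStore()
store.create(115013, "Old title")

admin_copy = store.load(115013)     # admin opens the edit form
exporter_copy = store.load(115013)  # exporter starts its update
exporter_copy.languages.add("hu")
store.save(115013, exporter_copy)   # 18:20:24, the upload is saved

admin_copy.title = "New title"
try:
    store.save(115013, admin_copy)  # 18:20:30, the stale copy is rejected
except ConflictError as err:
    print(err)
```

In this sketch the second save fails, so the uploaded language stays in place and the admin sees the error instead of silently removing it.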

@KillerX
Member

KillerX commented Nov 27, 2024

@kkuepper it's probably better than just silently discarding uploaded files...

@ice1e0
Author

ice1e0 commented Nov 27, 2024

@kkuepper I don't know which database framework is used, but I guess something like the concurrency conflict logic that EF Core uses could help here: https://learn.microsoft.com/en-us/ef/core/saving/concurrency?tabs=data-annotations
I think it filters the update by the last known value of the concurrency column, and when the update affects zero rows, it throws an exception. Just an idea; there might be pros and cons to such logic, and it might be difficult to implement.
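A rough sketch of that mechanism, as far as I understand it, using Python's built-in `sqlite3` module instead of EF Core. SQLite only stands in for illustration and is not the project's database; the `tracks` table, the `version` column, `update_title` and `ConcurrencyError` are made-up names.

```python
import sqlite3


class ConcurrencyError(Exception):
    """Raised when the row changed since the caller last read it."""


def update_title(
    conn: sqlite3.Connection, track_id: int, new_title: str, expected_version: int
) -> None:
    # The WHERE clause filters on the version the caller last saw.
    cur = conn.execute(
        "UPDATE tracks SET title = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_title, track_id, expected_version),
    )
    if cur.rowcount == 0:
        # Zero rows affected: the version no longer matches (or the row is gone).
        raise ConcurrencyError(f"track {track_id} was changed by someone else")
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tracks (id INTEGER PRIMARY KEY, title TEXT, version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO tracks VALUES (115013, 'Old title', 1)")

# The first writer read version 1 and succeeds; the row is now at version 2.
update_title(conn, 115013, "First edit", expected_version=1)
```

A second writer that still holds version 1 would now get a `ConcurrencyError` instead of overwriting the first edit, and could reload the row and try again.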

@kkuepper
Member

kkuepper commented Dec 2, 2024

For now we decided to focus on automating the Fra Kåre publishing process instead of introducing concurrency handling.
(We use RavenDB as our database)
