Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

graceful handling of Kobo on-device DB unicode issue #2629

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

pohutukawa
Copy link

This is a fix for the long standing (3 1/2 years) bug https://bugs.launchpad.net/calibre/+bug/1936010

It arises when the SQLite DB on the Kobo device has an inconsistent (non-unicode conforming) character encoding in its fields. This patch handles the UnicodeDecodeError for the specifically affected record(s) gracefully without breaking compatibility with the inner working of Calibre. It also provides feedback at the end of the method invocation by printing out a statement of the affected books on the console.

I have tried to bring in functionality for an informative GUI popup dialogue window. But being a back-end kind of developer this is way out of my zone of capability. So it would be probably a good/easy one for @kovidgoyal to look at and fix up quickly for a future release (lines 2193-2196).

@kovidgoyal
Copy link
Owner

If you are catching the exception on cursor.fetchone() then row is either unset (if its the first row) or refers to the previous row.

Also, if the database has invalid utf8 in it, it means the db is probably corrupted, I doubt deleting books is going to be a robust fix.

@pohutukawa
Copy link
Author

Well, it looks like the Kobo device DB can get corrupted. As it has happened to the one opening the bug and myself as well. I have also seen that every now and then an 'eject' action doesn't seem to complete. And at some point one is then just pulling the cable. So it could have happened there.

So the note that (perhaps) removing and re-adding the title may fix things seems like a logical one if something 'odd' has happened at the time. I'll try that here, but don't want to do that (just yet), so that I can still reproduce the bug/issue here while working on it.

Anyway, there is an issue that can arise, and the work leading to the PR has identified the underlying issue and selectively only triggers on the title exposing the problem. Or did I get something wrong? I'll have to try again, maybe the value row after treating the exception is still pointing to the previous record. That's a possibility ... :-/

Either way, whether doing a for row in cursor or getting each row via a cursor.fetchone() should not matter much. At least all unaffected records will have been (correctly) fetched, and the UI can present that using the green ticks in the device column. Previously, with the error happening, Calibre had no idea (at least not presented in the UI) of which titles were on the device. This way this works again.

Therefore, we're a large leap forward already. We might just want to be more informative on which record has caused the error.

Do we agree on this part? If yes, then we may just have to sort out the rest.

@kovidgoyal
Copy link
Owner

Not really. IMO once the db is corrupted trying to present information
from it is not really worth the effort. That information can be
arbitrarily wrong. The user is just better off logging off and logging
on again making the Kobo rebuild its database, see the calibre FAQ about
it. Sadly the Kobo firmware has always been extremely buggy with memory
leaks, db corruption, crashes, etc.

In the case of your PR you are definitely telling the user the
problem is in the wrong row. Deleting the previous book isn't going to
help. In this case it would be better to just show a popup to the user
telling them the device db is corrupted and they need to rebuild it.
However, this is only one way db corruption can show up, there are
many other places where operations can fail.

Note: If you want to preserve the db for testing copy it off your kobo.

@pohutukawa
Copy link
Author

Very interesting, you're right. I've drilled more into it. I've diffed the output I got from the Python query and the equivalent from a raw SQLite query on the embedded DB file. The offending record is adjacent to what I had spit out with print on the console. And removing (and/or re-adding) has not fixed the problem, as the Kobo keeps an 'empty' (tomb stone) record with offending UTF-8 encoded content around, just with some of the fields emptied (I haven't looked into the exact changes of the record).

So, it's Kobo's firmware that has screwed up. It's obvious from the content that some pointer got skewed and is offset by a number of bytes to cause the scrambling with a lot of nonsensical content. Though, the Kobo internally continues to work with it.

@kovidgoyal, how would you be considering a 'fix' for the situation that at least lets Calibre continue working, but at the end of mounting the device spit out a GUI pop up informing the user that there's an inconsistency in the device. If proceeding working with it is on their own, and it is suggested to go and completely re-build (as you've mentioned above).

That way at least one can more easily do a stock take of the 'state of the device' more easily with existing content on it to assist in recovering. Especially as a rebuild can be quite disruptive due to a number of books partway in progress, bookmarks, etc. etc. etc.

It might be a helpful addition to try and be more graceful without being completely 'full of grace' to recover he full situation.

It's certainly helping me now to get back to see on what books are on the device without the untreated exception. Even re-adding the one faulty book now leaves the broken 'tomb stone' behind, but it won't cause any hiccups in the Calibre GUI any more.

I'd refine the PR to that extent, but would need input or assistance (from you?) on how to present a suitable GUI dialogue at the end.

@kovidgoyal
Copy link
Owner

You can raise a UserFeedback exception. See how its done in modify_database_check(). But you should also set a flag on the device instance to prevent any database modifications by calibre just like modify_database_check() works. Remeber to clear the flag when opening the device.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants