Running OCR gives no results and NS_ERROR_FILE_NOT_FOUND
#88
Comments
First we can check whether the problem happens at the pdftoppm stage or at the tesseract stage. Are the PNG images saved to the Zotero item folder?
I am having the same issue and came here to see if anyone else was. Zotero 7.0.11 on Fedora Workstation 41.
Additionally, like the original post, I had to manually set the file paths to /usr/bin/tesseract and /usr/bin/pdftoppm or it returned
Since the OP didn't answer, maybe you can check whether pdftoppm did its job?
Sure thing. I am not sure how to check, so you might have to walk me through it. When I go to the item folder (Zotero item > right click > Show file) there is only the PDF file.
If you have selected "Save the intermediate PNGs as well in the folder" like the OP, then pdftoppm has not worked at all.
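For anyone who wants to run this check from a terminal rather than a file manager, here is a minimal sketch. It only counts the PNG files sitting directly in a folder; the function name `count_pngs` and the `ITEM_DIR` placeholder are made up for illustration and are not part of the plugin.

```shell
# Count the PNG files directly inside a folder (non-recursive).
count_pngs() {
    dir=$1
    n=0
    for f in "$dir"/*.png; do
        # If nothing matches, the glob stays literal, so test for existence.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Hypothetical usage: set ITEM_DIR to the folder opened by "Show file".
# ITEM_DIR="/path/to/the/item/folder"
# count_pngs "$ITEM_DIR"
```

A count of 0 after an OCR run with the intermediate-PNG option enabled would point at the pdftoppm stage rather than at tesseract.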
pdftoppm seems to hang:
Some troubleshooting I tried:
Hi, sorry for the delay and thank you for your help! I have the same results as @zzyzx-dc.
Thanks for the details! The manual pdftoppm execution is interesting. Are you sure that the command line is using the same executable? Without an explicit path there could be several versions on your system. Try to run
Yeah it's the same one:
Running /usr/bin/pdftoppm yields viewable PNGs.
So apparently your pdftoppm installation is OK. That's a good data point, thank you.
@zzyzx-dc I'm rewriting the pdftoppm/tesseract detection code for cases where no full path has been provided, so problems like this should happen less often in the future.
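To make the "which executable is actually used" question concrete, here is a rough shell sketch of resolving a bare tool name to a full path. This is not the plugin's detection code; `resolve_tool` is a hypothetical helper that only wraps the standard `command -v` lookup.

```shell
# Print the full path a bare tool name resolves to on PATH, or fail.
resolve_tool() {
    path=$(command -v "$1" 2>/dev/null) || return 1
    # command -v can also report aliases or functions; accept only
    # absolute paths that point at an executable file.
    case $path in
        /*) [ -x "$path" ] && printf '%s\n' "$path" ;;
        *) return 1 ;;
    esac
}

# Report what the current shell would run for each tool.
for tool in pdftoppm tesseract; do
    if p=$(resolve_tool "$tool"); then
        echo "$tool -> $p"
    else
        echo "$tool: not found on PATH"
    fi
done
```

Note that this reflects the PATH of the shell it runs in, which may differ from the environment Zotero itself sees.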
Hi, I reran things with the OCR language set to I've however just discovered the existence of the debug output feature in Zotero, if that can be helpful
The error is happening while the plugin is checking your tesseract path preference. I don't understand why this is the case, your screenshot says that it is However, I am not convinced that the failing code is really necessary - I am prepared to remove it. Still, a similar error could happen in a more useful check that is executed a few steps later, so I'd really like to understand the underlying situation. Before I create a new pre-version, could you try to run the following? In your normal shell:
In Zotero (menu Tools > Developer > Error Console):
I am on Ubuntu 24.04.1 and I get the same result. This is because of the snap package, which isolates the application runtime. You should advise Linux users against snap packages or any containerized deployment.
By the way, there are probably some workarounds to let snap see paths of the base system, but I don't think it is worth the hassle. Advise users to use https://github.com/retorquere/zotero-deb
@alex-ca1123 While this could indeed be useful, I still wish the other users could provide the requested information.
@aborel Fedora has Flatpak, the same basic principle of evil vendorization efforts to fragment the open-source community. https://discussion.fedoraproject.org/t/zotero-bibliography-manager-tarball-on-fedora-40-kde-how-i-got-it-working/132509 I also ran your directives: as expected, the containerized app can't see raw host paths.
I get your point, it is certainly relevant, but it doesn't tell me what I wanted to know. The output of the requested commands is welcome.
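For readers unsure whether their Zotero runs inside such a sandbox, a small sketch follows. It assumes the usual environment variables, `FLATPAK_ID` set by Flatpak and `SNAP_NAME` set by snap, and it is only meaningful when evaluated in the same environment the application runs in. `sandbox_kind` is a hypothetical helper name.

```shell
# Guess the packaging sandbox from environment variables.
# Assumption: Flatpak exports FLATPAK_ID and snap exports SNAP_NAME
# to the processes they launch.
sandbox_kind() {
    if [ -n "${FLATPAK_ID:-}" ]; then
        echo flatpak
    elif [ -n "${SNAP_NAME:-}" ]; then
        echo snap
    else
        echo none
    fi
}

sandbox_kind
```

If this prints `flatpak` or `snap`, a path such as /usr/bin/tesseract on the host may simply not exist from the application's point of view, which would be consistent with NS_ERROR_FILE_NOT_FOUND.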
I am having the same issue on Manjaro with the GNOME desktop, with Zotero installed as a Flatpak. I also get the following message in the Browser Console:
Running the commands returns:
$ ls -l /usr/bin/tesseract
-rwxr-xr-x 1 root root 47256 11. Nov 09:22 /usr/bin/tesseract
In my very limited understanding of Flatpak, it requires either bundling the binaries with the application or using
Hi, I just installed the plugin. When trying to OCR my first file, I get the following error in the developer console:
I first thought that I had misconfigured tesseract/pdftoppm, but everything seems to look fine. Is there any way to investigate this further? I read through #87 but it doesn't seem related. Thanks!
Here's my configuration: