Greek alphabets like Θ can't be recognized #3379

Closed
ttbuffey opened this issue Apr 6, 2021 · 4 comments

ttbuffey commented Apr 6, 2021

Environment

  • Tesseract Version: 4.0
  • Commit Number:
  • Platform: mac

Current Behavior:

Command:
tesseract ~/Desktop/detector_sample_data-1.jpg ~/Desktop/result -l eng+ell+grc

Expected Behavior:

The Greek national ID card number Θ-420001, which contains a Greek letter, should be recognized correctly, but Tesseract outputs 0-420001 instead.

Suggested Fix:

For an English document that contains a few Greek letters, the Greek letters αβγδεζηθικλμνξοπρστυφχψωΑΒΓΔΕΖΗΘΙΚΛΜΝΞΟΠΡΣΤΥΦΧΨΩABEZHIKMNOPTYX should be recognized correctly.

stweil (Member) commented Apr 6, 2021

This is not a software issue but more or less normal for OCR.

You can try using -c load_number_dawg=0 and see whether that helps.
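
For reference, the suggestion above applied to the command from the report might look like the sketch below. It assumes the same image path, output base, and language list as the original command. Whether disabling the number-pattern dawg actually resolves the Θ/0 confusion has not been verified here.

```shell
# Sketch: rerun the reported command with the number-pattern dawg disabled.
# Paths are the ones from the report; adjust them for your machine.
image="$HOME/Desktop/detector_sample_data-1.jpg"
output_base="$HOME/Desktop/result"

# Build the argument list once so it can be printed and then executed.
set -- tesseract "$image" "$output_base" -l eng+ell+grc -c load_number_dawg=0

# Show the exact command that would run.
echo "$@"

# Run it only if tesseract is installed and the image exists.
if command -v tesseract >/dev/null 2>&1 && [ -f "$image" ]; then
  "$@"
fi
```

The recognized text ends up in `~/Desktop/result.txt`, so the two runs (with and without the `-c` option) can be compared directly.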

Please use the Tesseract user forum for more questions.

stweil added the question label Apr 6, 2021

amitdo (Collaborator) commented Apr 6, 2021

With the LSTM engine, letter sequences that were not seen during training have a low chance of being recognized.

stweil (Member) commented Apr 6, 2021

Yes, training on a set of Greek national ID cards (about 100 or more) with tesstrain would help to get better results.
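
A rough sketch of what such a fine-tuning run with tesstrain could look like is given below. The model name `greekid`, the tessdata location, and the iteration count are hypothetical placeholders, not values from this thread. It also assumes the ground truth has already been prepared as line images with matching `.gt.txt` transcriptions, and that `TESSDATA` points at the float models from tessdata_best, since the integer "fast" models cannot be used as a starting point for fine-tuning.

```shell
# Sketch only: fine-tune the English model on ID card line images with tesstrain.
# "greekid", the tessdata path, and MAX_ITERATIONS are hypothetical placeholders.
MODEL_NAME=greekid
START_MODEL=eng
TESSDATA="$HOME/tessdata_best"

# By default tesstrain reads ground truth from data/<MODEL_NAME>-ground-truth:
# one line image per sample plus a .gt.txt file with its transcription.
GROUND_TRUTH_DIR="data/${MODEL_NAME}-ground-truth"

# Build the make invocation once so it can be printed and then executed.
set -- make training "MODEL_NAME=$MODEL_NAME" "START_MODEL=$START_MODEL" \
  "TESSDATA=$TESSDATA" MAX_ITERATIONS=2000

echo "$@"

# Run only from inside a tesstrain checkout that already holds the ground truth.
if [ -f Makefile ] && [ -d "$GROUND_TRUTH_DIR" ]; then
  "$@"
fi
```

The transcriptions would need to contain the mixed Latin and Greek strings as they appear on the cards (for example `Θ-420001`), so that the model sees those sequences during training.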

amitdo closed this as completed Apr 9, 2021

ttbuffey (Author) commented

I have the same question as described in the issue below:
Shreeshrii/tess5train-fonts#15

@stweil @amitdo Could you please point out why the performance decreases after I fine-tune the English model?
