How do I correct line recognition? #33
This is really a problem. Please implement support for fixing the text line boundaries, and use the corrected boundaries to make a better model.
We are debugging the layout editor. Normally, you edit text at /ocr/show_results/... The new interface is running at /ocr/show_results_new/... Just change the URL manually. I'll probably have to explain how the interface works.
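For anyone switching between the two interfaces often, the manual URL change can be sketched in a few lines of Python. This is only an illustration: the function name `to_new_editor_url`, the host `example.org`, and the document ID `abc123` are placeholders, not part of the project. Only the two path prefixes come from the comment above.

```python
from urllib.parse import urlsplit, urlunsplit

OLD_PREFIX = "/ocr/show_results/"      # path of the regular text editor
NEW_PREFIX = "/ocr/show_results_new/"  # path of the new layout editor


def to_new_editor_url(url: str) -> str:
    """Rewrite a results URL so that it points at the new layout editor.

    Hypothetical helper: it only swaps the path prefix and leaves the host,
    query string, and fragment untouched.
    """
    parts = urlsplit(url)
    if OLD_PREFIX not in parts.path:
        raise ValueError(f"not an old-style results URL: {url}")
    new_path = parts.path.replace(OLD_PREFIX, NEW_PREFIX, 1)
    return urlunsplit(parts._replace(path=new_path))


if __name__ == "__main__":
    # Placeholder host and document ID, for illustration only.
    print(to_new_editor_url("https://example.org/ocr/show_results/abc123"))
```

A URL that already points at the new interface is rejected with `ValueError` instead of being rewritten a second time.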
Yes please, I will need some explanation; the interface is not intuitive. The default zoom should fill the window width or height instead of zooming out as far as it does now. There is no way to delete a row. I have no idea how to select two rows to join them, how to resize a row, or how to edit the shape of a region.
@michal-hradis please add the explanation here. Also, how can OCR be reverted without losing the manually edited baselines?
Text transcriptions can be generated again and again without any loss of manual text corrections. Text line detection cannot be repeated without losing manual corrections. How to edit text lines:
To delete lines:
An alternative way to delete lines, which cannot be rolled back and which tends to delete the whole text region if you are not careful at the moment:
Add lines:
Edit regions:
You can further:
When correcting OCR for the National Museum, the regions were mostly recognized without problems. However, I found cases where part of a word was not detected as part of the line. Sometimes it was not detected at all:
Other times, there were overlapping or duplicate detections:
How should I treat these cases when correcting?
Should I write the whole line, including the letters that are not part of the line as marked on the image? Should I write the same text twice when correcting overlapping lines?