Merge music branch to develop. #56

Merged on Nov 14, 2024 (82 commits)
fc73bc9
Add scripts for exporting music from PageLayout to MIDI + MusicXML fi…
vlachvojta Sep 1, 2023
bda025a
Add translator dictionary, defining translation from internal shorten…
vlachvojta Sep 7, 2023
fe9aceb
Add base for LayoutEngineYolo using ultralytics YOLO. With conversion…
vlachvojta Sep 12, 2023
2da0f24
Junk-code cleanup and docu.
vlachvojta Sep 13, 2023
2f661d2
Prepare attributes for music-text distinction. WITHOUT exports to Pag…
vlachvojta Sep 13, 2023
5373283
Add support for `region.music_region`. WITHOUT exports to PageXML or …
vlachvojta Sep 14, 2023
1b9b676
Add category attribute to region. WITH saving to pageXML custom tag +…
vlachvojta Sep 14, 2023
7cc5293
Add category attribute to TextLine. WITH saving to pageXML custom tag…
vlachvojta Sep 14, 2023
fae66ce
Little refactoring.
vlachvojta Sep 14, 2023
b02ebb2
Add exporting music directly in `parse_folder.py` using `config.ini` …
vlachvojta Sep 14, 2023
c07ba6a
Fix minor issue with non-existing function `get_las_region_id`
vlachvojta Sep 18, 2023
0afb958
Add sorting music regions "in reading order" using `y_min` of boundin…
vlachvojta Sep 18, 2023
9371467
Remove `RegionCategory` and `LineCategory` enums hard-coded in `layou…
vlachvojta Oct 3, 2023
b733c74
Update `MusicExporter` to export music only from certain categories o…
vlachvojta Oct 3, 2023
b40e479
Remove music exporter option from `parse_folder.py` and make `export_…
vlachvojta Oct 25, 2023
7b93d16
Add option to have more LineCroppers and ORC engines. Set every other…
vlachvojta Oct 27, 2023
50418dd
Disable throwing error if no crop for line. Continue and ignore line.
vlachvojta Oct 27, 2023
8853046
Add PageLayout splitting enabling running multiple layout parsers eac…
vlachvojta Nov 3, 2023
5ce86d1
Remove unused functions.
vlachvojta Nov 3, 2023
16e8ce3
Merge branch 'develop' into music
vlachvojta Nov 3, 2023
3e2fcb8
Add simple script to check if page layouts in two folders have same s…
vlachvojta Dec 1, 2023
a49409c
Merge remote-tracking branch 'origin/develop' into music
vlachvojta Dec 1, 2023
94bcae8
Disable double logging (stdout + stderr)
vlachvojta Dec 1, 2023
d688bc0
Refactor page xml export + import.
vlachvojta Dec 1, 2023
65eacb5
Refactor alto xml export.
vlachvojta Dec 1, 2023
bb88217
Merge remote-tracking branch 'origin/develop' into music
vlachvojta Dec 7, 2023
3354a5b
Minor updates
ikiss-fit Dec 14, 2023
59dd330
Unify most of method names: page_xml to pagexml and alto_xml to altoxml.
vlachvojta Dec 14, 2023
3c3322e
Unify most of the method names: page_xml to pagexml and alto_xml to a…
vlachvojta Dec 14, 2023
27a82eb
New config section parsing and other changes after code review.
vlachvojta Dec 14, 2023
ae516f4
Add image_size to Yolo engine. Add `config_get_list` to get list of c…
vlachvojta Dec 22, 2023
e5dd2b8
Store box confidence (in LayoutExtractorYOLO) to RegionLayout and exp…
vlachvojta Dec 22, 2023
529234e
Add line ID to ALTO export + import.
vlachvojta Dec 22, 2023
036daf3
Delete unwanted script.
vlachvojta Dec 22, 2023
697990c
Add translating short music output to original encoding.
vlachvojta Dec 22, 2023
893baca
Enable loading model to cpu.
vlachvojta Dec 29, 2023
e2a26de
Add `CATEGORIES` option to sorters and delete therefore unused functi…
vlachvojta Dec 29, 2023
d59884a
Alto_export: export music transcription as one string in each TextLine
vlachvojta Jan 4, 2024
0a1a0c0
Tiny improvements.
vlachvojta Jan 15, 2024
faf0496
Delete unused function from `layout`, music integration.
vlachvojta Jan 16, 2024
dd4bbc8
Change simple print warnings to `logger.warning`.
vlachvojta Jan 18, 2024
6fc82e4
Change config categories, line_categories, add decoder filter.
vlachvojta Jan 18, 2024
53ee27a
Rename `MusicTranslator` for to more general `OutputTranslator` and e…
vlachvojta Jan 18, 2024
68e7892
Add option for rendering region categories for non-text regions.
vlachvojta Jan 22, 2024
8ac485e
Tiny refactor before moving.
vlachvojta Jan 22, 2024
7032389
Add minimalistic CLI for `MusicPageExporter` to `user_scripts/export_…
vlachvojta Jan 22, 2024
ef5f36f
Normalize category characters in image rendering.
vlachvojta Jan 23, 2024
8c738ee
Add confidence estimation to PageOCR directly after detection. Update…
vlachvojta Jan 23, 2024
949829a
Add `PageOCR.get_line_confidence` solving problem of wrong confidence…
vlachvojta Jan 26, 2024
0b6e933
Unify logging style.
vlachvojta Jan 26, 2024
092b04f
Update readme, remove translator.Semantic_to_SSemantic.json because i…
vlachvojta Jan 26, 2024
cc47eef
Improve translation of symbols in `output_translator.py`. Return orig…
vlachvojta Jan 26, 2024
c7d90a1
Add `atomic` option to `OutputTranslator` + output substitution toggl…
vlachvojta Jan 31, 2024
d9430bc
Merge branch 'develop' into music.
vlachvojta May 28, 2024
2f84712
Fix `provides_ctc_logits` to look to all `ocrs` instead of `ocr`
vlachvojta May 30, 2024
eddf0e3
Change `SUBSTITUTE_OUTPUT_ATOMIC` to work on a page level, not indivi…
vlachvojta May 30, 2024
70a7e35
Add config parameter `UPDATE_TRANSCRIPTION_BY_CONFIDENCE`
vlachvojta May 30, 2024
9dcd33f
Add ALTO baseline (export + import) in two options (float or points)
vlachvojta Jun 17, 2024
1a46c00
Add ALTO versions (options how to export baseline) + both baseline im…
vlachvojta Jun 19, 2024
2141002
Save polygon points only as positive numbers. (XSD validation issue)
vlachvojta Jun 19, 2024
34c6584
Remove prints.
vlachvojta Jun 19, 2024
82b3e70
Allow run when at least one ORC engine `provide_ctc_logits`.
vlachvojta Jun 19, 2024
2aed4bc
Fix README.md example + delete false info about setup.py.
vlachvojta Jun 19, 2024
4efdbab
Update README.md - spelling correction.
vlachvojta Jun 20, 2024
134f51a
Add typing Optional to allow lower versions of Python (tested on Pyth…
vlachvojta Jun 20, 2024
25f555c
Add libraries needed to install in docker installation.
vlachvojta Jun 20, 2024
26b0c1a
Update texts for better UX.
vlachvojta Jun 20, 2024
3ce8bbc
Make default version of ALTO to the older one.
vlachvojta Jun 25, 2024
3ec7902
Add typing List and Tuple to allow lower versions of Python (tested o…
vlachvojta Jun 25, 2024
520e3ae
Fix page_xml "custom" field export to export category only if not None.
vlachvojta Jun 27, 2024
9b414a0
Set category filter fallback to `[]` for backward compatibility.
vlachvojta Jun 27, 2024
a110f0e
Library versions fixes.
vlachvojta Jun 27, 2024
f5a7a51
Add libraries back to pyproject.toml, so new machines install it righ…
vlachvojta Jul 11, 2024
cc9b0ae
Fix bugs according to Pull request comment.
vlachvojta Aug 5, 2024
62af812
Merge remote-tracking branch 'origin/develop' into music
vlachvojta Aug 5, 2024
b60196f
Add regions to splitting by category. If `region.category` set, move …
vlachvojta Aug 5, 2024
4d2ddaa
Add better None check.
vlachvojta Aug 6, 2024
7c4251e
Disable exporting midi lines if no notes on the line.
vlachvojta Aug 27, 2024
d9c64cd
Merge branch 'develop' into music
vlachvojta Sep 24, 2024
fa1a897
Simplify splitting page layouts to allow backwards (only look at regi…
vlachvojta Oct 16, 2024
f5f2f42
Add IndexError to catch expression when calculating transcription con…
ikiss-fit Oct 25, 2024
747e491
Update layout.py
michal-hradis Nov 5, 2024
Simplify splitting page layouts to allow backwards (only look at region category, None = 'text')
vlachvojta committed Oct 16, 2024

commit fa1a897eb0f0c4dfebc878d74dce88949b0c673e
70 changes: 27 additions & 43 deletions pero_ocr/layout_engines/layout_helpers.py
@@ -414,22 +414,27 @@ def adjust_baselines_to_intensity(baselines, img, tolerance=5):


 def split_page_layout(page_layout: PageLayout) -> Tuple[PageLayout, PageLayout]:
-    """Split page layout to text and non-text lines."""
+    """Split page layout to text and non-text regions."""
     return split_page_layout_by_categories(page_layout, ['text'])


 def split_page_layout_by_categories(page_layout: PageLayout, categories: list) -> Tuple[PageLayout, PageLayout]:
-    """Split page_layout into two: one with textlines of given categories, the other with textlines of other categories.
+    """Split page_layout into two by region category. Return one page_layout with regions of given categories and one with
+    regions of other categories. No region category is treated as 'text' for backwards compatibility.
+    If no categories, return original page_layout and empty page_layout.
+    ! TextLine categories are ignored here !

     Example:
         split_page_layout_by_categories(page_layout, ['text'])
         IN: PageLayout(regions=[
-                RegionLayout(id='r001', lines=[TextLine(id='r001-l001', category='text'),
-                                               TextLine(id='r001-l002', category='logo')])])
+                RegionLayout(id='r001', category='text', lines=[TextLine(id='r001-l001', category='text'),
+                                                                TextLine(id='r001-l002', category='logo')]),
+                RegionLayout(id='r002', category='image', lines=[TextLine(id='r002-l001', category='text')])])
         OUT: PageLayout(regions=[
-                RegionLayout(id='r001', lines=[TextLine(id='r001-l001', category='text')])]),
-             PageLayout(regions=[
-                RegionLayout(id='r001', lines=[TextLine(id='r001-l002', category='logo')])])
+                RegionLayout(id='r001', category='text', lines=[TextLine(id='r001-l001', category='text'),
+                                                                TextLine(id='r001-l002', category='logo')])])
+             PageLayout(regions=[
+                RegionLayout(id='r002', category='image', lines=[TextLine(id='r002-l001', category='text')])])
     """
     if not categories:
         # if no categories, return original page_layout and empty page_layout
@@ -444,23 +449,16 @@ def split_page_layout_by_categories(page_layout: PageLayout, categories: list) -
     page_layout_negative = deepcopy(page_layout)

     for region in regions:
-        if region.category is not None or len(region.lines) == 0:
-            if region.category in categories:
-                page_layout_positive.regions.append(region)
-            else:
-                page_layout_negative.regions.append(region)
+        region_category = region.category if region.category is not None else 'text'
+        if region_category in categories:
+            page_layout_positive.regions.append(region)
         else:
-            for line in region.lines:
-                if line.category in categories:
-                    page_layout_positive = insert_line_to_page_layout(page_layout_positive, region, line)
-                else:
-                    page_layout_negative = insert_line_to_page_layout(page_layout_negative, region, line)
-
+            page_layout_negative.regions.append(region)
     return page_layout_positive, page_layout_negative


 def merge_page_layouts(page_layout_positive: PageLayout, page_layout_negative: PageLayout) -> PageLayout:
-    """Merge two page_layouts into one by line. If same region ID, create new ID.
+    """Merge two page_layouts into one by regions. If same region ID, create new ID (rename line IDs also).

     Example:
         IN: PageLayout(regions=[
@@ -469,39 +467,25 @@ def merge_page_layouts(page_layout_positive: PageLayout, page_layout_negative: P
                 RegionLayout(id='r001', lines=[TextLine(id='r001-l002', category='logo')])])
         OUT: PageLayout(regions=[
                 RegionLayout(id='r001', lines=[TextLine(id='r001-l001', category='text')]),
-                RegionLayout(id='r002', lines=[TextLine(id='r002-l002', category='logo')])])
+                RegionLayout(id='r001-1', lines=[TextLine(id='r001-1-l002', category='logo')])])
     """
     used_region_ids = set(region.id for region in page_layout_positive.regions)

-    id_offset = 0
     for region in page_layout_negative.regions:
         if region.id not in used_region_ids:
             used_region_ids.add(region.id)
             page_layout_positive.regions.append(region)
         else:
-            while 'r{:03d}'.format(id_offset) in used_region_ids:
+            new_region_id = region.id
+            id_offset = 1
+
+            # find new unique region ID by adding offset
+            while new_region_id in used_region_ids:
+                new_region_id = region.id + '-' + str(id_offset)
                 id_offset += 1
-            region.replace_id('r{:03d}'.format(id_offset))
-            used_region_ids.add(region.id)
+
+            region.replace_id(new_region_id)
+            used_region_ids.add(new_region_id)
             page_layout_positive.regions.append(region)

     return page_layout_positive
-
-
-def insert_line_to_page_layout(page_layout: PageLayout, region: RegionLayout, line: TextLine) -> PageLayout:
-    """Insert line to page layout given region of origin. Find if region already exists by ID."""
-    existing_region = find_region_by_id(page_layout, region.id)
-
-    if existing_region is not None:
-        existing_region.lines.append(line)
-    else:
-        region.lines = [line]
-        page_layout.regions.append(region)
-    return page_layout
-
-
-def find_region_by_id(page_layout: PageLayout, region_id: str) -> Optional[RegionLayout]:
-    for region in page_layout.regions:
-        if region.id == region_id:
-            return region
-    return None
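
To try the behaviour of this commit outside the diff, here is a minimal runnable sketch of the two helpers roughly as they read after the change. Several things in it are assumptions rather than pero_ocr code: the `TextLine`, `RegionLayout` and `PageLayout` dataclasses are hypothetical stand-ins that model only the attributes the diff touches; `replace_id` is assumed to rename line IDs that start with the region ID, which is what the updated docstring example suggests; the setup lines of `split_page_layout_by_categories` that the diff does not show are replaced with fresh empty layouts; and the names `split_by_region_category` and `merge_by_region` are made up for this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


# Hypothetical stand-ins for the pero_ocr layout classes (not the real ones).
@dataclass
class TextLine:
    id: str
    category: Optional[str] = None


@dataclass
class RegionLayout:
    id: str
    category: Optional[str] = None
    lines: List[TextLine] = field(default_factory=list)

    def replace_id(self, new_id: str) -> None:
        # Assumed behaviour: line IDs prefixed with the region ID are renamed too.
        for line in self.lines:
            if line.id.startswith(self.id):
                line.id = new_id + line.id[len(self.id):]
        self.id = new_id


@dataclass
class PageLayout:
    regions: List[RegionLayout] = field(default_factory=list)


def split_by_region_category(page_layout: PageLayout,
                             categories: List[str]) -> Tuple[PageLayout, PageLayout]:
    """Split by region category only; a region with category None counts as 'text'."""
    if not categories:
        return page_layout, PageLayout()
    positive, negative = PageLayout(), PageLayout()
    for region in page_layout.regions:
        region_category = region.category if region.category is not None else 'text'
        if region_category in categories:
            positive.regions.append(region)
        else:
            negative.regions.append(region)
    return positive, negative


def merge_by_region(positive: PageLayout, negative: PageLayout) -> PageLayout:
    """Append regions of `negative` to `positive`, suffixing clashing IDs with '-N'."""
    used_region_ids = {region.id for region in positive.regions}
    for region in negative.regions:
        if region.id not in used_region_ids:
            used_region_ids.add(region.id)
            positive.regions.append(region)
        else:
            new_region_id = region.id
            id_offset = 1
            # find a new unique region ID by adding an offset suffix
            while new_region_id in used_region_ids:
                new_region_id = region.id + '-' + str(id_offset)
                id_offset += 1
            region.replace_id(new_region_id)
            used_region_ids.add(new_region_id)
            positive.regions.append(region)
    return positive


page = PageLayout(regions=[
    RegionLayout('r001', 'text', [TextLine('r001-l001', 'text'), TextLine('r001-l002', 'logo')]),
    RegionLayout('r002', 'image', [TextLine('r002-l001', 'text')]),
    RegionLayout('r003', None, [TextLine('r003-l001')]),  # no category: treated as 'text'
])
text_part, other_part = split_by_region_category(page, ['text'])
print([r.id for r in text_part.regions])   # ['r001', 'r003']
print([r.id for r in other_part.regions])  # ['r002']

a = PageLayout([RegionLayout('r001', lines=[TextLine('r001-l001', 'text')])])
b = PageLayout([RegionLayout('r001', lines=[TextLine('r001-l002', 'logo')])])
merged = merge_by_region(a, b)
print([(r.id, [l.id for l in r.lines]) for r in merged.regions])
# [('r001', ['r001-l001']), ('r001-1', ['r001-1-l002'])]
```

Note that the 'logo' line inside region `r001` stays with its text region: after this commit only the region category decides the split, so line categories no longer move lines between the two layouts.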