✏️ Better HTML conversion options #98
core/harambe_core/errors.py
Outdated
@@ -4,6 +4,13 @@ class HarambeException(Exception):
    pass


class UnknownHTMLConverter(HarambeException):
    def __init__(self, converter_type: any) -> None:
The `converter_type` parameter should be typed as `Any` instead of `any`.
ellipsis is right, it should be `Any` from the `typing` module
good call
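For reference, a minimal sketch of the annotation fix agreed on in this thread. Only the class names and the `converter_type` parameter come from the diff; the error message and the stored attribute are assumptions added for illustration.

```python
from typing import Any


class HarambeException(Exception):
    pass


class UnknownHTMLConverter(HarambeException):
    # `Any` from the typing module is the annotation; lowercase `any` is the
    # built-in function, which type checkers do not accept as a type.
    def __init__(self, converter_type: Any) -> None:
        # Assumption: the message wording is illustrative, not taken from the PR.
        super().__init__(f"Unknown HTML converter type: {converter_type!r}")
        self.converter_type = converter_type
```

The lowercase `any` annotation does not fail at runtime, which is probably why it slipped through; it only shows up under a type checker.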
def get_html_converter(
    html_converter_type: HTMLConverterType | None,
) -> MarkdownConverter:
The return type of `get_html_converter` should be `Union[HTMLToMarkdownConverter, HTMLToTextConverter]` instead of `MarkdownConverter`.
no
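The reply does not give a reason. One plausible reading, sketched below, is that both converters share `MarkdownConverter` as a base class, so the base type already covers every return value and a `Union` adds nothing. Everything here is a stand-in: the `MarkdownConverter` stub, the `Literal` definition of `HTMLConverterType`, the `convert` bodies, and the choice to treat `None` as markdown are assumptions, not the PR's code.

```python
from typing import Literal, Optional

# Assumption: the real HTMLConverterType may be defined differently.
HTMLConverterType = Literal["markdown", "text"]


class MarkdownConverter:
    """Stand-in for the base class both converters are assumed to extend."""

    def convert(self, html: str) -> str:
        raise NotImplementedError


class HTMLToMarkdownConverter(MarkdownConverter):
    def convert(self, html: str) -> str:
        return f"[markdown] {html}"  # placeholder behavior


class HTMLToTextConverter(MarkdownConverter):
    def convert(self, html: str) -> str:
        return f"[text] {html}"  # placeholder behavior


class UnknownHTMLConverter(Exception):
    pass


def get_html_converter(
    html_converter_type: Optional[HTMLConverterType],
) -> MarkdownConverter:
    # Assumption: None falls back to markdown.
    if html_converter_type is None or html_converter_type == "markdown":
        return HTMLToMarkdownConverter()
    if html_converter_type == "text":
        return HTMLToTextConverter()
    raise UnknownHTMLConverter(html_converter_type)
```

If the converters did not share a base class, the `Union` suggestion would be the more accurate annotation.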
    )

    assert len(observer.data) == 1
    assert observer.data[0]["text"].strip() == (
The expected output in `test_capture_html_table` should not contain markdown syntax when using `html_converter_type="text"`. Adjust the expected output accordingly.
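One way to guard against this in a test is a small check that the captured text has no markdown table separator row. This is only a sketch: `looks_like_markdown_table` is a hypothetical helper, not part of the PR, and it only detects separator rows such as `| --- | --- |`.

```python
def looks_like_markdown_table(text: str) -> bool:
    """Return True if any line looks like a markdown table separator row."""
    for line in text.splitlines():
        stripped = line.strip()
        # A separator row contains a pipe and a dash run, and nothing
        # other than pipes, colons, dashes, and spaces.
        if "|" in stripped and "---" in stripped and not stripped.strip("|:- "):
            return True
    return False
```

In the test this could be used as `assert not looks_like_markdown_table(observer.data[0]["text"])`, alongside the exact-match assertion.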
Important
Adds HTML conversion feature with markdown and text options, including error handling and tests.
- `html_converter_type` parameter to `capture_html()` in `core.py` for specifying conversion type.
- `UnknownHTMLConverter` error for unsupported types in `get_html_converter()`.
- `HTMLToMarkdownConverter` and `HTMLToTextConverter` for HTML to markdown and text conversion.
- `get_html_converter()` in `html_converter/__init__.py` to select converter based on type.
- `UnknownHTMLConverter` exception in `errors.py` for handling unknown types.
- `test_capture_html_conversion_types()` in `test_e2e.py` to verify conversion to markdown and text.
- `heading.html` and `table.html` for testing.
- `0.53.0` in `pyproject.toml` and `uv.lock`.
This description was generated for ed3ae6b. It will automatically update as commits are pushed.