add native text rendering to muPDF backend #1159
Another possibility for the text rendering is to use
Sorry, but I will not add more complexity to the rendering process.
Are you able to elaborate on what you don't like about this solution? Maybe I can find something that you like better?
if bbox is None:
    return
abstract_font = self.text_engine.get_font(font_face)
self.backend.draw_text(
This seems to bypass the clipping stage, so text in viewports and clipped INSERTs will always be drawn, even where it should be clipped?
Is clipping done in the pipeline? In that case I think you are correct. I didn't really handle clipping, so I can't comment on whether it would be difficult to add.
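To make the clipping concern concrete, here is a rough sketch of one conservative approach: compare the text's bounding box with the active clipping rectangle, draw natively only when the text lies fully inside, and fall back to baked glyphs when it straddles the boundary. `Rect`, `TextAction` and `classify_text` are hypothetical names for illustration and not part of ezdxf or this PR. The sketch also assumes a rectangular clip, while real VIEWPORT and INSERT clipping paths can be arbitrary polygons.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle (hypothetical helper, not an ezdxf type)."""

    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def contains(self, other: "Rect") -> bool:
        # True if `other` lies completely inside this rectangle.
        return (
            self.xmin <= other.xmin
            and self.ymin <= other.ymin
            and self.xmax >= other.xmax
            and self.ymax >= other.ymax
        )

    def overlaps(self, other: "Rect") -> bool:
        # True if the two rectangles share at least one point.
        return not (
            other.xmax < self.xmin
            or other.xmin > self.xmax
            or other.ymax < self.ymin
            or other.ymin > self.ymax
        )


class TextAction(Enum):
    NATIVE = "native"  # pass the text to the backend unchanged
    BAKE = "bake"      # fall back to glyph paths so they can be clipped
    SKIP = "skip"      # text is entirely outside the clip region


def classify_text(text_bbox: Rect, clip: Optional[Rect]) -> TextAction:
    """Decide how to render a text entity for a given clip rectangle."""
    if clip is None or clip.contains(text_bbox):
        return TextAction.NATIVE
    if clip.overlaps(text_bbox):
        return TextAction.BAKE
    return TextAction.SKIP
```

With this split, only text that is partly clipped loses its selectable form; whether that trade-off is acceptable is exactly the complexity question raised above.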
@@ -424,6 +429,46 @@ def draw_image(self, image_data: ImageData, properties: BackendProperties) -> No
            oc=self.get_optional_content_group(properties.layer),
        )

    def register_font(self, font: AbstractFont) -> int:
I guess this implementation cannot handle SHX fonts.
I would assume so. I wonder what AutoCAD and other CAD programs do when exporting if non-TTF fonts are used?
Hypothetically, if both exact rendering and selectable/interpretable text were desirable, invisible text could be layered on top of the baked glyphs.
I liked the fact that the burden of rendering text in backends was removed. This feature is only optional, but questions will still arise as to why the text looks different with different backends. This implementation skips the clipping stage, so it renders text outside of VIEWPORTs and clipped INSERT entities, and of course it cannot render SHX fonts. I think this feature causes more problems than it solves.
I definitely see your viewpoint that users may be confused by the edge cases, and that discarding text information before the backend results in simpler backends. For that reason I wouldn't suggest making this text policy the default. But I would think that in some cases the ability to further process/analyze the output outweighs inaccuracies in rendering. I am not a user with this requirement though, so I don't mind if we skip this feature. It is a shame that, without giving the backend more access, a user with this requirement cannot easily maintain a custom backend that handles text differently. I think a large restructuring would be required to allow this flexibility. For now I suppose anyone who needs text information in the resulting PDF can use this branch and let me know if they want it rebased in future.
I created this tool with my needs in mind (I have been working with CAD for civil engineers for over 25 years) and I don't understand why users would want to create, edit and extract text in DXF files when that's what CAD applications are for. I have always been interested only in geometry stored in DXF files and automating geometry creation: an application-independent scripting tool. However, if someone wants to select/extract text from a DXF file I recommend a tool called ezdxf:
import ezdxf
doc = ezdxf.readfile("your.dxf")
for entity in doc.query("MTEXT TEXT"):
    print(entity.dxf.text)  # or write it into a file
There is some benefit in being able to access the text in its rendered form (i.e. position, colour etc.), as DXF does not store this information plainly (hence the need for the rendering frontend). However, the use case for this may be niche, and I'll let users who have an actual use case advocate for it if there are any. Another benefit of letting PDF handle the text is smaller file sizes, but again that may not be a priority.
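As a rough illustration of what "text in its rendered form" could mean, the sketch below records each text call with its resolved position, height and colour instead of drawing it. The `TextRecorder` class and the `draw_text` signature used here are invented for this example; the actual signature in this PR is not shown in the conversation and may differ.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class PlacedText:
    """One text item after the frontend has resolved its properties."""

    text: str
    x: float
    y: float
    height: float
    color: str


@dataclass
class TextRecorder:
    """Hypothetical backend-like object that only collects text calls."""

    records: List[PlacedText] = field(default_factory=list)

    def draw_text(self, text: str, x: float, y: float, height: float, color: str) -> None:
        # Store the resolved values instead of rendering anything.
        self.records.append(PlacedText(text, x, y, height, color))

    def to_json(self) -> str:
        # Serialize all recorded items for further processing.
        return json.dumps([asdict(r) for r in self.records])
```

A recorder like this would only be useful if the frontend forwards text to the backend at all, which is the access question discussed above.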
This is an investigation into the feasibility of rendering text to PDF in a way that lets a viewer application understand the text information (i.e. select and copy the text). This was requested in #1158.
This PR introduces the TextPolicy.NATIVE setting to pass text information directly to the backend rather than 'baking' it to a series of glyph shapes, since baking loses the original text information.
If the backend supports text, it can implement the draw_text method, which gets called when TextPolicy.NATIVE is used. The PDF backend supports loading fonts and rendering text with arbitrary 2D transformations defined by a matrix.
The current implementation relies on a magic number to scale the font size correctly to match the baked text from the frontend (which I am treating as ground truth). For the files I have access to, a value of 1.375 results in almost identical text; however, other fonts may require a different scale factor to get exact results.
In the following screenshot white is baked text and red is 'native' PDF text
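The font scaling and the 2D transformation matrix mentioned in the description can be sketched roughly as follows. This is not the code from this PR: `text_matrix` and `apply` are hypothetical helpers, and the sketch assumes that the 1.375 factor multiplies the DXF text height to obtain the PDF font size. The matrix uses the usual PDF ordering (a, b, c, d, e, f), where x' = a*x + c*y + e and y' = b*x + d*y + f.

```python
import math
from typing import Tuple

Matrix = Tuple[float, float, float, float, float, float]

# Empirical value quoted in the PR description; other fonts may need another.
FONT_SCALE = 1.375


def text_matrix(text_height: float, rotation_deg: float, insert: Tuple[float, float]) -> Matrix:
    """Build an affine matrix: uniform scale, then rotation, then translation."""
    s = text_height * FONT_SCALE  # assumed relation between DXF height and font size
    r = math.radians(rotation_deg)
    cos_r, sin_r = math.cos(r), math.sin(r)
    return (s * cos_r, s * sin_r, -s * sin_r, s * cos_r, insert[0], insert[1])


def apply(m: Matrix, point: Tuple[float, float]) -> Tuple[float, float]:
    """Transform a point given in unit text space into page space."""
    a, b, c, d, e, f = m
    x, y = point
    return (a * x + c * y + e, b * x + d * y + f)
```

For example, `apply(text_matrix(2.0, 90.0, (10.0, 5.0)), (1.0, 0.0))` maps one unit along the text baseline to a point 2.75 units above the insertion point, since the text is rotated by 90 degrees.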