Is it possible to extract color of text segment? #121
Comments
Not right now. I think we would need to add a color parameter to the StructuredText walk method (https://mupdfjs.readthedocs.io/en/latest/classes/StructuredText.html#walk). This may be possible, but I will defer to @ccxvii on feasibility.
awe!
const document = Document.openDocument(arrayBuffer, './AlphaZero-like tree-search canuide LLM decoding and training 5.pdf')
const page = document.loadPage(0)
console.log('doc', JSON.parse(page.toStructuredText().asJSON()))
Hmm, this is wrong. It would obviously be better if the text object had inline styling markup. We should look into this further.
Yes, <p style="top:538.2pt;left:307.1pt;line-height:10.1pt">
<b><span style="font-family:NimbusRomNo9L,serif;font-size:10.1pt">Task Setups</span></b>
<span style="font-family:NimbusRomNo9L,serif;font-size:10.1pt"> For a given MDP, the nature of the search</span>
</p>
I found another option to solve this: page.toStructuredText('preserve-spans').asJSON()
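For anyone reading along, here is a minimal sketch of how the parsed JSON could be flattened into per-line font info. The object shape it reads (`blocks` → `lines` → `font.name` / `font.size`) is an assumption about what `asJSON()` returns, not something confirmed in this thread, and `collectFontRuns` is a hypothetical helper name. It reads font name and size only; it does not return color.

```javascript
// Assumed (unconfirmed) shape of JSON.parse(page.toStructuredText('preserve-spans').asJSON()):
// { blocks: [ { type: "text", lines: [ { text, font: { name, size } } ] } ] }

// Hypothetical helper: flatten text blocks into one record per line.
function collectFontRuns(stext) {
  const runs = [];
  for (const block of stext.blocks ?? []) {
    // Skip anything that is not a text block (e.g. image blocks).
    if (block.type !== "text") continue;
    for (const line of block.lines ?? []) {
      runs.push({
        text: line.text ?? "",
        fontName: line.font?.name ?? null, // null when no font info is present
        fontSize: line.font?.size ?? null,
      });
    }
  }
  return runs;
}

// Hand-written sample in the assumed shape, not real library output.
const sample = {
  blocks: [
    {
      type: "text",
      lines: [
        { text: "Task Setups", font: { name: "NimbusRomNo9L", size: 10.1 } },
        { text: " For a given MDP, the nature of the search", font: { name: "NimbusRomNo9L", size: 10.1 } },
      ],
    },
  ],
};

console.log(collectFontRuns(sample));
```

With a real document, the argument would be the parsed result of the `asJSON()` call above instead of `sample`.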
Sure, but I still think we need to add something to the API to return the font color info.
Yes, that'll be great! lft 👀
just like: