Fix the handle_backticks issue #426
Closed
This allocator allocates a 4MiB arena into which all allocations are made, and then increasingly larger arenas as earlier ones are used up. Freeing memory in the arena is a no-op: clean all memory with cmark_arena_reset(). In order to support realloc, we store the size of each allocation in a size_t before the returned pointer. The speedup is over 25% on large (benchmark-sized) inputs -- we pay a small increase in maximum RSS (~10%) for this.
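The scheme described above can be sketched roughly as follows. This is a minimal illustration, not the code from this change: every name here (`arena_chunk`, `arena_alloc`, `arena_realloc`, `arena_free`, `arena_reset`) is hypothetical, the 1.5x growth factor is an assumption (the message only says later arenas are "increasingly larger"), and alignment is only guaranteed to `sizeof(size_t)`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch of an arena allocator; names are illustrative. */
typedef struct arena_chunk {
  size_t used, cap;
  unsigned char *mem;
  struct arena_chunk *prev; /* earlier, exhausted arena */
} arena_chunk;

static arena_chunk *arena_head = NULL;

static arena_chunk *chunk_new(size_t cap, arena_chunk *prev) {
  arena_chunk *c = calloc(1, sizeof *c);
  if (!c) abort();
  c->mem = calloc(1, cap); /* zero-filled, so allocations behave like calloc */
  if (!c->mem) abort();
  c->cap = cap;
  c->prev = prev;
  return c;
}

static void *arena_alloc(size_t size) {
  assert(size <= SIZE_MAX - 2 * sizeof(size_t)); /* guard against overflow */
  if (!arena_head)
    arena_head = chunk_new((size_t)4 * 1024 * 1024, NULL); /* first 4MiB arena */

  /* Size header plus payload, rounded up to a multiple of sizeof(size_t). */
  size_t need = sizeof(size_t) + size;
  need = (need + sizeof(size_t) - 1) & ~(sizeof(size_t) - 1);

  if (need > arena_head->cap - arena_head->used) {
    /* Assumed growth policy: each new arena is 1.5x the previous one. */
    size_t cap = arena_head->cap + arena_head->cap / 2;
    if (cap < need) cap = need;
    arena_head = chunk_new(cap, arena_head);
  }

  unsigned char *p = arena_head->mem + arena_head->used;
  arena_head->used += need;
  *(size_t *)p = size;       /* store the allocation size before the pointer */
  return p + sizeof(size_t);
}

static void *arena_realloc(void *ptr, size_t size) {
  void *fresh = arena_alloc(size);
  if (ptr) {
    size_t old = ((size_t *)ptr)[-1]; /* read back the stored size */
    memcpy(fresh, ptr, old < size ? old : size);
  }
  return fresh;
}

/* Freeing a single allocation is a no-op. */
static void arena_free(void *ptr) { (void)ptr; }

/* Release every arena at once. */
static void arena_reset(void) {
  while (arena_head) {
    arena_chunk *prev = arena_head->prev;
    free(arena_head->mem);
    free(arena_head);
    arena_head = prev;
  }
}
```

The size header is what makes `arena_realloc` possible: without it the allocator would not know how many bytes to copy from the old block.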
Note this includes a hack to the core code to escape pipes in the 'commonmark' renderer. This is to fix test cases with the table extension; i.e. we treat pipes as special characters that need escaping. We use the cmark_mem of the parser in order to ensure we use the arena allocator when necessary. A very flexible table format is supported; see test/extensions.txt for examples. Leading and trailing pipes can be omitted, and alignment specifiers can be used in the separator between the header and body. Table bodies don't need to be a consistent width. Embedded HTML is OK. Note we reuse the inline parser from cmark to parse tables -- this is to ensure pipes e.g. in the middle of an inline code block don't prematurely terminate a table cell.
This is quite straightforward; we do take care in other extensions (e.g. autolink) to ensure tildes are left for the strikethrough extension to consume.
The autolinker is based on https://github.com/vmg/rinku with some additional changes and fixes. We do our best not to include punctuation, but to include matching parentheses within a link.
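One way to picture the punctuation and parenthesis handling is the sketch below. It is an assumption-laden illustration rather than the rinku-derived code itself: the function name `trim_link_end` and the exact set of trailing punctuation characters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: given a candidate link s[0..len), return the length
 * that should actually be linked. Trailing punctuation is dropped, and a
 * trailing ')' is kept only while it has a matching '(' inside the link. */
static size_t trim_link_end(const char *s, size_t len) {
  assert(s != NULL);
  while (len > 0) {
    char c = s[len - 1];

    /* Assumed punctuation set; the real list may differ. */
    if (c != '\0' && strchr("?!.,:*_~'\"", c)) {
      len--;
      continue;
    }

    if (c == ')') {
      size_t open = 0, close = 0;
      for (size_t i = 0; i < len; i++) {
        if (s[i] == '(') open++;
        else if (s[i] == ')') close++;
      }
      if (close > open) { /* unmatched closing paren: not part of the link */
        len--;
        continue;
      }
    }
    break;
  }
  return len;
}
```

With this sketch, `https://en.wikipedia.org/wiki/C_(language)).` keeps the first closing parenthesis, which has a partner, and drops the second one along with the full stop.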
When we encounter a tag that causes an HTML 5 parser's content model flag [1] to be changed to RCDATA, CDATA or RAWTEXT [2] [3], we escape the tag by replacing its opening "<" with "&lt;". This causes the tag to appear verbatim in the page it's placed on. We do this to prevent users from breaking the page content: otherwise the parser would not interpret further tags inserted by cmark as HTML until a matching close tag was hit. (Such a closing tag could exist if a user entered it themselves, but it would cause all cmark-generated markup in between to be rendered raw, which is unlikely to be desirable behaviour.)
[1] https://www.w3.org/TR/2009/WD-html5-20090423/syntax.html#tokenization
[2] https://www.w3.org/TR/2009/WD-html5-20090212/serializing-html-fragments.html#parsing-html-fragments
[3] https://github.com/google/gumbo-parser/blob/aa91b27b02c0c80c482e24348a457ed7c3c088e0/src/parser.c#L4023-L4053
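The escaping step can be sketched as below. This is a simplified illustration and not the extension's implementation: the function names are hypothetical, and the tag list is an assumption (it follows the list given for the tagfilter extension in the GFM spec rather than anything stated in this message).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Assumed list of tags that switch the HTML parser's content model. */
static const char *const filtered_tags[] = {
    "title", "textarea", "style", "xmp", "iframe",
    "noembed", "noframes", "script", "plaintext", NULL};

/* Returns 1 if s[0..len) begins with "<name" or "</name" for a filtered
 * name, followed by whitespace, '>' or "/>". Matching is case-insensitive. */
static int starts_filtered_tag(const char *s, size_t len) {
  size_t i = 1;
  if (len < 3 || s[0] != '<') return 0;
  if (s[i] == '/') i++;
  for (const char *const *t = filtered_tags; *t; t++) {
    size_t n = strlen(*t), k;
    if (len - i <= n) continue; /* need the name plus one terminator char */
    for (k = 0; k < n; k++)
      if (tolower((unsigned char)s[i + k]) != (*t)[k]) break;
    if (k < n) continue;
    char c = s[i + n];
    if (c == '>' || isspace((unsigned char)c)) return 1;
    if (c == '/' && i + n + 1 < len && s[i + n + 1] == '>') return 1;
  }
  return 0;
}

/* Copies `in` to `out` (capacity `cap`, always NUL-terminated), replacing
 * the opening '<' of each filtered tag with "&lt;". Output is truncated if
 * it does not fit. Returns the number of bytes written, excluding the NUL. */
static size_t tagfilter(const char *in, char *out, size_t cap) {
  assert(in != NULL && out != NULL);
  if (cap == 0) return 0;
  size_t len = strlen(in), o = 0;
  for (size_t i = 0; i < len; i++) {
    if (in[i] == '<' && starts_filtered_tag(in + i, len - i)) {
      if (o + 4 >= cap) break;
      memcpy(out + o, "&lt;", 4);
      o += 4;
    } else {
      if (o + 1 >= cap) break;
      out[o++] = in[i];
    }
  }
  out[o] = '\0';
  return o;
}
```

Only the opening `<` is rewritten, so ordinary tags such as `<b>` pass through untouched while `<script>` becomes `&lt;script>` and shows up as literal text.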
* Include table alignment when rendering LaTeX
* Include table alignment when rendering man (preserving default centre alignment here)
* Trim table cell interiors
* Expand test cases
* Fix escaping behaviour
* Do not use enum for alignment
* Do not collide against stdlib `ispunct`
* Cleanup pipe code
* Don't reparse matched rows
(caller requires :// anyway)
* Add failing test.
* Fix by parsing inlines after blocks are done
rdar://76711302
rdar://77383424
rdar://77476197
since cmark_parser_reset is called in cmark_parser_finish, this state would be inconsistent if you reused a parser with extensions multiple times.
rdar://79015293
this silences the "header should be renamed to be used as an umbrella header" warning
rdar://81302358
cmark_syntax_extension_get_private()
(#36)finalize
cmark_node_type CMARK_NODE_TABLE
etc., make XCode happy with imported headers. (#96)~
should not be escaped in href (#110)ext_scanners.c
with latestre2c
.O(n*n)
corner-case runtime in GFM's table extension.