-
Notifications
You must be signed in to change notification settings - Fork 267
spoiler with summary, markdown-it style container extensions #1043
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
spoiler with summary, markdown-it style container extensions #1043
Conversation
|
markdown-it and commonmark-hs both use this syntax for general purpose custom fenced blocks, not just spoilers. I think this extension should do the same thing: create a |
thanks for the information @notriddle do you think this very specific use case could exist behind a more specific feature flag? is there any obvious code issues you can see with this implementation? |
|
@notriddle - so if i make this PR more generic, similar to the markdown-it container extension, and implement the spoiler variation of the container extension, that would be acceptable? |
|
Personally, I think so, but @Martin1887 needs to approve, too. |
|
In a first look it seems good, but I good like to review it carefully before merging. Thanks! |
|
thanks for the reply @Martin1887 i think i would like to redesign it a bit to make it similar to the markdown-it container extension and implement spoilers as a specific variant of that so that it fits an existing standard as per @notriddle 's suggestion does that seem like a good idea to you too? |
|
Yes, it is a good approach, and container extensions would be under a feature flag. |
|
i've refactored things as i hope we discussed @notriddle @Martin1887 please let me know if i need to make changes |
|
The "arbitrary kind" string ( I'll try to think about this more in the coming days. |
|
Good point, thanks! |
|
i have updated the PR to reflect all the changes suggested with the exception of this test i'm ok with what this PR generates but i know you like to be compatible |
|
Cool, thanks! Do you think you can make nesting work as well? gives while I expected nested divs |
|
while investigating this bug i changed the container parsing logic significantly i have added more tests and, of course, now everything passes this PR does not support the any feedback welcome |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, and apologies again about the delay. This is starting to look quite robust, but there are still some issues. Nesting is tricky and I guess we don't have any other feature to take inspiration from directly.
Consider this:
::: container
> shouldn't close, right?
> :::
> x
:::
<div class="container">
<blockquote>
<p>shouldn't close, right?</p>
</blockquote>
</div>
<blockquote>
<p>x
:::</p>
</blockquote>Another thing we try to stay conscious of is time complexity, to make sure untrusted input cannot cause parsing to become too slow. The current implementation seems to be quadratic when nesting containers.
With input generated like this (replace N with some number):
python -c 'print("\n".join(":"*k + "x" for k in range(N, 2, -1)))'If I double the value of N, the size of the input quadruples (more or less). However, the runtime seems to slow down by ~16x:
| N | time (s) |
|---|---|
| 1000 | 0.32 |
| 2000 | 4.99 |
| 4000 | 85.80 |
That would indicate quadratic run time growth with respect to input size, which we generally want to avoid.
One solution to this is to limit the amount of nesting (or simply colons recognized, before saturating (similar to how {} are handled in math spans).
Don't stress about these too much. The runtime issue can definitely be fixed later. For now it would be enough to have something that works and is well specified by the spec tests. If nesting proves too difficult, we can also leave it out and try to get it working later.
|
thanks for the feedback @ollpu will get stuck into it when i can 😃 |
|
hi @ollpu @Martin1887 @notriddle tried to address everything mentioned by @ollpu let me know what you think 😄 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, good work! I like the approach, but the container_depth counter is a pervasive change to firstpass.rs and I believe it's not really needed. Same with the depth field of ItemBody::Container. I'm still learning new things about the feature in commonmark.hs and others - apparently nesting can be done without increasing fence lengths too 😄
I gather that FirstPass.container_depth has two purposes: to enforce a nesting limit and to ensure that ::: can't close things outside their scope.
The first use case could be replaced by checking self.tree.spine_len(). That more or less provides the current container nesting level.
For the second use case, we can instead determine the scope from the ancestors when iterating self.tree.walk_spine().rev(). Whenever we encounter a node other than a Container or Paragraph, we can stop iterating. No container blocks that were opened outside that node should be closed inside it. I've implemented this idea in the first two code suggestions below - tests pass. The depth field of ItemBody::Container becomes unnecessary.
My suggestions also align the following test with commonmark-hs. A closing fence can close container blocks that were started outside other container blocks (but not other kinds of nodes):
::: a
:::: b
::::: c
::::
x
commonmark-hs:
<div class="a">
<div class="b">
<div class="c">
</div>
</div>
<p>x</p>
</div>
These tests are getting a little abstract aren't they...
pulldown-cmark/src/firstpass.rs
Outdated
| let fence_length = | ||
| scan_ch_repeat(&bytes[(start_ix + line_start.bytes_scanned())..], b':'); | ||
|
|
||
| if fence_length > u8::MAX as usize || self.container_depth == u8::MAX { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With regards to using u8 for the fence length, I think a behavior like this would be more consistent with other features:
If fence_length > 255, treat it as if it were only 255. So 255 colons is enough to close a container that was started with 500 colons. The feature still keeps working, but the count isn't checked beyond the limit.
Test case: python -c 'print(":"*256 + " x\n" + ":"*255)' should parse as a container without literal :s left over
pulldown-cmark/src/parse.rs
Outdated
| IndentCodeBlock, | ||
| HtmlBlock, | ||
| BlockQuote(Option<BlockQuoteKind>), | ||
| Container(u8, u8, ContainerKind, CowIndex), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a comment describing what the fields mean? Or just make it a struct variant with named fields - I think we want to switch the others to that sooner or later as well.
pulldown-cmark/src/scanners.rs
Outdated
| let fence_length = | ||
| scan_rev_while(&self.bytes[self.ix..(self.ix + nl_ix - eol_length)], |c| { | ||
| c == b':' | ||
| }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it's not intentional that the closing fence doesn't need to be on its own line, but any other content from the beginning is discarded?
::: a
content :::
gives
<div class="a"></div>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if i understand the spec correctly:
https://github.com/jgm/commonmark-hs/blob/master/commonmark-extensions/test/fenced_divs.md
It ends with a closing fence: a line beginning with at least as many consecutive : characters as in the opening fence, followed by optional whitespace and the end of the line.
so this will be closed by default and interpret the :s as text. i have added the test accordingly
| pub(crate) fn scan_nextline(bytes: &[u8]) -> usize { | ||
| memchr(b'\n', bytes).map_or(bytes.len(), |x| x + 1) | ||
| memchr(b'\n', bytes).map_or(memchr(b'\r', bytes).map_or(bytes.len(), |x| x + 1), |x| { | ||
| x + 1 | ||
| }) | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is a bit scary. I think it reveals some bugs in the existing uses of the function, since CommonMark defines a lone carriage return \r as a valid line ending. And I don't think this change will make it work correctly in that case either, as it's just a fallback if no \n are found anywhere.
We should carefully evaluate the impact before making this change.
Does this fix something for you even when the input is all CRLF (\r\n)? Such as if you're given a slice where the \n has been stripped out, but not \r? In that case I would try to find another more local fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened #1056
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i have reverted this change so it can be addressed properly as you suggest
Co-authored-by: Roope Salmi <rpsalmi@gmail.com>
Co-authored-by: Roope Salmi <rpsalmi@gmail.com>
Co-authored-by: Roope Salmi <rpsalmi@gmail.com>
|
amazing feedback @ollpu thanks!
the more the merrier 😄 i have tried to implement everything you suggested let me know if there's anything else |
notriddle
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This pretty much looks good to me. The test case covers most of the possibilities, and the implementation fits in with the rest of the code.
Other than some low-hanging optimization fruit, it looks fine.
Co-authored-by: Michael Howell <michael@notriddle.com>
ollpu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good now. Thanks for your continued efforts.
this is an attempt at an implementation of markdown-it container extensions. here's one of the tests from the PR as an example of the default behavior:
i have also implemented a spoiler extension (my specific use case). here is another example test from the PR: