Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

Conversation

@jim-taylor-business
Copy link
Contributor

@jim-taylor-business jim-taylor-business commented Jun 11, 2025

this is an attempt at an implementation of markdown-it container extensions. here's one of the tests from the PR as an example of the default behavior:

::: example
Is this collapsable?

Is this **bold**?
:::
.
<div class="example">
<p>Is this collapsable?</p>
<p>Is this <strong>bold</strong>?</p>
</div>

i have also implemented a spoiler extension (my specific use case). here is another example test from the PR:

::: spoiler Is this expandable?
Is this collapsable?

Is this **bold**?
:::
.
<details>
<summary>Is this expandable?</summary>
<p>Is this collapsable?</p>
<p>Is this <strong>bold</strong>?</p>
</details>

@notriddle
Copy link
Collaborator

markdown-it and commonmark-hs both use this syntax for general purpose custom fenced blocks, not just spoilers.

I think this extension should do the same thing: create a <div> with the attribute string as a class by default, and allow an application-defined filter to turn it into whatever other element they want.

pulldown-cmark/src/scanners.rs Outdated Show resolved Hide resolved
@jim-taylor-business
Copy link
Contributor Author

markdown-it and commonmark-hs both use this syntax for general purpose custom fenced blocks, not just spoilers.

I think this extension should do the same thing: create a <div> with the attribute string as a class by default, and allow an application-defined filter to turn it into whatever other element they want.

thanks for the information @notriddle

do you think this very specific use case could exist behind a more specific feature flag?

is there any obvious code issues you can see with this implementation?

pulldown-cmark/src/scanners.rs Outdated Show resolved Hide resolved
@jim-taylor-business
Copy link
Contributor Author

jim-taylor-business commented Jul 1, 2025

@notriddle - so if i make this PR more generic, similar to the markdown-it container extension, and implement the spoiler variation of the container extension, that would be acceptable?

@notriddle
Copy link
Collaborator

Personally, I think so, but @Martin1887 needs to approve, too.

@Martin1887
Copy link
Collaborator

In a first look it seems good, but I good like to review it carefully before merging. Thanks!

@jim-taylor-business
Copy link
Contributor Author

jim-taylor-business commented Jul 2, 2025

thanks for the reply @Martin1887

i think i would like to redesign it a bit to make it similar to the markdown-it container extension and implement spoilers as a specific variant of that so that it fits an existing standard as per @notriddle 's suggestion

does that seem like a good idea to you too?

@Martin1887
Copy link
Collaborator

Yes, it is a good approach, and container extensions would be under a feature flag.

@jim-taylor-business jim-taylor-business marked this pull request as draft July 2, 2025 11:39
@jim-taylor-business jim-taylor-business changed the title spoiler with summary, colon fenced spoiler with summary, markdown-it style container extensions Jul 2, 2025
@jim-taylor-business jim-taylor-business marked this pull request as ready for review July 2, 2025 17:53
@jim-taylor-business
Copy link
Contributor Author

i've refactored things as i hope we discussed @notriddle @Martin1887

please let me know if i need to make changes

pulldown-cmark/src/lib.rs Show resolved Hide resolved
@ollpu
Copy link
Collaborator

ollpu commented Jul 10, 2025

The "arbitrary kind" string (::: this) and how it can be extended seems very similar to #1013. We could try to find a uniform solution to both. (and finally get the block quote markers past the finish line as well)

I'll try to think about this more in the coming days.

@Martin1887
Copy link
Collaborator

Good point, thanks!

@jim-taylor-business
Copy link
Contributor Author

i have updated the PR to reflect all the changes suggested

@ollpu @Martin1887 @notriddle

with the exception of this test

::: block     

content

::: end

i'm ok with what this PR generates but i know you like to be compatible

@ollpu
Copy link
Collaborator

ollpu commented Aug 2, 2025

Cool, thanks!

Do you think you can make nesting work as well?

:::: a
::: b

:::
::::

gives

<div class="a">
<p>b</p>
</div>

while I expected nested divs

@jim-taylor-business
Copy link
Contributor Author

jim-taylor-business commented Aug 7, 2025

while investigating this bug i changed the container parsing logic significantly

@ollpu @Martin1887 @notriddle

i have added more tests and, of course, now everything passes

this PR does not support the ::: {foo,bar} syntax

any feedback welcome

pulldown-cmark/src/parse.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

@ollpu ollpu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, and apologies again about the delay. This is starting to look quite robust, but there are still some issues. Nesting is tricky and I guess we don't have any other feature to take inspiration from directly.

Consider this:

::: container
> shouldn't close, right?
> :::
> x  
:::

<div class="container">
<blockquote>
<p>shouldn't close, right?</p>
</blockquote>
</div>
<blockquote>
<p>x
:::</p>
</blockquote>

Another thing we try to stay conscious of is time complexity, to make sure untrusted input cannot cause parsing to become too slow. The current implementation seems to be quadratic when nesting containers.

With input generated like this (replace N with some number):

python -c 'print("\n".join(":"*k + "x" for k in range(N, 2, -1)))'

If I double the value of N, the size of the input quadruples (more or less). However, the runtime seems to slow down by ~16x:

N time (s)
1000 0.32
2000 4.99
4000 85.80

That would indicate quadratic run time growth with respect to input size, which we generally want to avoid.

One solution to this is to limit the amount of nesting (or simply colons recognized, before saturating (similar to how {} are handled in math spans).

Don't stress about these too much. The runtime issue can definitely be fixed later. For now it would be enough to have something that works and is well specified by the spec tests. If nesting proves too difficult, we can also leave it out and try to get it working later.

pulldown-cmark/src/parse.rs Outdated Show resolved Hide resolved
@jim-taylor-business
Copy link
Contributor Author

thanks for the feedback @ollpu

will get stuck into it when i can 😃

@jim-taylor-business
Copy link
Contributor Author

hi @ollpu @Martin1887 @notriddle

tried to address everything mentioned by @ollpu

let me know what you think 😄

Copy link
Collaborator

@ollpu ollpu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, good work! I like the approach, but the container_depth counter is a pervasive change to firstpass.rs and I believe it's not really needed. Same with the depth field of ItemBody::Container. I'm still learning new things about the feature in commonmark.hs and others - apparently nesting can be done without increasing fence lengths too 😄

I gather that FirstPass.container_depth has two purposes: to enforce a nesting limit and to ensure that ::: can't close things outside their scope.

The first use case could be replaced by checking self.tree.spine_len(). That more or less provides the current container nesting level.

For the second use case, we can instead determine the scope from the ancestors when iterating self.tree.walk_spine().rev(). Whenever we encounter a node other than a Container or Paragraph, we can stop iterating. No container blocks that were opened outside that node should be closed inside it. I've implemented this idea in the first two code suggestions below - tests pass. The depth field of ItemBody::Container becomes unnecessary.

My suggestions also align the following test with commonmark-hs. A closing fence can close container blocks that were started outside other container blocks (but not other kinds of nodes):

::: a
:::: b
::::: c
::::
x

commonmark-hs:

<div class="a">
<div class="b">
<div class="c">

</div>
</div>
<p>x</p>
</div>

These tests are getting a little abstract aren't they...

pulldown-cmark/src/firstpass.rs Outdated Show resolved Hide resolved
pulldown-cmark/src/firstpass.rs Outdated Show resolved Hide resolved
pulldown-cmark/src/firstpass.rs Outdated Show resolved Hide resolved
let fence_length =
scan_ch_repeat(&bytes[(start_ix + line_start.bytes_scanned())..], b':');

if fence_length > u8::MAX as usize || self.container_depth == u8::MAX {
Copy link
Collaborator

@ollpu ollpu Oct 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With regards to using u8 for the fence length, I think a behavior like this would be more consistent with other features:

If fence_length > 255, treat it as if it were only 255. So 255 colons is enough to close a container that was started with 500 colons. The feature still keeps working, but the count isn't checked beyond the limit.

Test case: python -c 'print(":"*256 + " x\n" + ":"*255)' should parse as a container without literal :s left over

IndentCodeBlock,
HtmlBlock,
BlockQuote(Option<BlockQuoteKind>),
Container(u8, u8, ContainerKind, CowIndex),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a comment describing what the fields mean? Or just make it a struct variant with named fields - I think we want to switch the others to that sooner or later as well.

Comment on lines 277 to 280
let fence_length =
scan_rev_while(&self.bytes[self.ix..(self.ix + nl_ix - eol_length)], |c| {
c == b':'
});
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it's not intentional that the closing fence doesn't need to be on its own line, but any other content from the beginning is discarded?

::: a
content :::

gives

<div class="a"></div>

Copy link
Contributor Author

@jim-taylor-business jim-taylor-business Oct 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if i understand the spec correctly:

https://github.com/jgm/commonmark-hs/blob/master/commonmark-extensions/test/fenced_divs.md

It ends with a closing fence: a line beginning with at least as many consecutive : characters as in the opening fence, followed by optional whitespace and the end of the line.

so this will be closed by default and interpret the :s as text. i have added the test accordingly

Comment on lines 527 to 532
pub(crate) fn scan_nextline(bytes: &[u8]) -> usize {
memchr(b'\n', bytes).map_or(bytes.len(), |x| x + 1)
memchr(b'\n', bytes).map_or(memchr(b'\r', bytes).map_or(bytes.len(), |x| x + 1), |x| {
x + 1
})
}

Copy link
Collaborator

@ollpu ollpu Oct 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is a bit scary. I think it reveals some bugs in the existing uses of the function, since CommonMark defines a lone carriage return \r as a valid line ending. And I don't think this change will make it work correctly in that case either, as it's just a fallback if no \n are found anywhere.

We should carefully evaluate the impact before making this change.

Does this fix something for you even when the input is all CRLF (\r\n)? Such as if you're given a slice where the \n has been stripped out, but not \r? In that case I would try to find another more local fix.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened #1056

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i have reverted this change so it can be addressed properly as you suggest

@jim-taylor-business
Copy link
Contributor Author

amazing feedback @ollpu thanks!

These tests are getting a little abstract aren't they...

the more the merrier 😄

i have tried to implement everything you suggested

let me know if there's anything else

pulldown-cmark/src/firstpass.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

@notriddle notriddle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pretty much looks good to me. The test case covers most of the possibilities, and the implementation fits in with the rest of the code.

Other than some low-hanging optimization fruit, it looks fine.

Co-authored-by: Michael Howell <michael@notriddle.com>
Copy link
Collaborator

@ollpu ollpu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good now. Thanks for your continued efforts.

@ollpu ollpu changed the base branch from master to branch-0.14 October 30, 2025 18:00
@ollpu ollpu merged commit ee8237c into pulldown-cmark:branch-0.14 Oct 30, 2025
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants

Morty Proxy This is a proxified and sanitized view of the page, visit original site.