Skip to content

topmark.processors.builtins.markdown

topmark / processors / builtins / markdown

Processor for files with HTML/XML-style block comments ().

This processor treats Markdown as a line-based format that happens to support HTML comment blocks (<!-- ... -->) for the TopMark header. It does not use XML positional logic or prolog/doctype handling; headers are inserted at the top of the document (line-based strategy).

Markdown-specific behavior
  • Uses HTML block comments (<!-- ... -->) as the header block wrapper.
  • Inserts the TopMark header at the logical top of the file (index 0) so it precedes any existing banner comments or headings.
  • Ignores TopMark markers that appear inside fenced code blocks (``` / ~~~), even if they look like partial or malformed headers. This keeps README examples from being treated as real headers and allows Builder/Planner to insert a header at the top when appropriate.

MarkdownHeaderProcessor

MarkdownHeaderProcessor()

Bases: BlockCommentMixin, HeaderProcessor

Header processor for Markdown formats (HTML comment-based, line-oriented).

This processor uses <!-- ... --> block comments for the TopMark header and relies on the line-based placement strategy from HeaderProcessor. It adds Markdown-specific safeguards to ignore header-like markers that appear inside fenced code blocks (``` or ~~~).

Source code in src/topmark/processors/builtins/markdown.py
def __init__(self) -> None:
    # Use HTML-style block comment delimiters for the header block.
    super().__init__(
        block_prefix="<!--",
        block_suffix="-->",
    )

get_header_bounds

get_header_bounds(*, lines, newline_style)

Detect the TopMark header in Markdown, ignoring fenced code blocks.

This override delegates to the base header-bounds logic after hiding any TopMark markers that appear inside fenced code blocks (``` or ~~~). Markers in fenced regions are treated as plain text so README examples do not affect header detection.

Source code in src/topmark/processors/builtins/markdown.py
def get_header_bounds(
    self,
    *,
    lines: Iterable[str],
    newline_style: str,
) -> HeaderBounds:
    """Detect the TopMark header in Markdown, ignoring fenced code blocks.

    This override delegates to the base header-bounds logic after hiding any
    TopMark markers that appear inside fenced code blocks (``` or ~~~). Markers
    in fenced regions are treated as plain text so README examples do not affect
    header detection.
    """
    # Materialize once to avoid exhausting generators multiple times.
    buf: list[str] = list(lines)
    if not buf:
        return HeaderBounds(kind=BoundsKind.NONE)

    # Compute fence mask on the concrete buffer.
    fence_mask: list[bool] = self._compute_fence_mask(buf)

    # Create a filtered copy where markers inside fences are removed so the
    # base get_header_bounds logic never sees them.
    filtered: list[str] = list(buf)
    for i, ln in enumerate(buf):
        if fence_mask[i]:
            filtered[i] = ln.replace(TOPMARK_START_MARKER, "").replace(TOPMARK_END_MARKER, "")

    # Delegate to the base logic using the filtered view. Indices in the
    # returned HeaderBounds still correspond to the original buffer because
    # we preserved length and line ordering.
    return super().get_header_bounds(lines=filtered, newline_style=newline_style)

prepare_header_for_insertion

prepare_header_for_insertion(
    *,
    original_lines,
    insert_index,
    rendered_header_lines,
    newline_style,
)

Ensure a single blank line after the header block in Markdown.

  • Insert at top of file (index 0) with no leading blank.
  • If body content follows and the first body line is not blank, append exactly one blank line after the header block.
Source code in src/topmark/processors/builtins/markdown.py
def prepare_header_for_insertion(
    self,
    *,
    original_lines: list[str],
    insert_index: int,
    rendered_header_lines: list[str],
    newline_style: str,
) -> list[str]:
    """Ensure a single blank line after the header block in Markdown.

    - Insert at top of file (index 0) with no leading blank.
    - If body content follows and the first body line is *not* blank,
      append exactly one blank line after the header block.
    """
    out: list[str] = list(rendered_header_lines)

    # If there is body content at the insert site, ensure a blank between
    # the header block and that content.
    if insert_index < len(original_lines):
        next_line: str = original_lines[insert_index]
        if next_line.strip() != "":
            # Add one blank line after the block
            out.append(newline_style)

    return out