topmark.processors.mixins
Common mixins for header processors.
These mixins provide reusable, state-light behavior for:
- Line-comment based processors (e.g., Pound/Slash) via LineCommentMixin.
- Positional, tag- or prolog-sensitive processors (e.g., XML/HTML) via XmlPositionalMixin.
- Shebang-aware insertion rules via ShebangAwareMixin.
They do not change public behavior on their own. Processors can adopt these
mixins to share well-tested logic and reduce duplication. Delimiter attributes
such as line_prefix and block_prefix remain normal instance attributes so
processor instances may override them during construction or registration.
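The override behavior described above can be illustrated with a minimal sketch. The classes `LineCommentSketch` and `SketchProcessor` are hypothetical stand-ins written for this example, not TopMark's actual classes; only the attribute names line_prefix and line_suffix come from this page.

```python
from typing import Optional


class LineCommentSketch:
    """Hypothetical stand-in for a line-comment mixin (not TopMark's class)."""

    # Class-level defaults; instances may shadow them.
    line_prefix: str = "#"
    line_suffix: str = ""

    def render_header_line(self, content: str) -> str:
        # Join only the non-empty parts with single spaces.
        parts = [self.line_prefix, content, self.line_suffix]
        return " ".join(part for part in parts if part)


class SketchProcessor(LineCommentSketch):
    """Hypothetical processor that adopts the mixin."""

    def __init__(self, *, line_prefix: Optional[str] = None) -> None:
        if line_prefix is not None:
            # Instance-level override during construction.
            self.line_prefix = line_prefix


pound = SketchProcessor()
slash = SketchProcessor(line_prefix="//")
```

Because the override is set on the instance, `pound` keeps the class default while `slash` renders with its own delimiter.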
ShebangAwareMixin
Utilities for shebang-aware insertion anchors.
Notes
These helpers operate on line-oriented content. Processors that manage
character offsets should translate as needed (e.g., splitlines(True)).
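One way to translate a line index into a character offset is sketched below. The helper name `line_index_to_offset` is an assumption made for this example and is not part of the documented API; it relies only on `str.splitlines(True)`, which keeps line terminators so that line lengths add up to offsets.

```python
def line_index_to_offset(text: str, line_index: int) -> int:
    """Return the character offset at which line `line_index` starts.

    Hypothetical helper: splitlines(True) keeps each line terminator,
    so summing the lengths of the preceding lines yields the offset.
    """
    lines = text.splitlines(True)
    return sum(len(line) for line in lines[:line_index])


text = "#!/bin/sh\r\necho hi\n"
offset = line_index_to_offset(text, 1)  # start of the second line
```

Joining the kept-ends lines reproduces the original text exactly, which is what makes the round trip between line indices and offsets safe.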
LineCommentMixin
Bases: ShebangAwareMixin
Shared helpers for line-comment based processors.
Processors define line-comment delimiter attributes, usually during
construction:
* line_prefix: the comment introducer for a header line (e.g., #).
* line_suffix: optional trailing comment portion to append (e.g., */).
Methods here centralize header line normalization, scanning, and safe insertion point computation (shebang aware).
is_header_line
Return True if the line begins with the configured comment prefix.
strip_line_prefix
Remove a single leading comment prefix; return the original line if the prefix is absent.
render_header_line
Render a header line with the prefix and an optional suffix.
The caller is responsible for appending a newline when assembling multiple lines.
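The three helpers above might behave roughly as in the following sketch. These are standalone functions written for illustration, not the mixin's methods; the spacing rules (one space between prefix, content, and suffix, and at most one space removed after the prefix) are assumptions, since this page does not specify them.

```python
def is_header_line_sketch(line: str, prefix: str) -> bool:
    # Assumption: a plain startswith check with no whitespace tolerance.
    return line.startswith(prefix)


def strip_line_prefix_sketch(line: str, prefix: str) -> str:
    # Remove one leading prefix only; return the line unchanged if absent.
    if not line.startswith(prefix):
        return line
    rest = line[len(prefix):]
    # Assumption: drop at most one space that separated prefix and content.
    return rest[1:] if rest.startswith(" ") else rest


def render_header_line_sketch(content: str, prefix: str, suffix: str = "") -> str:
    # No newline is appended; the caller adds it when assembling lines.
    parts = [prefix, content, suffix]
    return " ".join(part for part in parts if part)


header = [render_header_line_sketch(c, "#") + "\n" for c in ("title", "")]
```

Leaving the newline to the caller keeps the renderer independent of the file's newline style.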
Source code in src/topmark/processors/mixins.py
find_insertion_index
Determine where a header should be inserted for line-comment files.
Behavior
- If a FileTypeHeaderPolicy is attached to this processor's file type and explicitly sets supports_shebang=False, do not skip a leading shebang even if present.
- Otherwise, skip a leading shebang at line 0.
- If the policy declares an encoding_line_regex and a shebang was skipped, also skip a single encoding line immediately following.
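The rules above can be sketched as a standalone function. This is an illustration under stated assumptions, not the mixin's implementation: the policy is reduced to two plain keyword arguments, the function name is hypothetical, and the regex in the usage lines is a simplified PEP 263-style pattern chosen for the example.

```python
import re
from typing import Optional, Sequence


def insertion_index_sketch(
    lines: Sequence[str],
    *,
    supports_shebang: Optional[bool] = None,
    encoding_line_regex: Optional[str] = None,
) -> int:
    """Return the line index at which a header would be inserted."""
    # Only an explicit False disables shebang skipping.
    if supports_shebang is False:
        return 0
    index = 0
    if lines and lines[0].startswith("#!"):
        index = 1  # skip the shebang at line 0
        # The encoding line is considered only when a shebang was skipped.
        if (
            encoding_line_regex is not None
            and index < len(lines)
            and re.search(encoding_line_regex, lines[index])
        ):
            index += 1
    return index


script = ["#!/usr/bin/env python3\n", "# -*- coding: utf-8 -*-\n", "import os\n"]
ENCODING = r"coding[:=]\s*[-\w.]+"  # assumed pattern for this example
anchor = insertion_index_sketch(script, encoding_line_regex=ENCODING)
```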
prepare_header_for_insertion
prepare_header_for_insertion(
*,
original_lines,
insert_index,
rendered_header_lines,
newline_style,
)
Apply context-aware padding around the header for line-comment styles.
Design goals
- Preserve user whitespace verbatim. In particular, if the first body line is whitespace-only (e.g., " \n"), do not rewrite or collapse it to an exact blank.
- Add at most one exact blank separator (newline_style) that we own after the header only if body content follows and the next line is not already an exact blank. Never add a spacer at EOF.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| original_lines | list[str] | Original file lines (keepends=True). | required |
| insert_index | int | Line index where the header will be inserted. | required |
| rendered_header_lines | list[str] | Header lines to insert (keepends=True). | required |
| newline_style | str | Newline style ( | required |
Returns:
| Type | Description |
|---|---|
| list[str] | Possibly modified header lines including any added padding. |
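The trailing-spacer rule from the design goals can be sketched as follows. The function below is a simplified illustration, not the mixin's method: it assumes an "exact blank" is a line consisting solely of a line terminator, and it models only the separator after the header.

```python
from typing import List

EXACT_BLANKS = ("\n", "\r\n", "\r")  # assumed definition of an exact blank


def pad_header_sketch(
    original_lines: List[str],
    insert_index: int,
    rendered_header_lines: List[str],
    newline_style: str,
) -> List[str]:
    """Return a copy of the header lines, plus at most one blank separator."""
    out = list(rendered_header_lines)
    # Never add a spacer when inserting at EOF.
    if insert_index >= len(original_lines):
        return out
    # Add the separator only if the next line is not already an exact blank.
    # A whitespace-only line such as " \n" is left untouched in the body.
    if original_lines[insert_index] not in EXACT_BLANKS:
        out.append(newline_style)
    return out


padded = pad_header_sketch(["import os\n"], 0, ["# header\n"], "\n")
```

Returning a new list rather than mutating the input keeps the original file lines intact, which matches the goal of preserving user whitespace verbatim.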
BlockCommentMixin
Shared helpers for block-comment processors (e.g., CSS/JS C-style).
Processors define block-comment delimiter attributes, usually during
construction:
* block_prefix: the opening delimiter (e.g., /* or <!--).
* block_suffix: the closing delimiter (e.g., */ or -->).
The helpers here are intentionally minimal; they can be expanded as we migrate concrete processors and spot duplication opportunities.
is_block_prefix
Return True if line equals the configured block prefix, ignoring only spaces/tabs and EOLs.
Affix equality ignores incidental surrounding spaces; blank collapsing is not performed here. We intentionally do not use str.strip() because it removes all Unicode whitespace (e.g., form-feed), which should remain significant for affix equality.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| line | str | The line to check. | required |
Returns:
| Type | Description |
|---|---|
| bool | True if |
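The difference between a restricted strip and a bare str.strip() can be seen in a short sketch. The function name and signature below are assumptions for illustration; the same comparison would apply to the block suffix.

```python
def is_block_affix_sketch(line: str, affix: str) -> bool:
    # Strip only spaces, tabs, and EOL characters from both ends.
    return line.strip(" \t\r\n") == affix


plain = is_block_affix_sketch("  /*\t\r\n", "/*")
form_feed = is_block_affix_sketch("\f/*\n", "/*")
# A bare strip() would also remove the form feed and hide the difference.
bare_strip = "\f/*\n".strip() == "/*"
```

Here `plain` is True, `form_feed` is False because the form feed stays significant, and `bare_strip` is True, which is the behavior the docstring says to avoid.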
is_block_suffix
Return True if line equals the configured block suffix, ignoring only spaces/tabs and EOLs.
Affix equality ignores incidental surrounding spaces; blank collapsing is not performed here. We intentionally do not use str.strip() because it removes all Unicode whitespace (e.g., form-feed), which should remain significant for affix equality.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| line | str | The line to check. | required |
Returns:
| Type | Description |
|---|---|
| bool | True if |
render_block_line
ensure_block_padding
Ensure the block text ends with a newline.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| rendered_lines | list[str] | Lines that compose the block (including delimiters). | required |
| newline | str | Newline string to enforce at the end ("\n" or "\r\n"). | required |
Returns:
| Type | Description |
|---|---|
| list[str] | Possibly adjusted copy with a trailing newline present. |
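A minimal sketch of this contract follows. It is an illustration rather than the mixin's code, and it assumes that a last line already ending in any line terminator is left unchanged.

```python
from typing import List


def ensure_trailing_newline_sketch(rendered_lines: List[str], newline: str) -> List[str]:
    """Return a copy of the block whose last line ends with a newline."""
    out = list(rendered_lines)  # copy; the caller's list is not modified
    if out and not out[-1].endswith(("\n", "\r")):
        out[-1] = out[-1] + newline
    return out


block = ["/*\n", " * header\n", " */"]
fixed = ensure_trailing_newline_sketch(block, "\n")
```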
XmlPositionalMixin
Helpers for tag-sensitive (positional) processors like XML/HTML.
This mixin offers small, composable predicates and insertion index logic that respect XML declarations and document type declarations (DOCTYPE).
is_xml_declaration
is_doctype_declaration
is_html_comment_open
find_xml_insertion_index
Return the line index after XML declaration and DOCTYPE (if present).
Notes
BOM handling is performed upstream in the reader step; this helper assumes lines are already normalized. The check is purely line-based and does not attempt to coalesce declaration/content that share a single line (the XML processor's char-offset path covers that case).
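A purely line-based version of this lookup could look like the sketch below. The predicates are simplified assumptions (a prefix check after leading whitespace, case-insensitive for DOCTYPE), and a declaration that spans several lines or shares a line with content is deliberately not handled, in keeping with the note above.

```python
from typing import Sequence


def xml_insertion_index_sketch(lines: Sequence[str]) -> int:
    """Return the index of the first line after the XML declaration and DOCTYPE."""
    index = 0
    # Assumed predicate: the XML declaration starts with "<?xml".
    if index < len(lines) and lines[index].lstrip().startswith("<?xml"):
        index += 1
    # Assumed predicate: DOCTYPE is matched case-insensitively.
    if index < len(lines) and lines[index].lstrip().upper().startswith("<!DOCTYPE"):
        index += 1
    return index


doc = ['<?xml version="1.0"?>\n', "<!DOCTYPE html>\n", "<html>\n"]
anchor_line = xml_insertion_index_sketch(doc)
```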
compute_insertion_anchor
Line-based fallback: place the header after the XML declaration and DOCTYPE (if present).
prepare_header_for_insertion_text
prepare_header_for_insertion_text(
*,
original_text,
insert_offset,
rendered_header_text,
newline_style,
)
Adjust whitespace so the header block sits on its own lines.
Blank detection here applies the configured policy in a text
(char-offset) path:
- Leading spacer is added only when inserting after some preamble and the
previous character is not already an EOL.
- Trailing spacer is added only when body content follows and the next
slice up to the next EOL is not a policy-blank (checked via
is_pure_spacer on the slice).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| original_text | str | Full file content as a single string. | required |
| insert_offset | int | 0-based character offset where the header will be inserted. | required |
| rendered_header_text | str | Header block text (may already include newlines). | required |
| newline_style | str | Newline style ( | required |
Returns:
| Type | Description |
|---|---|
| str | Possibly modified header text to splice at |
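The two spacer rules for the text path can be sketched as below. This is a simplified illustration, not the mixin's method: the blank check stands in for is_pure_spacer and assumes a strict policy in which only spaces and tabs count as blank, and the function name is hypothetical.

```python
import re


def pad_header_text_sketch(
    original_text: str,
    insert_offset: int,
    rendered_header_text: str,
    newline_style: str,
) -> str:
    """Return header text padded so that it sits on its own lines."""
    out = rendered_header_text
    # Leading spacer: only after a preamble whose last character is not an EOL.
    if insert_offset > 0 and original_text[insert_offset - 1] not in "\r\n":
        out = newline_style + out
    # Trailing spacer: only when content follows and the slice up to the
    # next EOL is not blank (assumed strict rule: spaces and tabs only).
    rest = original_text[insert_offset:]
    if rest:
        next_slice = re.split(r"\r\n|\r|\n", rest, maxsplit=1)[0]
        if next_slice.strip(" \t") != "":
            out = out + newline_style
    return out


decl = '<?xml version="1.0"?>'
text = decl + "<root/>\n"
spliced = pad_header_text_sketch(text, len(decl), "<!-- header -->\n", "\n")
result = text[: len(decl)] + spliced + text[len(decl):]
```

In this usage the declaration and the root element share one line, so the sketch adds both a leading and a trailing newline and the header ends up separated from both.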
prepare_header_for_insertion
prepare_header_for_insertion(
*,
original_lines,
insert_index,
rendered_header_lines,
newline_style,
)
Ensure the block itself ends with a newline; add no extra spacer at EOF.
For XML/HTML-like processors that also support line-based insertion, we only guarantee the block terminates with the dominant newline; we do not add a trailing spacer when inserting at EOF (that's handled by the text path or upstream policy).
Notes
Blank detection is policy-aware (STRICT/UNICODE/NONE) via is_pure_spacer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| original_lines | list[str] | Original file lines. | required |
| insert_index | int | Line index where the header will be inserted. | required |
| rendered_header_lines | list[str] | Header lines to insert. | required |
| newline_style | str | Newline style ( | required |
Returns:
| Type | Description |
|---|---|
| list[str] | Possibly modified header lines including any added padding. |