topmark.processors.types

Type definitions for the pipeline processing layer.

This module provides structured type definitions (dataclasses and enums) used to pass data between the pipeline's distinct phases. Compared to bare tuples or dictionaries, these types make complex return values clearer and safer to consume, because each value is accessed by name and carries a declared type.

HeaderParseResult dataclass

HeaderParseResult(
    *, fields=(lambda: {})(), success_count=0, error_count=0
)

Result of parsing key-value fields from a header block.

This dataclass provides a structured and type-safe alternative to a bare return tuple, ensuring that consuming code can access the parsed data and metrics by name. The initializer requires all arguments to be passed by keyword.

Attributes:

fields (dict[str, str]): Mapping of all successfully parsed header fields (key → value). Defaults to an empty dictionary.

success_count (int): The number of header lines that were successfully parsed and added to the fields dictionary. Defaults to 0.

error_count (int): The number of header lines that were malformed (e.g., missing a colon, or having an empty field name). Defaults to 0.

BoundsKind

Bases: Enum

Discriminant for header-bound detection results.

Members

  • SPAN: A valid header span was found.
  • MALFORMED: Header markers exist, but their shape is invalid (e.g., only end, only start, multiple starts/ends, or end before start).
  • NONE: No header markers were detected.

HeaderBounds dataclass

HeaderBounds(*, kind, start=None, end=None, reason=None)

Structured result for header-bound detection.

This is a discriminated union controlled by kind:

  • When kind is BoundsKind.SPAN:
    • start and end are required (0-based line indexes).
    • start is inclusive, end is exclusive (slice-friendly).
    • reason is unused (None).
  • When kind is BoundsKind.MALFORMED:
    • start/end MAY be provided to pinpoint the offending region (best-effort; if unknown, they can be None).
    • reason SHOULD explain the malformed shape (e.g., "end without start").
  • When kind is BoundsKind.NONE:
    • No markers were detected; start/end/reason are None.

Attributes:

kind (BoundsKind): Discriminant of the result.

start (int | None): Start line index (inclusive) when a span is available.

end (int | None): End line index (exclusive) when a span is available.

reason (str | None): Human-readable reason when kind is MALFORMED.

StripDiagKind

Bases: Enum

Outcome classification for header stripping operations.

Members

  • REMOVED: A header was found and removed successfully.
  • NOT_FOUND: No header was detected; no changes made.
  • MALFORMED_REFUSED: Malformed header markers detected; removal refused by policy.
  • MALFORMED_REMOVED: Malformed markers detected but removal performed (if policy allows).
  • NOOP_EMPTY: File effectively empty; nothing to remove.
  • ERROR: Unexpected error encountered; no changes made.

StripDiagnostic dataclass

StripDiagnostic(
    *,
    kind,
    reason=None,
    removed_span=None,
    notes=list[str](),
)

Diagnostic payload describing a strip attempt.

Attributes:

kind (StripDiagKind): High-level outcome classification.

reason (str | None): Optional human-readable explanation (e.g., policy gate or malformed reason).

removed_span (tuple[int, int] | None): Inclusive (start, end) span of the removed header in the original input; present only when a header was actually removed.

notes (list[str]): Additional details for logging or user-facing hints.

StripHeaderResult dataclass

StripHeaderResult(*, lines, removed_span, diagnostic)

Result of attempting to remove a TopMark header from file lines.

Attributes:

lines (list[str]): Updated file lines. This is the original line list when no header was removed or when removal was refused.

removed_span (tuple[int, int] | None): Inclusive (start, end) line span of the removed header in the original input, or None when no header was removed.

diagnostic (StripDiagnostic): Diagnostic payload describing the strip attempt outcome.