topmark.pipeline.hints

Hint taxonomy and normalization utilities for the TopMark pipeline.

This module defines the canonical vocabulary and structure for diagnostic hints emitted by pipeline steps, as well as helpers for hint aggregation, ranking, and selection. Hints are lightweight, non-binding messages that help explain intermediate conditions or advisory diagnostics (e.g., "would insert header", "file contains mixed line endings"). They are used for telemetry, CLI feedback, and public API inspection but do not influence control flow directly.

Overview

  • Axis: enumerates stable pipeline axes that can emit hints.
  • KnownCode: curated list of common, machine-friendly hint codes.
  • Hint: dataclass capturing a normalized hint payload.
  • make_hint: factory helper to create validated, consistent Hint objects.
  • HintLog: mutable container for per-context hints with convenience helpers.
  • select_headline_hint: ranking helper that selects the most relevant hint.

Design principles
  • Hints are diagnostic only; they never alter processing behavior.
  • Axis values are stable and map 1:1 to pipeline status axes.
  • KnownCode covers frequent cases but is not exhaustive; ad-hoc string codes are allowed.
  • ProcessingContext.add_hint stores normalized hints to simplify aggregation, ranking (headline selection), and coarse bucketing.
  • HintLog and select_headline_hint centralize ranking and aggregation logic.

Axis

Bases: EnumIntrospectionMixin, str, Enum

Canonical axes that can emit hints.

Values are short, machine-friendly strings that align with pipeline status axes. These are intentionally stable; prefer adding new codes over adding new axes.

Members

  • RESOLVE: File type resolution phase.
  • FS: File system checks (existence, permissions, binary, newline mix).
  • CONTENT: Text content/read phase (encoding, policy skips).
  • HEADER: Header scan/parse phase (presence, bounds, malformed).
  • GENERATION: Field generation/build phase.
  • RENDER: Header rendering phase.
  • COMPARISON: Content/format comparison phase.
  • STRIP: Header removal phase.
  • PLAN: Update planning phase (insert/replace/remove).
  • PATCH: Patch generation phase (unified diff text).
  • WRITE: Final write/emit phase (file/stdout/preview).

Notes

Axis values should change only in major versions. When new situations arise, add or reuse codes under an existing axis before introducing a new axis.
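
Because Axis mixes in str, its members can be compared with and serialized as plain strings. The following is a minimal sketch using a stand-in enum; the member values "fs" and "plan" are assumptions (the real values are not listed on this page), and the EnumIntrospectionMixin base is omitted.

```python
import json
from enum import Enum


class Axis(str, Enum):
    """Stand-in for topmark.pipeline.hints.Axis; member values are assumed."""

    FS = "fs"
    PLAN = "plan"


# The str mixin makes members compare equal to their raw string value.
is_fs = Axis.FS == "fs"

# A member can be restored from a stored string value.
restored = Axis("plan")

# json.dumps serializes a str-based member as its underlying value.
payload = json.dumps({"axis": Axis.FS})
```

This is one reason stable values matter: any persisted or logged axis string has to keep resolving to the same member across releases.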

Cluster

Bases: EnumIntrospectionMixin, StyledStrEnum

Coarse, outcome-oriented groups for hints.

Clusters provide a small set of semantically meaningful buckets used for CLI headline ranking and mapping to StyleRole for renderers (e.g., ERROR outranks UNCHANGED). Renderers should map StyleRole to concrete styling (color, bold, etc.).
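
Cluster-based headline ranking could look roughly like the sketch below. It is not the actual select_headline_hint implementation: the function name pick_headline, the reduced Hint stand-in, and the full priority order are hypothetical. Only the rule that an error cluster outranks an unchanged one comes from the description above.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical priority order: a lower number wins. Only "error outranks
# unchanged" is taken from the documentation; the rest is assumed.
CLUSTER_RANK = {"error": 0, "changed": 1, "skipped": 2, "unchanged": 3}


@dataclass(frozen=True)
class Hint:
    """Reduced stand-in for the real Hint dataclass."""

    code: str
    message: str
    cluster: str


def pick_headline(hints: list[Hint]) -> Hint | None:
    """Return the hint whose cluster ranks highest, or None if there are none."""
    if not hints:
        return None
    # Unknown clusters sort last; min() keeps the earliest hint on ties.
    return min(hints, key=lambda h: CLUSTER_RANK.get(h.cluster, len(CLUSTER_RANK)))


hints = [
    Hint("compare:unchanged", "header is up to date", "unchanged"),
    Hint("write:failed", "could not write file", "error"),
]
headline = pick_headline(hints)
```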

Accessibility

Do not rely solely on color to convey meaning. Always include a textual label (e.g., "changed", "skipped").
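
A renderer can follow this guideline by always emitting the textual label and treating color as an optional layer on top. The sketch below is an illustration only: the cluster names, the ANSI color choices, and the render_label helper are hypothetical and do not reflect TopMark's StyleRole mapping.

```python
# Hypothetical cluster names mapped to ANSI foreground color codes.
ANSI_COLORS = {"error": "31", "changed": "33", "skipped": "36", "unchanged": "32"}


def render_label(cluster: str, use_color: bool) -> str:
    """Render a bracketed text label, optionally wrapped in an ANSI color."""
    label = f"[{cluster}]"
    color = ANSI_COLORS.get(cluster)
    if not use_color or color is None:
        # The label alone still carries the full meaning.
        return label
    return f"\033[{color}m{label}\033[0m"


plain = render_label("changed", use_color=False)
colored = render_label("changed", use_color=True)
```

With color disabled (for example, when output is piped to a file), the reader still sees "[changed]".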

KnownCode

Bases: EnumIntrospectionMixin, str, Enum

Illustrative, non-exhaustive set of codes.

Codes are short, namespaced, and machine-friendly. Treat this enum as a convenient source of well-known codes used across TopMark. Do not require all codes to live here; Hint accepts arbitrary strings so extensions and experiments remain frictionless.

Examples (selected):

  discovery/resolve: 'discovery:unsupported', 'discovery:no_processor'
  fs:                'fs:mixed_newlines', 'fs:bom_before_shebang'
  content:           'content:skipped_bom_shebang', 'content:skipped_mixed'
  header:            'header:detected', 'header:missing', 'header:empty', 'header:malformed'
  generation:        'generation:no_fields', 'generation:generated'
  render:            'render:rendered'
  comparison:        'compare:changed', 'compare:unchanged'
  strip:             'strip:ready', 'strip:none', 'strip:failed'
  plan:              'plan:insert', 'plan:update', 'plan:remove', 'plan:skip'
  patch:             'patch:generated', 'patch:skipped', 'patch:failed'
  write:             'write:written', 'write:previewed', 'write:skipped', 'write:failed'

Notes

Additions here are source-compatible. Removals or renames are breaking and should only occur in major releases.
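
Since codes follow a namespace:name shape, consumers can bucket them by namespace without knowing every code in advance. A minimal sketch follows; split_code is a hypothetical helper (not part of TopMark), and the un-namespaced "experimental" code is an invented ad-hoc example.

```python
from collections import Counter


def split_code(code: str) -> tuple[str, str]:
    """Split 'namespace:name' into its parts; namespace is '' when absent."""
    namespace, sep, name = code.partition(":")
    return (namespace, name) if sep else ("", code)


# Mix of well-known codes and one ad-hoc, un-namespaced string.
codes = ["fs:mixed_newlines", "plan:insert", "plan:skip", "experimental"]

# Count how many hints each namespace produced.
by_namespace = Counter(split_code(c)[0] for c in codes)
```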

Hint dataclass

Hint(
    *,
    axis,
    code,
    message,
    detail=None,
    cluster=None,
    terminal=False,
    reason=None,
    meta=None,
)

Normalized hint payload attached to a ProcessingContext.

Attributes:

  • axis (Axis): Pipeline axis emitting the hint (e.g., Axis.FS).
  • code (str): Stable machine key (e.g., 'fs:mixed_newlines', 'plan:insert'). Accepts any string; prefer KnownCode members when available.
  • message (str): Short summary line suitable for single-line CLI output.
  • detail (str | None): Optional extended diagnostic text (possibly multi-line) that callers may render only at higher verbosity levels.
  • cluster (str | None): Optional broader grouping key for bucketing/analytics. Defaults to code when omitted.
  • terminal (bool): Whether the condition represents a terminal/stop state.
  • reason (str | None): Optional additional detail (status value, policy id).
  • meta (dict[str, object] | None): Optional extensibility bag (free-form).

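Example
from topmark.pipeline.hints import Axis, KnownCode, make_hint

# Build a non-terminal hint on the file system axis.
hint = make_hint(
    axis=Axis.FS,
    code=KnownCode.FS_MIXED_NEWLINES,
    message="File contains mixed line endings; policy may allow proceeding",
    terminal=False,
)
# ctx is assumed to be an existing ProcessingContext for the current file.
ctx.add_hint(hint)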

HintLog dataclass

HintLog(*, items=(lambda: [])())

Mutable, per-context collection of diagnostic hints.

This wrapper keeps hint aggregation and logging concerns local and provides a small façade (add, headline) so that callers do not depend on the concrete list representation.

add

add(hint)

Attach a structured hint to the hint log.

Hints are non-binding diagnostics used to refine human-readable summaries without affecting the core outcome classification.

The hint collection is updated in place.

Parameters:

  • hint (Hint, required): The hint instance to attach.

Source code in src/topmark/pipeline/hints.py
def add(self, hint: Hint) -> None:
    """Attach a structured hint to the hint log.

    Hints are non-binding diagnostics used to refine human-readable
    summaries without affecting the core outcome classification.

    The hint collection is updated in place.

    Args:
        hint: The hint instance to attach.
    """
    self.items.append(hint)
    if logger.isEnabledFor(logging.DEBUG):
        logger.debug(
            "added hint axis=%s code=%s message=%s", hint.axis, hint.code, hint.message
        )
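
The `items=(lambda: [])()` default in the signature above is most likely how the documentation tool renders a dataclass default factory. A self-contained sketch of that pattern is shown below; the Hint and HintLog classes here are reduced stand-ins, not the TopMark classes, and the keyword-only constructor and debug logging are omitted.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Hint:
    """Reduced stand-in for the real Hint dataclass."""

    axis: str
    code: str
    message: str


@dataclass
class HintLog:
    """Stand-in container: each instance gets its own fresh list."""

    items: list[Hint] = field(default_factory=list)

    def add(self, hint: Hint) -> None:
        # Mutates the log in place, as the real add() is documented to do.
        self.items.append(hint)


log_a = HintLog()
log_b = HintLog()
log_a.add(Hint(axis="fs", code="fs:mixed_newlines", message="mixed line endings"))
```

A default factory avoids the shared-mutable-default pitfall: adding to log_a leaves log_b empty.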

make_hint

make_hint(
    *,
    axis,
    code,
    message,
    detail=None,
    cluster=None,
    terminal=False,
    reason=None,
    meta=None,
)

Create a normalized Hint with light validation and defaults.

Accepts either a KnownCode or an arbitrary string for code. If cluster is not provided, it defaults to the string form of code. When a Cluster is supplied for cluster, its string value is used.

Parameters:

  • axis (Axis, required): Axis emitting the hint.
  • code (KnownCode | str, required): Stable machine key for the condition.
  • message (str, required): Human-readable short summary line.
  • detail (str | None, default None): Optional extended diagnostic text rendered at higher verbosity (e.g., multi-line config snippets or rationale).
  • cluster (Cluster | str | None, default None): Optional grouping key; defaults to code.
  • terminal (bool, default False): Whether this condition is terminal.
  • reason (str | None, default None): Optional detail string.
  • meta (dict[str, object] | None, default None): Optional extensibility bag.

Returns:

  • Hint: Frozen, normalized hint object.

Example
make_hint(axis=Axis.PLAN, code=KnownCode.PLAN_INSERT, message="would insert header")
Source code in src/topmark/pipeline/hints.py
def make_hint(
    *,
    axis: Axis,
    code: KnownCode | str,
    message: str,
    detail: str | None = None,
    cluster: Cluster | str | None = None,
    terminal: bool = False,
    reason: str | None = None,
    meta: dict[str, object] | None = None,
) -> Hint:
    """Create a normalized `Hint` with light validation and defaults.

    Accepts either a `KnownCode` or an arbitrary string for ``code``. If
    ``cluster`` is not provided, it defaults to the string form of ``code``. When
    a `Cluster` is supplied for ``cluster``, its string value is used.

    Args:
        axis: Axis emitting the hint.
        code: Stable machine key for the condition.
        message: Human-readable short summary line.
        detail: Optional extended diagnostic text rendered at higher
            verbosity (e.g., multi-line config snippets or rationale).
        cluster: Optional grouping key; defaults to ``code``.
        terminal: Whether this condition is terminal.
        reason: Optional detail string.
        meta: Optional extensibility bag.

    Returns:
        Frozen, normalized hint object.

    Example:
        ```python
        make_hint(axis=Axis.PLAN, code=KnownCode.PLAN_INSERT, message="would insert header")
        ```
    """
    code_str: str = code.value if isinstance(code, KnownCode) else str(code)
    cluster_str: str = (
        (cluster.value if isinstance(cluster, Cluster) else cluster)
        if cluster is not None
        else code_str
    )

    return Hint(
        axis=axis,
        code=code_str,
        message=message,
        detail=detail,
        cluster=cluster_str,
        terminal=terminal,
        reason=reason,
        meta=meta,
    )
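
The normalization rules above can be exercised in isolation with stand-in types. This sketch mirrors the code and cluster handling of make_hint but does not import TopMark: the normalize function, the reduced Hint fields, the Cluster.CHANGED member, and the assumed value of KnownCode.PLAN_INSERT are illustrative assumptions.

```python
from __future__ import annotations

from dataclasses import FrozenInstanceError, dataclass
from enum import Enum


class KnownCode(str, Enum):
    PLAN_INSERT = "plan:insert"  # assumed value for the documented member


class Cluster(str, Enum):
    CHANGED = "changed"  # hypothetical member, for illustration only


@dataclass(frozen=True)
class Hint:
    """Reduced stand-in for the real Hint dataclass."""

    code: str
    message: str
    cluster: str


def normalize(code: KnownCode | str, message: str, cluster: Cluster | str | None = None) -> Hint:
    """Apply the same code/cluster normalization that make_hint describes."""
    code_str = code.value if isinstance(code, KnownCode) else str(code)
    if cluster is None:
        cluster_str = code_str  # cluster falls back to the code string
    elif isinstance(cluster, Cluster):
        cluster_str = cluster.value
    else:
        cluster_str = cluster
    return Hint(code=code_str, message=message, cluster=cluster_str)


defaulted = normalize(KnownCode.PLAN_INSERT, "would insert header")
explicit = normalize("ext:custom", "ad-hoc code", cluster=Cluster.CHANGED)

# Frozen dataclasses reject attribute assignment after construction.
try:
    defaulted.message = "edited"  # type: ignore[misc]
    was_frozen = False
except FrozenInstanceError:
    was_frozen = True
```

Storing plain strings rather than enum members keeps the resulting hint easy to serialize and lets ad-hoc codes and clusters flow through the same path as the well-known ones.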