topmark.pipeline.context.model
Processing context model for the TopMark pipeline.
This module defines the core data structures used to represent the state of
a single file as it flows through the TopMark pipeline. The central type is
ProcessingContext, which
carries configuration, status, diagnostics, and view data between steps.
Sections
ProcessingContext: High-level container that represents the per-file processing state and exposes convenience helpers for policy checks, feasibility decisions, and view access.
HaltState: Small helper dataclass that records why and where processing was halted for a given file.
HaltState
dataclass
Information about a terminal halt for a single file.
Instances of this dataclass describe why and where the pipeline
decided to stop processing a file. A non-empty step_name implies
that a step requested an early, graceful halt.
Attributes:
| Name | Type | Description |
|---|---|---|
| reason_code | str | Short machine-friendly reason code explaining why processing was halted. |
| step_name | str | Name of the pipeline step that requested the halt. An empty string indicates that no explicit halt has been recorded. |
ProcessingContext
dataclass
ProcessingContext(
*,
config,
run_options,
path,
policy_registry,
timestamp=None,
steps=(lambda: [])(),
resolution_probe=None,
file_type=None,
status=ProcessingStatus(),
halt_state=None,
header_processor=None,
leading_bom=False,
has_shebang=False,
is_effectively_empty=False,
is_logically_empty=False,
newline_hist=(lambda: {})(),
dominant_newline=None,
dominance_ratio=None,
mixed_newlines=None,
newline_style="\n",
ends_with_newline=None,
pre_insert_capability=InsertCapability.UNEVALUATED,
pre_insert_reason=None,
pre_insert_origin=None,
diagnostics=MutableDiagnosticLog(),
diagnostic_hints=HintLog(),
views=Views(),
)
Context for header processing in the TopMark pipeline.
A ProcessingContext instance represents the complete, mutable state
for a single file as it flows through the pipeline. It holds configuration,
per-axis status, diagnostics, and view data, and it exposes helpers for
policy- and feasibility-related decisions.
Attributes:
| Name | Type | Description |
|---|---|---|
| config | FrozenConfig | Effective layered configuration for this file. |
| run_options | RunOptions | Invocation-wide execution-only runtime options for the current run. |
| path | Path | The file path to process (absolute or relative to the working directory). |
| policy_registry | PolicyRegistry | The policy registry (global + file type specific overrides). |
| timestamp | datetime \| None | The file path's modification timestamp. |
| steps | list[Step[ProcessingContext]] | Ordered list of pipeline steps that have been executed for this context. |
| resolution_probe | ResolutionProbeResult \| None | Probe result explaining file type and processor resolution for the current file path. |
| file_type | FileType \| None | Resolved file type for the file (for example, a Python or Markdown file type), if applicable. |
| status | ProcessingStatus | Aggregated status for each pipeline axis, kept as the single source of truth for per-axis outcomes. |
| halt_state | HaltState \| None | Information about an early, terminal halt for this file. |
| header_processor | HeaderProcessor \| None | Header processor instance responsible for this file type, if any. |
| leading_bom | bool | True if the original file began with a UTF-8 BOM. |
| has_shebang | bool | True if the first logical line is a shebang line. |
| is_effectively_empty | bool | Whether the decoded, BOM-stripped text image contains no non-whitespace characters. Newlines and other whitespace are allowed. This is the broad notion of "empty" used for most policy decisions. |
| is_logically_empty | bool | Whether the file is "logically empty": after BOM stripping, it contains optional horizontal whitespace and at most one trailing newline sequence (LF/CRLF/CR), and nothing else. This is a stricter condition than is_effectively_empty. |
| newline_hist | dict[str, int] | Histogram of newline styles detected in the file image. |
| dominant_newline | str \| None | Dominant newline sequence detected in the file. |
| dominance_ratio | float \| None | Ratio of dominant newline occurrences versus total newline occurrences. |
| mixed_newlines | bool \| None | True if multiple newline styles were detected, False if a single style was found, or None if not evaluated yet. |
| newline_style | str | Normalized newline style used when writing output; defaults to "\n". |
| ends_with_newline | bool \| None | True if the file ends with a newline sequence, False if it does not, or None if unknown. |
| pre_insert_capability | InsertCapability | Advisory from the sniffer about pre-insert checks (for example, spacers or empty body); defaults to InsertCapability.UNEVALUATED. |
| pre_insert_reason | str \| None | Human-readable reason why insertion may be problematic. |
| pre_insert_origin | str \| None | Origin of the pre-insertion diagnostic (typically a step or subsystem name). |
| diagnostics | MutableDiagnosticLog | Collected diagnostics (info, warning, and error) produced during processing. |
| diagnostic_hints | HintLog | Non-binding hints supplied by steps to explain decisions; used primarily for summarization. |
| views | Views | Bundle that carries image/header/build/render/updated/diff views for this file. The runner may prune heavy views after processing. |
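The newline-related attributes (newline_hist, dominant_newline, dominance_ratio, mixed_newlines, ends_with_newline) are related to one another, and a small standalone sketch may make that relationship easier to see. The helper below is hypothetical and is not part of TopMark; it only assumes that newline sequences are counted as LF, CRLF, or CR, with CRLF treated as a single sequence.

```python
import re
from collections import Counter

# Matches one newline sequence; CRLF is tried first so it is not counted as CR + LF.
_NEWLINE_RE = re.compile(r"\r\n|\r|\n")


def newline_stats(text: str) -> dict[str, object]:
    """Summarize newline usage in ``text`` (hypothetical helper, not TopMark API)."""
    hist = Counter(_NEWLINE_RE.findall(text))
    total = sum(hist.values())
    if total == 0:
        # No newline sequences at all: there is no dominant style to report.
        dominant, ratio = None, None
    else:
        dominant, count = hist.most_common(1)[0]
        ratio = count / total
    return {
        "newline_hist": dict(hist),
        "dominant_newline": dominant,
        "dominance_ratio": ratio,
        "mixed_newlines": len(hist) > 1,
        "ends_with_newline": text.endswith(("\r\n", "\r", "\n")),
    }


stats = newline_stats("a\r\nb\r\nc\n")
```

For this input the histogram holds two CRLF sequences and one LF, so CRLF is dominant with a ratio of 2/3 and the file counts as having mixed newlines. How TopMark itself breaks ties or treats files without any newline is not stated here, so the sketch's choices in those cases are assumptions.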
is_empty_like
property
Return True if the file contains no meaningful content.
This is True when the file is either:
- physically empty on disk (0 bytes), or
- effectively empty after decoding (only whitespace).
This helper is intended for convenience checks in pipeline steps and should not replace explicit emptiness distinctions in policy evaluation.
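The emptiness notions used on this page can be illustrated with a minimal sketch. The functions below are hypothetical stand-ins written from the attribute descriptions, not TopMark's implementation; in particular, "horizontal whitespace" is assumed to mean spaces and tabs, and the BOM is assumed to appear as a leading U+FEFF character in the decoded text.

```python
import re

# Optional horizontal whitespace followed by at most one newline sequence.
# Assumption: "horizontal whitespace" means spaces and tabs only.
_LOGICALLY_EMPTY_RE = re.compile(r"[ \t]*(?:\r\n|\r|\n)?")


def strip_bom(text: str) -> str:
    """Drop a single leading BOM character, if present."""
    return text[1:] if text.startswith("\ufeff") else text


def is_effectively_empty(text: str) -> bool:
    """Broad notion: nothing but whitespace (newlines included) after BOM stripping."""
    return strip_bom(text).strip() == ""


def is_logically_empty(text: str) -> bool:
    """Strict notion: optional spaces/tabs plus at most one trailing newline."""
    return _LOGICALLY_EMPTY_RE.fullmatch(strip_bom(text)) is not None


def is_empty_like(size_bytes: int, text: str) -> bool:
    """Convenience check: zero bytes on disk, or effectively empty once decoded."""
    return size_bytes == 0 or is_effectively_empty(text)
```

Under these assumptions a file containing two blank lines is effectively empty but not logically empty, which is why policy evaluation may need to keep the two notions apart.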
step_axes
property
Map each executed step to the axes it may write.
The keys are step names (e.g. "SnifferStep"), and the values are
lists of axis names (e.g. ["fs", "content"]). This is derived from
the axes_written contract of each step instance in self.steps.
Combined with self.steps (execution order) and self.status.to_dict()
(per-axis final status), this provides a complete view of the
step/axis/status relationship without duplicating status payloads.
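The shape of this mapping can be sketched as follows. DemoStep is a hypothetical stand-in for a real step class, and the "ResolverStep" and "resolve" names are invented for illustration; only "SnifferStep" with ["fs", "content"] comes from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DemoStep:
    """Hypothetical stand-in for a pipeline step with an axes_written contract."""

    name: str
    axes_written: tuple[str, ...]


def step_axes(steps: list[DemoStep]) -> dict[str, list[str]]:
    """Map each executed step name to the axes it may write, in execution order."""
    return {step.name: list(step.axes_written) for step in steps}


steps = [
    DemoStep("ResolverStep", ("resolve",)),  # invented example step
    DemoStep("SnifferStep", ("fs", "content")),
]
mapping = step_axes(steps)
```

Because the mapping is built from the executed steps in order, iterating its keys in this sketch reproduces the execution order as well.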
is_halted
property
Return True if a step has requested an early halt for this file.
Returns:
| Type | Description |
|---|---|
| bool | True if the pipeline should not execute any further steps for this file. |
get_effective_policy
Return the effective policy for this processing context.
The effective policy is derived from the global configuration and any
file-type-specific overrides via the shared
PolicyRegistry. This method
does not perform any merging at runtime; all policies are resolved at
MutableConfig.freeze() time.
Per-type policies are keyed by canonical qualified file type identifiers
such as topmark:python, not local identifiers such as python.
Returns:
| Type | Description |
|---|---|
| FrozenPolicy | The effective policy for this context. |
Source code in src/topmark/pipeline/context/model.py
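To illustrate why the qualified key matters, here is a minimal sketch of one way a lookup over pre-resolved policies could behave. DemoPolicy, DemoPolicyRegistry, the for_type method, and the allow_header_in_empty_files flag are all hypothetical names invented for this example; the real PolicyRegistry interface is not shown on this page.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class DemoPolicy:
    """Hypothetical frozen policy with a single invented flag."""

    allow_header_in_empty_files: bool = False


@dataclass(frozen=True)
class DemoPolicyRegistry:
    """Hypothetical registry holding policies that were resolved ahead of time."""

    global_policy: DemoPolicy
    by_type: dict[str, DemoPolicy] = field(default_factory=dict)

    def for_type(self, qualified_key: str | None) -> DemoPolicy:
        # No merging happens here: the lookup only selects a pre-resolved policy.
        if qualified_key is None:
            return self.global_policy
        return self.by_type.get(qualified_key, self.global_policy)


registry = DemoPolicyRegistry(
    global_policy=DemoPolicy(),
    by_type={"topmark:python": DemoPolicy(allow_header_in_empty_files=True)},
)
```

In this sketch, looking up "topmark:python" returns the per-type policy, while the local identifier "python" finds no entry and falls back to the global policy. The fallback behavior is an assumption of the sketch.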
request_halt
Request a graceful, terminal stop for the rest of the pipeline.
This method records a HaltState on the context so that subsequent
steps and the runner can avoid further processing for this file.
The context's halt_state field is updated in place.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| reason | str | Short machine-friendly reason code for halting the pipeline. | required |
| at_step | Step[ProcessingContext] | Step instance requesting the halt. | required |
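The interplay between request_halt, halt_state, and is_halted can be sketched with simplified stand-ins. Everything prefixed with Demo below is hypothetical, as are the step classes and the "demo-unsupported" reason code. Deriving step_name from the step's class name is also an assumption of this sketch.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DemoHaltState:
    reason_code: str
    step_name: str


@dataclass
class DemoContext:
    halt_state: DemoHaltState | None = None

    @property
    def is_halted(self) -> bool:
        return self.halt_state is not None

    def request_halt(self, reason: str, at_step: object) -> None:
        # Assumption: the step name is taken from the step's class name.
        self.halt_state = DemoHaltState(reason, type(at_step).__name__)


class HaltingStep:
    def __call__(self, ctx: DemoContext) -> None:
        ctx.request_halt(reason="demo-unsupported", at_step=self)


class LaterStep:
    def __call__(self, ctx: DemoContext) -> None:
        pass  # never reached once a halt has been requested


def run(ctx: DemoContext, steps: list) -> list[str]:
    """Run steps in order, stopping as soon as the context reports a halt."""
    executed: list[str] = []
    for step in steps:
        if ctx.is_halted:
            break
        step(ctx)
        executed.append(type(step).__name__)
    return executed


ctx = DemoContext()
executed = run(ctx, [HaltingStep(), LaterStep()])
```

After the run, only HaltingStep has executed and the context records both the reason code and the requesting step, which is the information a runner or reporter would need to explain the stop.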
iter_image_lines
Iterate the current file image without materializing.
This accessor hides the underlying representation (list-backed, mmap-backed, or generator-based) and returns an iterator over logical lines with original newline sequences preserved.
Returns:
| Type | Description |
|---|---|
| Iterable[str] | An iterator over the file's lines. If no image is present, an empty iterator is returned. |
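As a rough illustration of iterating logical lines with their original newline sequences preserved, the sketch below uses a generator over a string. It is not how TopMark's views are implemented (the real backing store may be list-backed, mmap-backed, or generator-based); the function names are hypothetical, and only LF, CRLF, and CR are assumed to end a line.

```python
import re
from collections.abc import Iterator

# One logical line: any run of non-newline characters followed by a newline
# sequence (CRLF tried before CR), or a final line with no trailing newline.
_LINE_RE = re.compile(r"[^\r\n]*(?:\r\n|\r|\n)|[^\r\n]+")


def iter_lines_keepends(text: str) -> Iterator[str]:
    """Yield lines lazily, keeping each line's original newline sequence."""
    for match in _LINE_RE.finditer(text):
        yield match.group(0)


def count_lines(text: str) -> int:
    """Count logical lines without building an intermediate list."""
    return sum(1 for _ in iter_lines_keepends(text))


lines = list(iter_lines_keepends("a\r\nb\rc\n\nd"))
```

Joining the yielded lines reproduces the input exactly, which is the property that matters when an updated image must be written back without normalizing newlines by accident. A regular expression is used here instead of str.splitlines(keepends=True) because the latter also splits on additional boundary characters such as form feed.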
image_line_count
Return the number of logical lines without materializing.
Returns:
| Type | Description |
|---|---|
| int | Total number of lines in the current image. |
iter_updated_lines
Iterate the updated file image lines, if present.
Returns:
| Type | Description |
|---|---|
| Iterable[str] | Iterator over updated lines. If no updated image is available (no planner/strip output), returns an empty iterator. |
materialize_image_lines
Return the original file image as a materialized list of lines.
Returns:
| Type | Description |
|---|---|
| list[str] | List of logical lines from the current image view. An empty list is returned if no image is available. |
materialize_updated_lines
Return the updated file image as a materialized list of lines.
Returns:
| Type | Description |
|---|---|
| list[str] | List of updated lines if present, otherwise an empty list. |
to_dict
Return a machine-readable representation of this processing result.
The schema is intended for CLI/CI consumption and avoids color or
formatting concerns. View details are delegated to
self.views.as_dict() to keep this method small and consistent with
the Views bundling.
Returns:
| Type | Description |
|---|---|
| dict[str, object] | A JSON-serializable mapping describing the context, including path, file type, step statuses, views summary, diagnostics, and high-level outcome flags. |
hint
Create and attach a normalized Hint to this context.
This is a convenience façade around make_hint and HintLog.add,
allowing pipeline steps to emit structured, non-binding diagnostics
without depending on the underlying HintLog representation.
The new hint is appended to this context's hint log.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| axis | Axis | Axis emitting the hint. | required |
| code | KnownCode \| str | Stable machine key for the condition. | required |
| message | str | Human-readable short summary line. | required |
| detail | str \| None | Optional extended diagnostic text rendered at higher verbosity (e.g., multi-line config snippets or rationale). | None |
| cluster | Cluster \| str \| None | Optional grouping key; a default is applied when omitted. | None |
| terminal | bool | Whether this condition is terminal. | False |
| reason | str \| None | Optional detail string. | None |
| meta | dict[str, object] \| None | Optional extensibility bag. | None |
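The façade idea can be sketched with simplified stand-ins: the context builds a hint from the parameters above and appends it to its own log, so steps never touch the log directly. DemoHint, DemoHintLog, and DemoContext are hypothetical and use plain strings for axis and code. Making the arguments keyword-only is an assumption of this sketch, and the values passed at the end are invented.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class DemoHint:
    """Hypothetical hint record mirroring the parameters listed above."""

    axis: str
    code: str
    message: str
    detail: str | None = None
    cluster: str | None = None
    terminal: bool = False
    reason: str | None = None
    meta: dict[str, object] | None = None


@dataclass
class DemoHintLog:
    items: list[DemoHint] = field(default_factory=list)

    def add(self, hint: DemoHint) -> None:
        self.items.append(hint)


@dataclass
class DemoContext:
    diagnostic_hints: DemoHintLog = field(default_factory=DemoHintLog)

    def hint(self, *, axis: str, code: str, message: str, **extra: object) -> None:
        # Build the hint in one place so steps stay independent of the log type.
        self.diagnostic_hints.add(
            DemoHint(axis=axis, code=code, message=message, **extra)
        )


ctx = DemoContext()
ctx.hint(axis="content", code="demo-code", message="Demo summary", terminal=True)
```

A step written against this façade only needs the context, which keeps the hint log's representation free to change.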
bootstrap
classmethod
Create a fresh context with no derived state.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | Path | File system path for the file to process. | required |
| config | FrozenConfig | Effective layered configuration to attach to the context. | required |
| run_options | RunOptions | Invocation-wide execution-only runtime options. | required |
| policy_registry_override | PolicyRegistry \| None | Optional precomputed policy registry for the supplied effective config. When omitted, the registry is derived by bootstrap. | None |
Returns:
| Type | Description |
|---|---|
| ProcessingContext | Newly created context instance. |