topmark.pipeline.views ¶
View abstractions for large, phase-scoped pipeline data.
This module defines lightweight, typed "views" that expose file and header data without committing to a concrete backing representation. Implementations can be list-backed today and evolve to memory-mapped or generator-based forms later, while keeping step contracts stable and memory usage low.
The views are intentionally minimal: callers iterate or count lines instead of materializing whole images, and rich blocks/mappings are grouped in small dataclasses per phase.
Releasable ¶
Bases: Protocol
Protocol for views that can release large in-memory buffers.
This optional lifecycle hook allows memory-heavy views to discard their
materialized state (e.g., lists of lines) after downstream steps no longer
need them. The pipeline runner invokes release() when pruning is enabled
to keep peak memory usage low.
Implementers should make release() idempotent: calling it multiple times
must be safe and should not raise. Views that do not hold large buffers can
implement a no-op release() to satisfy the protocol if needed.
Examples:
- ListFileImageView clears its backing list[str] reference.
- A memory-mapped view could close or unmap its file handle.
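The idempotency requirement can be illustrated with a minimal sketch. The protocol below is a restatement based only on the description above, and LineBuffer with its is_released() helper is a hypothetical class invented for this example; neither is taken from the topmark source.

```python
from typing import List, Optional, Protocol, runtime_checkable


@runtime_checkable
class Releasable(Protocol):
    """Restated from the description above; not copied from topmark."""

    def release(self) -> None: ...


class LineBuffer:
    """Hypothetical view holding a list of lines (illustration only)."""

    def __init__(self, lines: List[str]) -> None:
        self._lines: Optional[List[str]] = lines

    def release(self) -> None:
        # Idempotent: clearing an already cleared reference is a no-op.
        self._lines = None

    def is_released(self) -> bool:
        return self._lines is None


buf = LineBuffer(["a\n", "b\n"])
buf.release()
buf.release()  # a second call must be safe and must not raise
```

Because release() only reassigns a reference, repeated calls have no further effect, which is the property the pipeline runner relies on when pruning.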
FileImageView ¶
Bases: Releasable, Protocol
Protocol for read-only access to a file's logical lines.
FileImageView extends Releasable so implementations must provide
a release() method that frees materialized buffers when called by the
pipeline runner during pruning.
line_count ¶
Return the number of logical lines in the file image.
Returns:
| Name | Type | Description |
|---|---|---|
| int | int | Total number of lines available via |
iter_lines ¶
Iterate the file's logical lines, preserving original line endings.
Returns:
| Type | Description |
|---|---|
| Iterable[str] | An iterator over the file's lines. The iterator must yield strings exactly as they appear in the source (e.g., with |
Source code in src/topmark/pipeline/views.py
ListFileImageView dataclass ¶
List-backed FileImageView implementation (and Releasable).
This view wraps an in-memory list[str] where each element represents a
logical line including its original newline sequence (keepends semantics).
Calling release() discards the backing list to free memory; subsequent
iteration yields an empty sequence.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| lines | list[str] | Source lines to expose. The list is not copied; the caller retains ownership and must not mutate it while the view is used. | required |
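As a rough illustration of the behavior described for the list-backed view, the sketch below uses a hypothetical ListImageSketch class. The method names line_count, iter_lines, and release come from this page; the class body and the decision to report a count of 0 after release are assumptions, not the actual ListFileImageView implementation.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class ListImageSketch:
    """Hypothetical stand-in for a list-backed file image view."""

    lines: Optional[List[str]]  # stored by reference, never copied

    def line_count(self) -> int:
        # Assumption: a released view reports zero lines.
        return len(self.lines) if self.lines is not None else 0

    def iter_lines(self) -> Iterable[str]:
        # After release(), iteration yields nothing.
        return iter(self.lines or ())

    def release(self) -> None:
        self.lines = None


text = "first\r\nsecond\nthird"
# keepends=True preserves each line's original newline sequence.
view = ListImageSketch(text.splitlines(keepends=True))
count_before = view.line_count()
joined = "".join(view.iter_lines())
view.release()
after_release = list(view.iter_lines())
```

Joining the iterated lines reproduces the input exactly, which is the point of keepends semantics: mixed line endings survive a round trip.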
HeaderView dataclass ¶
Bases: Releasable
Structured view of the existing header detected by the scanner.
Attributes:
| Name | Type | Description |
|---|---|---|
| range | tuple[int, int] \| None | Inclusive |
| lines | Sequence[str] \| None | Header lines exactly as found (keepends), or |
| block | str \| None | Concatenated header text ( |
| mapping | Mapping[str, str] \| None | Parsed field mapping extracted from the header, or |
| success_count | int | The number of header lines that were successfully parsed and added to the |
| error_count | int | The number of header lines that were malformed (e.g., missing a colon, or having an empty field name). Defaults to 0. |
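To make the relationship between mapping, success_count, and error_count concrete, here is a small sketch of how such counts could be produced. The parse_fields function is hypothetical, and it assumes header lines have already been stripped of any comment prefix; the scanner's real parsing rules are not shown on this page.

```python
from typing import Dict, Iterable, Tuple


def parse_fields(lines: Iterable[str]) -> Tuple[Dict[str, str], int, int]:
    """Hypothetical parser returning (mapping, success_count, error_count)."""
    mapping: Dict[str, str] = {}
    ok = 0
    bad = 0
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines count as neither
        name, sep, value = line.partition(":")
        if not sep or not name.strip():
            bad += 1  # missing colon, or empty field name
            continue
        mapping[name.strip()] = value.strip()
        ok += 1
    return mapping, ok, bad


mapping, success_count, error_count = parse_fields(
    ["file: views.py\n", "license MIT\n", ": orphan value\n", "project: topmark\n"]
)
```

In this input, the second line has no colon and the third has an empty field name, so both are counted as malformed.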
BuilderView dataclass ¶
Bases: Releasable
Structured view of field dictionaries produced by the builder step.
Attributes:
| Name | Type | Description |
|---|---|---|
| builtins | Mapping[str, str] \| None | Derived built-in fields (e.g., file, relpath). |
| selected | Mapping[str, str] \| None | The subset (and overrides) selected for rendering, aligned with the configuration's |
Notes
The contained mappings are exposed read-only through abstract mapping
types. Calling release() clears the references to allow pruning.
RenderView dataclass ¶
Bases: Releasable
Structured view of the expected header produced by the renderer.
Attributes:
| Name | Type | Description |
|---|---|---|
| lines | Sequence[str] \| None | Rendered header lines (keepends), or |
| block | str \| None | Concatenated rendered header text, or |
Notes
Large buffers may be pruned by calling release(), which clears
lines and block.
UpdatedView dataclass ¶
Bases: Releasable
View of the pipeline's updated file image.
lines may be a sequence (e.g., list[str]) or a lazy iterable
(e.g., a generator composing a three-segment view) to avoid materializing
large buffers up-front.
Attributes:
| Name | Type | Description |
|---|---|---|
| lines | Sequence[str] \| Iterable[str] \| None | Updated file image as a sequence or iterable of lines, or |
Notes
Pruning is handled by calling release(), which clears the updated file
image reference. If lines is an iterator, callers must treat this view
as single-pass.
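The single-pass caveat can be seen in a short sketch. The three_segments helper and its sample inputs are hypothetical; it uses itertools.chain as one possible way to compose a lazy three-segment image, which is not necessarily how topmark builds it.

```python
from itertools import chain
from typing import Iterable, Iterator, Sequence


def three_segments(
    prefix: Sequence[str], header: Sequence[str], body: Sequence[str]
) -> Iterator[str]:
    """Lazily chain three segments without building one large list."""
    return chain(prefix, header, body)


lazy_lines: Iterable[str] = three_segments(
    ["#!/usr/bin/env python\n"],
    ["# header line\n"],
    ["print('hi')\n"],
)
first_pass = list(lazy_lines)
second_pass = list(lazy_lines)  # the iterator is already exhausted
```

A consumer that needs to read the lines twice has to materialize them once (for example with list()) and accept the memory cost.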
DiffView dataclass ¶
Bases: Releasable
Unified diff view for CLI/CI consumption.
Attributes:
| Name | Type | Description |
|---|---|---|
| text | str \| None | Unified diff as a single string, or |
Notes
Pruning is done by calling release(), which nulls text to free memory.
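For orientation, a unified diff held as a single string can be produced from two keepends line lists with the standard library's difflib. This is only an illustration of the data shape; the file names are made up, and how topmark actually generates its diff is not shown on this page.

```python
import difflib

old = ["# old header\n", "print('hi')\n"]
new = ["# new header\n", "print('hi')\n"]

# unified_diff yields lines lazily; joining them gives one string,
# matching the "single string" shape described for DiffView.text.
diff_text = "".join(
    difflib.unified_diff(old, new, fromfile="a/example.py", tofile="b/example.py")
)
```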
Views dataclass ¶
Bundle of phase-scoped, releasable views for a single file.
Notes
The bundle itself provides release_all() to prune memory after a run.
Individual views remain responsible for their own release() behavior.
release_all ¶
Release all non-None views safely (idempotent).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| keep_diff_view | bool | Whether to preserve the diff view. | False |
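A minimal sketch of how a bundle-level release might behave is shown below. TextView and BundleSketch are hypothetical classes with only two slots, and the field names render and diff are assumptions. Only the release_all name and its keep_diff_view parameter come from this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TextView:
    """Hypothetical releasable view holding one string."""

    text: Optional[str] = None

    def release(self) -> None:
        self.text = None


@dataclass
class BundleSketch:
    """Hypothetical two-slot bundle; the real Views covers more phases."""

    render: Optional[TextView] = None
    diff: Optional[TextView] = None

    def release_all(self, keep_diff_view: bool = False) -> None:
        for name in ("render", "diff"):
            view = getattr(self, name)
            if view is None:
                continue  # skip phases that never produced a view
            if name == "diff" and keep_diff_view:
                continue  # preserve the diff for later reporting
            view.release()


bundle = BundleSketch(render=TextView("# header\n"), diff=TextView("--- a\n+++ b\n"))
bundle.release_all(keep_diff_view=True)
bundle.release_all(keep_diff_view=True)  # repeated calls are safe
```

Keeping the diff while releasing everything else suits a CLI or CI run that still needs to print the diff after the heavier buffers are gone.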
as_dict ¶
Short machine-friendly summary; avoid heavy text blobs.