Road to TopMark 1.0¶
Purpose¶
This page records the curated stabilization and contract-freeze history that led to TopMark 1.0. It extracts the historical achievement narrative from the active 1.0 roadmap so the roadmap can remain focused on stable 1.x governance, explicit deferrals, and post-1.0 evolution.
The 0.12 development line and the 1.0 alpha/beta series transformed TopMark from a set of evolving header-processing components into a contract-oriented CLI and Python library with explicit runtime boundaries, deterministic processing, documented output contracts, strict typing, and reproducible CI/release automation.
This page is historical and explanatory. It records the stabilization and contract-freeze work that produced the stable 1.0 release line.
For current stable 1.x governance, post-1.0 deferrals, and future evolution tracking, see TopMark 1.0 Roadmap.
Stabilization overview¶
The stabilization effort focused on separating responsibilities that had previously become tightly entangled:
- TOML parsing and validation;
- layered configuration resolution;
- runtime execution intent;
- file discovery and file-type resolution;
- pipeline processing;
- human presentation;
- machine-readable output;
- and CI/release automation.
By the end of the beta line, TopMark was aligned around this model:
The main outcomes were:
- explicit registry and binding architecture;
- namespace-aware file-type identities;
- stable CLI behavior and exit-code contracts;
- separated human and machine output systems;
- staged configuration-validation internals with flattened compatibility diagnostics at public boundaries;
- deterministic pipeline semantics;
- documented public/internal API boundaries;
- strict documentation and prose-governance tooling;
- artifact-based release publication;
- and canonical CI coverage/reporting workflows.
Registry and resolution redesign¶
The registry system was redesigned into a deterministic, explicit, namespace-aware model.
The final architecture separates:
filetypesfor file-type identity;processorsfor processor identity;bindingsfor relationships between file types and processors.
Implicit registration through decorators or bootstrap scanning was replaced with explicit bindings. This made registry behavior predictable, testable, and plugin-ready without import-time side effects.
Legacy bootstrap-oriented entry points such as:
topmark.processors.bootstrap;topmark.processors.registry;topmark.registry.resolver;register_all_processors();Registry.ensure_processors_registered().
were removed during stabilization in favor of explicit registration and binding ownership.
The file-type identifier contract was frozen around qualified keys:
- canonical identities use
<namespace>:<name>, such astopmark:python; - local identifiers, such as
python, are accepted only when unambiguous; - ambiguous local identifiers require the qualified form;
- fuzzy matching, aliases, and implicit namespace fallback are intentionally unsupported.
Resolution was moved into topmark.resolution.*, and probe-backed resolution became the shared
explainability/evidence model for diagnostics and effective pipeline resolution. The read-only
topmark probe command and topmark.api.probe() API were accepted as the supported 1.0 resolution
explainability surfaces. They expose stable resolution DTOs without requiring callers to consume
registry internals, resolver enums, or pipeline contexts.
Result: registry and resolution behavior became explicit, deterministic, explainable, and ready for future plugin work.
CLI and presentation architecture¶
The CLI and presentation layers were separated into distinct responsibilities:
topmark.presentation.*renders pure output without I/O or Click;topmark.cli.console.*owns console/runtime concerns;topmark.cli.commands.*orchestrates commands.
The old emitter-oriented shape was replaced by a clearer distinction between serializers for machine output and renderers for human output. Rendering is Click-free, testable, and reusable outside CLI commands.
The final human-output contract is:
- TEXT output is console-oriented and owns verbosity/quiet behavior;
- Markdown output is document-oriented and ignores TEXT-oriented controls;
- JSON and NDJSON output are machine-readable and ignore human presentation controls.
The CLI command-applicability and exit-code contracts were implemented, tested, and documented. This
included dry-run WOULD_CHANGE behavior, missing-file handling, unmatched-glob semantics,
configuration-validation failures, probe semantic outcomes, and mixed-result priority ordering.
The path-command STDIN contract was also frozen:
- content STDIN uses the POSIX-style
-PATH sentinel plus--stdin-filename; - TopMark intentionally does not expose a
--stdinoption flag; - unsupported
--stdinspellings are rejected as CLI usage errors.
Result: CLI behavior became stable, explicit, scriptable, and separated from presentation internals.
Configuration and runtime separation¶
Configuration and execution intent were split into separate layers.
The final model separates:
topmark.tomlfor TOML parsing and whole-source validation;topmark.configfor layered configuration merge and effective runtime configuration resolution;topmark.runtimefor execution-time behavior.
Runtime-only behavior, such as apply/preview and STDIN handling, moved into RunOptions instead of
layered configuration.
Several user-facing policy/configuration behaviors were also normalized during stabilization:
add_only/update_onlywere replaced byheader_mutation_mode;skip_compliant/skip_unsupportedwere replaced byreport;policy_by_type,include_file_types, andexclude_file_typesnow share the same qualified/local file-type identifier contract.
Runtime-facing TOML sections such as [writer] remain outside layered configuration while still
appearing in config dumps, config checks, and machine-readable configuration snapshots.
Configuration validation was hardened through staged internal validation logs:
- TOML-source diagnostics;
- merged-config diagnostics;
- runtime-applicability diagnostics.
Public compatibility surfaces continue to expose flattened diagnostics at exception, presentation, and machine-readable output boundaries.
Broader staged-validation exposure and explicit configuration schema versioning were deliberately postponed beyond 1.0 to preserve release focus and avoid premature schema/version governance.
The canonical mutable/frozen naming model was finalized across configuration, policy, diagnostics, and staged validation-log types. Public API inputs were standardized around mapping-based overlays, while typed override bridge types remain internal CLI/API orchestration details.
Result: configuration became typed, layered, validated, normalized, and reproducible, while runtime behavior became explicit.
Pipeline and policy stabilization¶
Pipeline behavior was stabilized around deterministic preview/apply semantics.
Key outcomes included:
- explicit preview vs apply behavior;
- normalized write statuses;
- empty-file and empty-like classification;
- policy handling through
HeaderMutationModerather than boolean flags; - idempotent processing and deterministic summaries;
- centralized exit-code derivation from pipeline results;
- synthetic pipeline contexts for explicit missing literal inputs;
- and typed value objects replacing ambiguous tuple-shaped return contracts.
The user-facing policy/report contract was frozen.
Public API callers continue using stable string policy tokens for 1.0. A dedicated public enum surface remains a possible post-1.0 follow-up.
Key stabilized behaviors included:
--reportcontrols human per-file output scope only;header_mutation_modecontrolscheckinsertion/update intent;empty_insert_modeclassifies empty and empty-like files;- safety gates remain authoritative over all policy options.
Line-ending support was also audited and frozen. TopMark recognizes LF, CRLF, and CR as physical newline styles. Non-standard Unicode separators are treated as ordinary content rather than line endings.
Result: pipeline behavior became explicit, consistent, idempotent, and policy driven.
Machine output contracts¶
Machine-readable output was redesigned as a schema-driven system separated from CLI formatting.
The final machine-output model is:
- domain-scoped;
- JSON/NDJSON based;
- schema-driven with TypedDict contracts;
- separated from human presentation;
- and covered by focused contract tests.
Machine output now uses consistent envelope and metadata conventions across configuration, pipeline, registry, version, and probe command families.
The stabilization effort also froze several important naming and shape conventions:
- plural domain-specific JSON collection keys;
- singular NDJSON record kinds;
qualified_keyplusnamespace/local_keyidentity metadata;- and flattened domain-specific registry envelopes.
Important frozen decisions included:
- pipeline summaries use explicit
(outcome, reason, count)rows; config checkuses the explicitconfig_checkpayload/record kind;- registry JSON output uses flattened domain-specific envelopes;
detail_levelreflects projection depth where emitted;- machine-readable payloads emit canonical qualified file-type identifiers;
- JSON/NDJSON payloads do not imply process status, which remains the CLI exit code.
Probe machine output was added with per-path probe results and filtered explicit-input reporting.
Result: machine-readable formats became stable, documented, and suitable for downstream automation.
Human output contracts¶
Human output was consolidated into TEXT and Markdown.
TEXT output remains compact and console-oriented. It may use -v and -vv for progressive
disclosure, and --quiet only where suppressing output still leaves a meaningful status,
inspection, or mutation signal.
Markdown output is document-oriented and intentionally ignores TEXT-oriented verbosity, quiet, and styling controls.
Semantic styling was routed through StyleRole, Theme, TextStyler, and maybe_style(). The 1.0
decision was to keep yachalk because styling is confined to CLI presentation internals, while
deferring any Rich or rich-click migration until after 1.0 unless a concrete blocker appears.
Result: human output became consistent, composable, and decoupled from CLI command functions.
Documentation governance¶
Documentation was updated to match the final architecture and contract model.
Major outcomes included:
- canonical command-page structure across command and command-group documentation;
- aligned usage, configuration, architecture, API, machine-output, and README documentation;
- explicit public/internal boundary documentation;
- shared snippets for terminology, strict config loading, qualified/local file-type identifiers, and override boundaries;
- generated-site navigation and TOC-density improvements;
- a project-wide terminology glossary;
- terminology governance through reusable snippets and shared canonical wording;
- and calmer starter-template prose.
Documentation hygiene was added through tools/docs/check_docs_hygiene.py, exposed via nox and
Makefile targets. It validates snippet includes, docs-root-relative include paths, section
structure, heading conventions, accidental macOS resource files, and changelog heading shape.
Python code-prose hygiene was added through tools/docs/check_code_hygiene.py, covering comments,
docstrings, and prose-oriented string literals.
The stabilization effort also consolidated documentation conventions around:
- command-page structure;
- shared terminology governance;
- snippet usage boundaries;
- generated-reference expectations;
- changelog heading/section rules;
- and explicit implementation-boundary guidance for user-facing documentation.
Result: documentation became convention-driven, validated, and aligned with the frozen 1.0 behavior.
CI, release, and dependency model¶
The project migrated to a uv-first dependency model and replaced tox with nox.
The contributor workflow also migrated away from tox toward uv-backed nox sessions with metadata-driven Python-version management.
Release publication was redesigned around an artifact-based CI → release split:
- CI builds release artifacts on tag pushes;
- release artifacts and metadata are uploaded from CI;
- the privileged release workflow downloads, verifies, and publishes those artifacts;
- the release workflow no longer rebuilds from repository source.
Versioning moved to setuptools-scm, making Git tags the source of truth. Release validation now
checks SCM-derived artifact versions against the resolved release tag rather than comparing against
a static project version.
Install-smoke validation was added across Linux, macOS, and Windows, validating wheel and sdist installation from built artifacts in isolated environments.
This became one of the most important late-beta hardening steps because it exposed implicit runtime and packaging assumptions that were not visible inside local development environments or standard CI jobs.
CI was hardened through:
- link checks;
- documentation hygiene checks;
- SHA-pinned actions;
- explicit permissions;
- metadata-driven Python-version resolution;
- dynamic compatibility-matrix generation through
nox -s print_python_matrix; - canonical CI coverage reporting;
- and explicit uv cache ownership.
The nox -s print_python_matrix session centralizes Python-version metadata for CI. Compatibility
matrix jobs and canonical single-version jobs consume resolved metadata, while release publication
keeps an explicit release-tooling Python runtime and reports non-blocking drift warnings.
Result: release and installation validation became reproducible, automated, documented, and validated across real workflow runs.
Typing and API hardening¶
Late-beta stabilization tightened internal typing and ownership boundaries under Pyright strict mode.
Key outcomes included:
- replacing ambiguous tuple-shaped returns with frozen value objects or DTO-style models;
- tightening protocol/view contracts toward read-only semantics where mutation is not intended;
- documenting public API snapshot exports vs public-adjacent integration surfaces;
- preserving Python 3.10 compatibility;
- replacing temporary typed pytest decorator wrappers with direct pytest marker/decorator usage once typing support stabilized;
- and removing obsolete or redundant helper layers where they no longer contributed to the frozen architecture.
Examples of typed result-object cleanup included version/config preparation, TOML/config bridges, CLI input planning, runtime execution, pipeline execution, processor results, diagnostics, glob-rebasing, and file-type resolution paths.
Result: the implementation became clearer, better typed, and less dependent on positional ownership contracts.
Coverage and validation stabilization¶
Coverage-driven stabilization was used as a confidence-building signal, not as a percentage target.
Focused coverage expansion improved confidence around:
- outcomes;
- status/context behavior;
- merge utilities;
- file-type gating;
- enum mixins;
- diff rendering;
- validation/error paths;
- config, pipeline, registry, version, and probe machine contracts.
Canonical CI coverage reporting was added through the existing nox -s coverage session. CI now
publishes a GitHub Step Summary plus HTML, XML, and JSON coverage artifacts.
A deliberate decision during stabilization was to keep coverage informational rather than percentage-gated. The project treats coverage primarily as a confidence-building and blind-spot identification signal instead of a release-governance metric.
README coverage badge adoption remains deferred until the published signal proves stable, representative, and useful over time.
Result: coverage became a release-confidence signal without becoming a release gate.
Beta stabilization phase¶
The beta series shifted the project from architectural redesign toward contract validation, release-path rehearsal, ecosystem observation, and governance hardening.
The beta stabilization series validated:
- GitHub prerelease publication behavior;
- TestPyPI prerelease publication;
- artifact-based CI → release publication;
- SCM-derived versioning through
setuptools-scm; - install-smoke validation across Linux, macOS, and Windows;
- documentation-governance enforcement;
- changelog-governance enforcement;
- prose/code-hygiene tooling;
- CI metadata/bootstrap consistency;
- explicit uv cache ownership in GitHub workflows;
- and metadata-driven Python-version resolution through
nox -s print_python_matrix.
The beta series also served as the final release-path rehearsal phase before the 1.0 release candidates, allowing workflow, packaging, installation, and publication behavior to be validated repeatedly through real prerelease tags and published artifacts.
Result: by the end of the beta line, the remaining work had narrowed to RC validation and focused hardening rather than architectural change.
Contract freezes¶
By the end of the beta line, the following 1.0 contracts were frozen:
- registry and file-type identity semantics;
- CLI command applicability;
- CLI exit codes;
- STDIN handling;
- user-facing policy/report behavior;
- TOML/config/runtime layering;
- staged validation internals with flattened compatibility diagnostics at public boundaries;
- JSON/NDJSON machine-output schemas;
- TEXT and Markdown human-output behavior;
- artifact-based release workflow;
- uv/nox dependency tooling;
- documentation governance and hygiene checks.
The 1.0 release-candidate phase therefore served primarily as a contract-validation and compatibility-preservation phase rather than a feature-completion phase.
The validated 1.0.0rc1 release candidate confirmed these contracts across local validation,
CI/release workflows, TestPyPI publication, GitHub prerelease publication, and published-artifact
verification.
The final 1.0.0 release then promoted those validated contracts into the stable 1.x release line.
Deferred post-1.0 scope¶
Several items were explicitly deferred beyond 1.0 to preserve release focus:
- in-memory pipeline support and a future
InputSourceabstraction; - configuration schema versioning;
- richer staged validation exposure in CLI/API/machine output;
- registry query/filter commands;
- multi-line TopMark header fields;
- Rich or
rich-clickmigration; - broader workflow/release-infrastructure factoring;
- deeper documentation-structure validation;
- and README coverage badge publication.
These deferrals were deliberate scope decisions rather than incomplete 1.0 requirements, helping preserve stabilization focus during the final release cycle.
Lessons learned¶
The 1.0 stabilization effort reinforced several project-level lessons:
- explicit boundaries are easier to document, test, and preserve than implicit behavior;
- machine-readable output should be treated as a schema contract, not a CLI formatting variant;
- human output needs a separate presentation model from automation output;
- release workflows are easier to trust when artifact creation and privileged publication are separated and validated through published-artifact verification;
- documentation governance needs lightweight automated checks to remain sustainable;
- coverage is most useful as a confidence signal when it avoids percentage chasing;
- and explicit deferrals are healthier than forcing architectural ideas into a stabilization or release-candidate cycle under release pressure.