Machine-readable output

This document describes the stable machine-readable JSON and NDJSON formats emitted by TopMark for the 1.x line.

It is intended for integrators and tooling authors who consume TopMark programmatically.

Covered command groups:

This page is the canonical reference for TopMark's machine-readable output shapes. Usage guides for individual commands (for example, check, strip, and probe) provide task-oriented examples consistent with this schema.

Note

The canonical vocabulary used throughout the documentation is defined in Terminology and Canonical Vocabulary.

Note

Path representation

TopMark serializes machine-readable filesystem path fields with POSIX / separators on all platforms.

Path serialization is a presentation contract and is distinct from filesystem identity.

TopMark first determines the selected processing path for the filesystem target being processed and then serializes that processing path according to the machine-output contract.

This contract applies to:

  • header metadata path fields;
  • processing machine-output payloads;
  • probe machine-output payloads;
  • configuration machine-output payloads; and
  • TOML/config provenance payloads.

Examples:

real/file.py
./real/file.py
link-to-file.py

may refer to the same filesystem identity and therefore produce the same serialized processing path.

TopMark's machine-readable path fields remain path-based and are derived from the selected processing path for each processing target.

Filesystem identity policy is a separate concern from path serialization. TopMark may apply additional filesystem-identity rules when determining whether a processing target is eligible for processing. For example, selected hard-linked files are detected using device/inode identity and are reported as unsupported processing targets. Such checks do not alter the serialized path values emitted in machine-readable output.

Human-facing output follows display-path policy instead:

  • CLI and Markdown reports may use the host platform's native path representation;
  • STDIN-backed processing displays the logical --stdin-filename when available; and
  • unified diff file labels are human-facing display labels, not machine-readable path fields.

Synthetic configuration-source identifiers (for example built-in defaults) are serialized as stable labels rather than filesystem paths.
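
As a rough illustration of the presentation contract described above, the following sketch uses Python's standard pathlib module to show how a Windows-style path and a POSIX-style path serialize to the same /-separated string. It is not TopMark's implementation, and the helper name serialize_path is hypothetical.

```python
from pathlib import PurePath, PurePosixPath, PureWindowsPath


def serialize_path(path: PurePath) -> str:
    """Return the path with POSIX "/" separators (hypothetical helper)."""
    return path.as_posix()


windows_style = PureWindowsPath(r"real\file.py")
posix_style = PurePosixPath("real/file.py")

print(serialize_path(windows_style))  # real/file.py
print(serialize_path(posix_style))    # real/file.py
```

The sketch covers separator normalization only. Selecting the processing path (for example, resolving a symlink to its target) is a separate step that happens before serialization.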

Output formats

TopMark exposes four stable --output-format values:

  • human-oriented formats (not machine-stable):
      • text: default human-oriented text.
      • markdown: human-oriented Markdown.
  • machine-readable formats (schema described in this document):
      • json: a single JSON document per invocation.
      • ndjson: a newline-delimited JSON stream.

The schemas below apply only to json and ndjson.

Notes:

  • Machine-readable formats never include ANSI color codes and are not affected by --color.
  • Machine-readable formats are independent of human-facing presentation controls.
  • TEXT-only flags such as -v / --verbose and -q / --quiet do not affect machine-readable output.
  • Markdown output is also independent from machine-readable formats and follows its own document-oriented contract.
  • Unified diff output from --diff is human-facing output. Its file labels follow display-path policy and are not part of the JSON/NDJSON machine-readable path contract.

Exit codes and machine-readable output

Machine-readable output (json, ndjson) is intentionally decoupled from CLI exit codes:

  • Exit codes are not embedded in JSON or NDJSON payloads.
  • Structured payloads represent results, diagnostics, and resolution state only.

Consumers must:

  • inspect the process exit code for success or failure semantics,
  • parse machine-readable output for detailed diagnostics and results.

When strict configuration validation stops a command before probing or processing begins, TopMark still emits valid machine-readable configuration diagnostics. In that case, JSON output may contain a configuration-diagnostics envelope without normal probes, results, or summary payloads, and NDJSON output may contain only the configuration-diagnostics prefix records.

This design ensures a clean separation between:

  • process status (exit code), and
  • structured data contract (machine-readable output).

Refer to Exit codes for the full contract.
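
A consumer might keep the two channels separate along these lines. This is a minimal sketch, not part of TopMark: the RunOutcome type and interpret_run helper are hypothetical, and treating exit code 0 as success is an assumption that should be checked against the Exit codes page.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass
class RunOutcome:
    succeeded: bool            # derived from the process exit code only
    document: dict[str, Any]   # parsed JSON document; may lack results/probes/summary


def interpret_run(returncode: int, stdout: str) -> RunOutcome:
    """Combine the exit code and the JSON document without conflating them."""
    document = json.loads(stdout) if stdout.strip() else {}
    # Assumption: exit code 0 means success; see the Exit codes page.
    return RunOutcome(succeeded=(returncode == 0), document=document)


# Hypothetical captured output of a run stopped by strict configuration validation.
stopped = interpret_run(
    1,
    '{"meta": {"tool": "topmark"}, "config_diagnostics": '
    '{"diagnostic_counts": {"info": 0, "warning": 0, "error": 1}}}',
)
print(stopped.succeeded, "results" in stopped.document)  # False False
```

In practice, returncode and stdout would come from something like subprocess.run(["topmark", "check", "--output-format", "json"], capture_output=True, text=True). The sample document above is abbreviated and illustrative only.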


Shared concepts

MetaPayload

All machine-readable outputs include a small metadata block, either:

  • as the top-level meta key in JSON documents, or
  • as the top-level meta key in every NDJSON record.

Shape:

{
  "meta": {
    "tool": "topmark",
    "version": "<package version>",
    "platform": "darwin",
    "detail_level": "brief" // or "long"
  }
}

Notes:

  • version reflects the resolved TopMark package version (normally PEP 440), derived from Git tags via setuptools-scm. Examples are illustrative only.
  • platform is a short runtime identifier (e.g., from sys.platform).
  • detail_level is machine-facing and distinguishes the default projection ("brief") from the expanded projection requested via --long ("long") when a command surface emits that field. Registry machine-readable output currently includes detail_level; other command families may omit it.

detail_level is distinct from TEXT verbosity (-v) and quiet mode (--quiet). It reflects an explicit machine-facing projection such as --long, not presentation detail.

Shared metadata keys are defined in topmark.core.machine.schemas.MachineMetaKey. Diagnostic-domain identifiers used in diagnostic payloads are defined in topmark.core.machine.schemas.MachineDomain.

Canonical keys are defined in topmark.core.machine.schemas.

For version-reporting commands, the machine-readable output metadata reflects the same runtime-resolved package version used by topmark version, which is sourced from generated package version metadata rather than a static config field.
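
The metadata shape above can be checked defensively on the consumer side. The following sketch assumes only the keys shown in this section; the helper name check_meta is hypothetical. It treats detail_level as optional and ignores unknown keys, so additive changes do not break it.

```python
from typing import Any

BASELINE_META_KEYS = ("tool", "version", "platform")


def check_meta(meta: dict[str, Any]) -> list[str]:
    """Return a list of problems found in a MetaPayload-like mapping."""
    problems = [f"missing key: {key}" for key in BASELINE_META_KEYS if key not in meta]
    detail_level = meta.get("detail_level")
    # detail_level is optional; when present it should be "brief" or "long".
    if detail_level is not None and detail_level not in ("brief", "long"):
        problems.append(f"unexpected detail_level: {detail_level!r}")
    return problems


print(check_meta({"tool": "topmark", "version": "<package version>", "platform": "darwin"}))  # []
print(check_meta({"tool": "topmark", "platform": "darwin", "detail_level": "verbose"}))
```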

NDJSON record contract

NDJSON output is a stream of JSON objects ("records"). Each record:

  • MUST include:
      • kind (string)
      • meta (MetaPayload)
  • MUST store its payload under a container key that matches kind.

Example:

{"kind":"config","meta":{...},"config":{...}}

Consumers should switch on the kind field rather than relying on ordering. Some command families emit a stable prefix, as documented below.
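
A reader that follows this contract can be sketched as below. The helper name iter_records is hypothetical and the sample records are abbreviated; the only assumptions are the kind, meta, and container-key rules stated above.

```python
import json
from typing import Any, Iterable, Iterator


def iter_records(lines: Iterable[str]) -> Iterator[tuple[str, dict[str, Any], Any]]:
    """Yield (kind, meta, payload) for each NDJSON record."""
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines between records
        record = json.loads(line)
        kind = record["kind"]
        # The payload lives under the container key named by kind.
        yield kind, record["meta"], record[kind]


sample = "\n".join([
    '{"kind":"config","meta":{"tool":"topmark"},"config":{}}',
    '{"kind":"probe","meta":{"tool":"topmark"},"probe":{"path":"README.md","status":"resolved"}}',
])

for kind, _meta, payload in iter_records(sample.splitlines()):
    if kind == "probe":  # switch on kind instead of relying on position
        print(payload["path"], payload["status"])  # README.md resolved
```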

Shared record construction and envelope serialization helpers live under topmark.core.machine. Domain-specific payload builders and record kinds live in the corresponding *.machine packages.

Canonical kind strings are now owned by the schema module for the corresponding machine-output package rather than a single monolithic core namespace. Shared envelope keys remain in topmark.core.machine.schemas, while record kinds are defined in package-local schema modules such as:

Naming conventions

The machine-readable output naming audit for the stable 1.x line adopts the following conventions across domains:

  • Shared envelope and metadata keys are owned by topmark.core.machine.schemas.
  • JSON uses domain-specific aggregated keys (for example filetypes, processors, bindings, results, probes, config_layers) rather than a generic container such as items.
  • NDJSON uses singular record kinds (for example filetype, processor, binding, result, probe, summary) and stores each payload under a container key that matches kind.
  • Header processor identities and file type identities in machine-readable output always report canonical qualified keys such as topmark:pound and topmark:python when a resolved identity is available.
  • Decomposed identities use namespace + local_key.
  • Relationship references use *_key (for example file_type_key, processor_key).
  • detail_level is an extended metadata field rather than baseline metadata:
      • baseline metadata is tool, version, platform
      • detail_level is emitted only by command families whose machine-readable output exposes a brief vs long projection

These conventions are part of the stable 1.x machine-readable output contract.

File type identity fields

Machine-readable output uses the same canonical identity model as the runtime registry and resolution system.

When a file type identity is present, payloads expose:

  • qualified_key: canonical identifier, for example topmark:python;
  • namespace: producer namespace, for example topmark;
  • local_key: local identifier within the namespace, for example python.

Public inputs may use local identifiers such as python when unambiguous, but machine-readable output emits resolved canonical identities. Consumers should prefer qualified_key for durable comparisons and joins across payloads.

See Registry model for the full identity contract.

Filesystem path fields

Machine-readable filesystem path fields expose TopMark's selected processing paths.

Configuration discovery is evaluated earlier and is not represented by machine-readable filesystem path fields such as per-result path. Project-chain configuration discovery starts from the resolved discovery anchor before filesystem-identity evaluation selects processing paths. Separate configuration provenance payloads may serialize discovered configuration-source origins and resolved scope roots.

A processing path is selected after discovery, filtering, filesystem-identity normalization, and processing-path selection. It is not necessarily the original CLI argument, configuration entry, glob match, or symlink spelling supplied by the user.

TopMark currently identifies existing processing inputs by resolved processing target path. For example, if both of these paths refer to the same target:

real/source.py
links/source-link.py

machine-readable output may report only:

real/source.py

The emitted path is still serialized with POSIX / separators on all platforms, as documented in the path-representation contract above.

Hard-link policy is intentionally separate from this path serialization contract. If two or more selected processing paths are hard links to the same filesystem object, TopMark keeps one machine-output result per selected path, but reports each affected path as an unsupported, policy-blocked processing target instead of selecting a source, target, winner, or loser path.
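
For readers unfamiliar with device/inode identity, the following sketch shows one generic way to group paths that are hard links to the same filesystem object. It is an illustration under stated assumptions, not TopMark's implementation: group_hard_links is a hypothetical helper, the input paths are assumed to be already-selected, distinct processing paths, and the demo relies on a filesystem that supports os.link.

```python
import os
import tempfile
from collections import defaultdict


def group_hard_links(paths: list[str]) -> list[list[str]]:
    """Return groups of paths that share one (device, inode) identity."""
    by_identity: dict[tuple[int, int], list[str]] = defaultdict(list)
    for path in paths:
        info = os.stat(path)
        by_identity[(info.st_dev, info.st_ino)].append(path)
    # Only identities reached through more than one path are of interest.
    return [group for group in by_identity.values() if len(group) > 1]


with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "source.py")
    linked = os.path.join(tmp, "source-hardlink.py")
    other = os.path.join(tmp, "other.py")
    for path in (original, other):
        with open(path, "w", encoding="utf-8") as handle:
            handle.write("print('hi')\n")
    os.link(original, linked)  # second name for the same filesystem object
    groups = group_hard_links([original, linked, other])
    demo_names = [[os.path.basename(p) for p in group] for group in groups]

print(demo_names)
```

Every path in a group is reported here, with no path singled out as the "real" one, which mirrors the policy described above of not choosing a winner within a hard-link group.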


Resolution diagnostics (probe)

The topmark probe command exposes stable file-type and processor resolution diagnostics. It is a diagnostic command, not a compliance or mutation command: it does not compute header changes, diffs, strip plans, or write plans.

Probe output reports canonical file type identities after identifier normalization and file-type filtering.

Probe output also reports processing paths after discovery, filesystem-identity evaluation, and processing-path selection. When a symlinked input reaches probing, the emitted path describes the resolved processing target rather than preserving the symlink spelling.

Probe machine-readable output is unaffected by TEXT verbosity or quiet mode. The JSON and NDJSON formats expose the same resolution evidence used by the human-facing probe renderers:

  • selected file type and selected processor
  • probe status and reason
  • all scored candidate file types
  • candidate match signals
  • explicit file inputs filtered during discovery before file-type probing
  • explicit missing inputs that could not produce a normal resolution probe

JSON schema

{
  "meta": { /* MetaPayload */ },
  "config": { /* RuntimeConfigPayload */ },
  "config_diagnostics": { /* ConfigDiagnosticsPayload */ },
  "probes": [
    { /* per-path probe payload */ }
  ]
}

  • meta: small metadata block, including tool name and TopMark version.
  • config: snapshot of the effective runtime configuration used for file filtering and resolution policy.
  • config_diagnostics: full diagnostics payload including counts and the list of config diagnostics.
  • probes: one resolution probe payload per probed path, including explicit file inputs filtered before file-type probing and explicit missing inputs that could not produce a normal resolution probe.

If strict configuration validation stops probe before resolution begins, JSON output contains the same meta and config_diagnostics structure but may omit the normal probes payload.

NDJSON schema

NDJSON output follows the same stable config prefix used by processing commands, then emits one probe record per probe result:

{"kind":"config","meta":{...},"config":{...}}
{"kind":"config_diagnostics","meta":{...},"config_diagnostics":{"diagnostic_counts":{...}}}
{"kind":"diagnostic","meta":{...},"diagnostic":{"domain":"config","level":"warning","message":"..."}}
{"kind":"probe","meta":{...},"probe":{ /* per-path probe payload */ }}

NDJSON rules for probe:

  • Every record includes kind and meta.
  • Payload container key matches kind.
  • The stream begins with:
      • config
      • config_diagnostics (counts-only)
      • zero or more diagnostic records (each with domain="config")
  • Then one probe record is emitted per probe result.

If strict configuration validation stops probe before resolution begins, the stream may end after the config_diagnostics and diagnostic records, before any probe records are emitted.

The JSON probes key and NDJSON probe kind are defined in topmark.pipeline.machine.schemas.PipelineKey and topmark.pipeline.machine.schemas.PipelineKind. The NDJSON stream is produced by:


Per-path probe payload

Each element of the JSON probes array and each NDJSON probe record contains a per-path resolution probe payload. Most probe payloads correspond to files that reached file-type probing, but explicit file inputs filtered during discovery are also represented as probe payloads. Explicit missing inputs are represented as probe_missing payloads when no normal resolution probe could be recorded.

High-level shape:

{
  "path": "README.md",
  "status": "resolved",
  "reason": "selected_highest_score",
  "selected_file_type": {
    "qualified_key": "topmark:markdown",
    "namespace": "topmark",
    "local_key": "markdown",
    "score": 54
  },
  "selected_processor": {
    "qualified_key": "topmark:html",
    "namespace": "topmark",
    "local_key": "html"
  },
  "candidates": [
    {
      "qualified_key": "topmark:markdown",
      "namespace": "topmark",
      "local_key": "markdown",
      "score": 54,
      "selected": true,
      "tie_break_rank": 1,
      "match": {
        "extension": true,
        "filename": false,
        "pattern": false,
        "content_probe_allowed": false,
        "content_match": false,
        "content_error": null
      }
    }
  ]
}

Filtered explicit input shape:

{
  "path": "__pycache__/example.cpython-312.pyc",
  "status": "filtered",
  "reason": "excluded_by_path_filter",
  "selected_file_type": null,
  "selected_processor": null,
  "candidates": []
}

Missing explicit input shape:

{
  "path": "topmark-does-not-exist",
  "status": "probe_missing",
  "reason": "no_resolution_probe_result",
  "selected_file_type": null,
  "selected_processor": null,
  "candidates": []
}

Filtered probe payloads are emitted only for explicit file inputs supplied to topmark probe (including paths loaded via --files-from). TopMark does not enumerate every recursively discovered file that was ignored by discovery filters.

Explicit directories that successfully expand to selected child files are treated as discovery sources and are not emitted as separate filtered probe payloads. Explicit missing inputs are emitted as probe_missing payloads rather than filtered payloads.

Filtered probe payloads may use one of these reasons:

  • excluded_by_path_filter - excluded by path-based include/exclude rules.
  • excluded_by_file_type_filter - excluded by file-type include/exclude rules after identifier normalization to canonical qualified keys.
  • excluded_by_discovery_filter - excluded before probing, but exact category was not identified.

Fields:

  • path: probed filesystem processing path, serialized with POSIX / separators. For normal probe payloads, this is the selected processing path after filesystem-identity evaluation. For filtered and missing explicit inputs, this remains the explicit path supplied to the command because no normal processing path was selected.
  • status: probe status, currently one of:
      • resolved - a file type and processor were selected.
      • unsupported - no file type candidate matched.
      • no_processor - a file type was selected, but no processor binding was available.
      • filtered - an explicitly requested file input was filtered during discovery before file-type probing.
      • probe_missing - an explicit input could not produce a normal resolution probe payload.
  • reason: machine-friendly explanation for the status and selection, for example:
      • selected_highest_score
      • selected_by_tie_break
      • no_candidates
      • selected_file_type_has_no_bound_processor
      • excluded_by_path_filter
      • excluded_by_file_type_filter
      • excluded_by_discovery_filter
      • no_resolution_probe_result
      • hard_link_duplicate
  • selected_file_type: selected canonical file type identity and score, or null when unresolved, unbound, filtered, or missing.
  • selected_processor: selected processor identity, or null when unresolved, unbound, filtered, or missing.
  • candidates: scored candidate file types in deterministic resolution order. Empty for filtered explicit inputs, missing explicit inputs, and unsupported paths with no candidates.

Candidate fields:

  • qualified_key, namespace, local_key: canonical file type identity fields. Consumers should prefer qualified_key for stable comparisons.
  • score: resolver score for this candidate. Higher scores are preferred.
  • selected: whether this candidate is the effective selected file type.
  • tie_break_rank: one-based deterministic rank after score and tie-break ordering.
  • match: probe-visible match signals:
      • extension
      • filename
      • pattern
      • content_probe_allowed
      • content_match
      • content_error

Note

Scores are exposed for explainability and ordering. Automation should primarily rely on status, selected, and canonical identities (qualified_key, namespace, local_key) rather than hard-coding exact numeric scores.
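
In line with that advice, a consumer might reduce each probe payload to its status and canonical identities, as in the sketch below. The helper name summarize_probe and the output key names file_type_key and processor_key are illustrative choices, not part of the probe schema. Null selections are handled so that filtered, missing, and unsupported payloads do not need special cases.

```python
from typing import Any, Optional


def summarize_probe(probe: dict[str, Any]) -> dict[str, Optional[str]]:
    """Reduce a per-path probe payload to status and canonical identities."""
    file_type = probe.get("selected_file_type") or {}
    processor = probe.get("selected_processor") or {}
    return {
        "path": probe["path"],
        "status": probe["status"],
        "file_type_key": file_type.get("qualified_key"),
        "processor_key": processor.get("qualified_key"),
    }


resolved = {
    "path": "README.md",
    "status": "resolved",
    "selected_file_type": {"qualified_key": "topmark:markdown", "score": 54},
    "selected_processor": {"qualified_key": "topmark:html"},
    "candidates": [],
}
filtered = {
    "path": "__pycache__/example.cpython-312.pyc",
    "status": "filtered",
    "selected_file_type": None,
    "selected_processor": None,
    "candidates": [],
}

print(summarize_probe(resolved)["file_type_key"])  # topmark:markdown
print(summarize_probe(filtered)["file_type_key"])  # None
```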

Filtered probe payloads have no candidate-level match object because file-type probing did not run. The reason identifies whether the path was excluded by path filters, file-type filters, or a generic discovery filter fallback.

Note:

  • Explicit missing literal inputs are represented as probe_missing probe payloads and still fail the CLI invocation with FILE_NOT_FOUND (66).
  • Synthetic filtered probe entries may still appear for explicitly requested files that were filtered during discovery, but exit-code precedence is resolved at the CLI layer.

Processing commands (check, strip)

Processing commands produce either detail output (per-file results) or summary output (bucket counts), depending on whether the CLI is in --summary mode.

Processing machine-readable output is unaffected by TEXT verbosity or quiet mode; those flags affect only human TEXT output.

JSON schema (detail mode)

Detail mode corresponds to summary_mode = false.

{
  "meta": { /* MetaPayload */ },
  "config": { /* RuntimeConfigPayload */ },
  "config_diagnostics": { /* ConfigDiagnosticsPayload */ },
  "results": [
    { /* per-file result payload */ }
  ]
}

If strict configuration validation stops a processing command before file discovery or pipeline execution begins, JSON output contains the same meta and config_diagnostics structure but may omit the normal results or summary payload.

JSON schema (summary mode)

Summary mode corresponds to summary_mode = true.

{
  "meta": { /* MetaPayload */ },
  "config": { /* RuntimeConfigPayload */ },
  "config_diagnostics": { /* ConfigDiagnosticsPayload */ },
  "summary": [
    { "outcome": "unchanged", "reason": "up-to-date", "count": 30 },
    { "outcome": "would insert", "reason": "header missing, changes found", "count": 1 }
  ]
}

Important characteristics:

  • The summary is not nested by outcome.
  • Each row has three stable fields:
      • outcome - pipeline outcome (e.g. inserted, replaced, unchanged).
      • reason - short lowercase bucket reason used for grouping.
      • count - number of files in that bucket.
  • Ordering is deterministic: outcomes follow the internal Outcome ordering and reasons are alphabetically sorted within each outcome.
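
Because the summary is a flat list of rows, per-outcome totals have to be computed by the consumer. A minimal sketch, assuming only the three row fields listed above (the helper name counts_by_outcome is hypothetical):

```python
from collections import Counter
from typing import Any


def counts_by_outcome(summary_rows: list[dict[str, Any]]) -> dict[str, int]:
    """Sum the count field of all rows that share an outcome."""
    totals: Counter[str] = Counter()
    for row in summary_rows:
        totals[row["outcome"]] += row["count"]
    return dict(totals)


rows = [
    {"outcome": "unchanged", "reason": "up-to-date", "count": 30},
    {"outcome": "would insert", "reason": "header missing, changes found", "count": 1},
]
print(counts_by_outcome(rows))  # {'unchanged': 30, 'would insert': 1}
```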

NDJSON schema (detail and summary)

NDJSON output is a stream with a stable prefix and then either result records (detail) or summary records (summary).

Example stream:

{"kind": "config", "meta": { /* MetaPayload */ }, "config": { /* ConfigPayload */ }}

{"kind": "config_diagnostics",
 "meta": { /* MetaPayload */ },
 "config_diagnostics": { "diagnostic_counts": {"info": 0, "warning": 1, "error": 0} } }

{"kind": "diagnostic",
 "meta": { /* MetaPayload */ },
 "diagnostic": { "domain": "config", "level": "warning", "message": "..." } }

{"kind": "result",
 "meta": { /* MetaPayload */ },
 "result": { /* per-file result payload */ } }

In summary mode, per-file result records are replaced by one summary record per (outcome, reason) bucket:

{"kind":"summary","meta":{ /* MetaPayload */ },"summary":{"outcome":"unchanged","reason":"up-to-date","count":30}}
{"kind":"summary","meta":{ /* MetaPayload */ },"summary":{"outcome":"skipped","reason":"known file type, headers not supported","count":1}}

NDJSON rules for processing commands:

  • Every record includes kind and meta.
  • Payload container key matches kind.
  • The stream begins with:
      • config
      • config_diagnostics (counts-only)
      • zero or more diagnostic records (each with domain="config", using the shared diagnostic-domain value from topmark.core.machine.schemas.MachineDomain)
  • Then either:
      • detail mode: one result record per file
      • summary mode: one summary record per (outcome, reason) bucket

If strict configuration validation stops a processing command before file discovery or pipeline execution begins, the stream may end after the config_diagnostics and diagnostic records, before any result or summary records are emitted.
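
The rules above suggest a consumer that does not assume which body records follow the prefix. The sketch below is one possible approach, with a hypothetical helper name and abbreviated sample records. It returns a "no-results" label both for streams that stopped after the prefix and for runs that produced no result or summary records; the two cases are distinguished by the process exit code, not by the stream.

```python
import json
from typing import Any, Iterable


def classify_stream(lines: Iterable[str]) -> tuple[str, list[dict[str, Any]]]:
    """Return ("detail" | "summary" | "no-results", payloads) for a processing stream."""
    results: list[dict[str, Any]] = []
    summary: list[dict[str, Any]] = []
    for line in lines:
        if not line.strip():
            continue
        record = json.loads(line)
        kind = record["kind"]
        if kind == "result":
            results.append(record["result"])
        elif kind == "summary":
            summary.append(record["summary"])
        # Prefix records and unknown kinds are skipped here.
    if results:
        return "detail", results
    if summary:
        return "summary", summary
    return "no-results", []


stream = [
    '{"kind":"config","meta":{},"config":{}}',
    '{"kind":"config_diagnostics","meta":{},"config_diagnostics":{"diagnostic_counts":{"info":0,"warning":0,"error":0}}}',
    '{"kind":"summary","meta":{},"summary":{"outcome":"unchanged","reason":"up-to-date","count":30}}',
]
mode, payloads = classify_stream(stream)
print(mode, payloads[0]["count"])  # summary 30
```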

The NDJSON record stream is produced by:


Per-file result payload

Each element of the JSON results array (detail mode) and each NDJSON result record contains a per-file processing result payload.

The exact field set can evolve over time, but the payload is intended to be:

  • JSON-safe (no ANSI / terminal formatting),
  • stable for CI and tooling integration,
  • tolerant of additive changes.

The canonical builders and typing live under:

Per-file result payloads report the selected processing path. If a file is reached through a symlink, the emitted path describes the resolved processing target rather than the symlink spelling. This is the same path identity used by runtime pipeline processing and generated filesystem-related header metadata.

If two or more selected processing paths are hard links to the same filesystem object, processing machine output still contains one result per selected path. Each affected result reports status.fs.label = "hard-linked processing target" and is classified as a policy-blocked skip. TopMark does not choose a source, target, winner, or loser path for the hard-link group. Unrelated selected files continue to produce normal results. Probe machine output reports affected paths as status = "unsupported" with reason = "hard_link_duplicate".

At a high level, per-file results include:

  • identity:
      • path (selected processing path, serialized with POSIX / separators)
      • file_type (resolved canonical TopMark file type key, for example topmark:python)
  • pipeline execution:
      • executed step names
      • per-axis status objects (axis, name, label)
  • derived intent/outcome helpers:
      • change intent / feasibility booleans
      • strip/insert/update intent summaries
  • optional diagnostics (per-file):
      • list of diagnostics (when requested / enabled)
      • pre-computed diagnostic counts

Note

  • Diffs (--diff) and any ANSI coloring are human-only and are not included in machine payloads.
  • Diff file labels use human-facing display paths, including the logical --stdin-filename for STDIN-backed processing when available. They are not machine-readable path serialization fields.
  • Human presentation controls such as -v / --verbose and -q / --quiet are ignored by machine-readable output. Consumers should use JSON/NDJSON fields rather than relying on TEXT, Markdown, or unified diff rendering.

ConfigPayload

ConfigPayload is a JSON-safe representation of the effective runtime configuration snapshot, derived from FrozenConfig and produced by topmark.config.machine.payloads.build_config_payload.

High-level structure (keys may be extended over time):

  • fields: header fields and their effective values.
  • header: header-related configuration.
  • formatting: formatting-related configuration.
  • writer: persisted writer options and related settings (enums serialized to strings).
  • files: file resolution/filtering options (filesystem paths serialized with POSIX / separators; synthetic config source identifiers remain stable labels).
  • policy: global resolved policy flags (booleans).
  • policy_by_type: per-file-type resolved policy overrides.

File type identifiers in files.include_file_types, files.exclude_file_types, and policy_by_type are emitted after configuration normalization. Consumers should expect canonical qualified keys such as topmark:python rather than the exact local-or-qualified spelling supplied by a user.

The effective runtime configuration reflected in ConfigPayload is produced after project-chain discovery from the resolved discovery anchor, configuration-source identity normalization, precedence evaluation, and runtime overlays have completed.

ConfigPayload may include resolved configuration file paths in fields such as files.config_files. Those paths describe configuration sources participating in the effective configuration; they are separate from config_provenance.discovery_anchor, which is emitted only when layered provenance is requested. If multiple inputs resolve to the same configuration-source identity, TopMark keeps the highest-precedence occurrence and emits that source identity once.

Normalization rules:

  • Path → POSIX-style string using / separators
  • str-backed Enum / StrEnum → string value (.value)
  • other Enum values → enum member name (.name)
  • nested mappings/sequences → standard JSON objects/arrays
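
These rules can be illustrated with a small recursive function. This is a sketch of the rules as listed, not the code in topmark.config.machine.payloads; the normalize helper and the two enum classes are hypothetical.

```python
from enum import Enum
from pathlib import PurePath, PureWindowsPath
from typing import Any


class WriterStrategy(str, Enum):  # hypothetical str-backed enum
    ATOMIC = "atomic"


class Mode(Enum):  # hypothetical plain enum
    FAST = 1


def normalize(value: Any) -> Any:
    """Apply the normalization rules listed above to a nested value."""
    if isinstance(value, PurePath):
        return value.as_posix()  # POSIX-style string with / separators
    if isinstance(value, Enum):
        # str-backed enums serialize to .value, other enums to .name
        return value.value if isinstance(value, str) else value.name
    if isinstance(value, dict):
        return {str(key): normalize(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [normalize(item) for item in value]
    return value


snapshot = normalize({
    "root": PureWindowsPath(r"src\pkg"),
    "strategy": WriterStrategy.ATOMIC,
    "mode": Mode.FAST,
    "include": ("a.py", "b.py"),
})
print(snapshot)
# {'root': 'src/pkg', 'strategy': 'atomic', 'mode': 'FAST', 'include': ['a.py', 'b.py']}
```

The Enum check runs before any string handling because a str-backed enum member is also a str instance.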

For the current exact fields, see:


ConfigDiagnosticsPayload

ConfigDiagnosticsPayload summarizes the flattened compatibility view derived from staged validation logs.

Diagnostics may include runtime applicability warnings for unknown, malformed, or ambiguous file type identifiers encountered during configuration sanitation.

These diagnostics may originate from staged config-loading validation logs for:

  • TOML-source diagnostics
  • merged-config diagnostics
  • runtime applicability diagnostics

For the stable 1.x line, the machine-readable contract for configuration-loading diagnostics is this flattened compatibility view. Stage-local validation structure remains internal and is not serialized directly.

JSON shape:

{
  "diagnostic_counts": { "info": 1, "warning": 2, "error": 0 },
  "diagnostics": [
    { "level": "warning", "message": "..." },
    { "level": "info", "message": "..." }
  ]
}

The individual diagnostic entry shape is intentionally fixed at {level, message} for the stable 1.x line.

Example JSON diagnostics payload:

{
  "diagnostic_counts": { "info": 0, "warning": 2, "error": 0 },
  "diagnostics": [
    { "level": "warning", "message": "Duplicate included file types found in config" },
    { "level": "warning", "message": "Unknown included file types specified" }
  ]
}

Note

In NDJSON, config_diagnostics is counts-only and each individual config diagnostic is emitted as a separate diagnostic record with domain="config" (one record per diagnostic).

Example NDJSON diagnostics records:

{"kind":"config_diagnostics","meta":{...},"config_diagnostics":{"diagnostic_counts":{"info":0,"warning":2,"error":0}}}
{"kind":"diagnostic","meta":{...},"diagnostic":{"domain":"config","level":"warning","message":"Duplicate included file types found in config"}}
{"kind":"diagnostic","meta":{...},"diagnostic":{"domain":"config","level":"warning","message":"Unknown included file types specified"}}
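
A consumer that prefers the JSON-shaped payload can rebuild it from the NDJSON records. The sketch below assumes the record shapes shown in this section; the helper name rebuild_config_diagnostics is hypothetical and the meta blocks in the sample are left empty for brevity.

```python
import json
from typing import Any, Iterable


def rebuild_config_diagnostics(lines: Iterable[str]) -> dict[str, Any]:
    """Fold NDJSON prefix records into the JSON ConfigDiagnosticsPayload shape."""
    payload: dict[str, Any] = {"diagnostic_counts": {}, "diagnostics": []}
    for line in lines:
        if not line.strip():
            continue
        record = json.loads(line)
        if record["kind"] == "config_diagnostics":
            payload["diagnostic_counts"] = record["config_diagnostics"]["diagnostic_counts"]
        elif record["kind"] == "diagnostic" and record["diagnostic"].get("domain") == "config":
            entry = record["diagnostic"]
            # Keep the fixed {level, message} entry shape of the JSON payload.
            payload["diagnostics"].append({"level": entry["level"], "message": entry["message"]})
    return payload


ndjson_lines = [
    '{"kind":"config_diagnostics","meta":{},"config_diagnostics":{"diagnostic_counts":{"info":0,"warning":1,"error":0}}}',
    '{"kind":"diagnostic","meta":{},"diagnostic":{"domain":"config","level":"warning","message":"Unknown included file types specified"}}',
]
rebuilt = rebuild_config_diagnostics(ndjson_lines)
print(rebuilt["diagnostic_counts"]["warning"], len(rebuilt["diagnostics"]))  # 1 1
```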


Configuration snapshot commands (config dump, config defaults, config init)

These commands produce a runtime configuration snapshot without running the processing pipeline.

Notes:

  • config dump emits the resolved runtime configuration snapshot after discovery and merge.
  • config dump --show-layers additionally emits a machine-readable config_provenance payload before the final flattened configuration snapshot.
  • config defaults emits the built-in default runtime configuration snapshot.
  • config init emits the same built-in default runtime configuration snapshot in machine-readable formats, even though its human-facing output is the bundled example TopMark TOML resource with comments.

Machine-readable output for these commands is unaffected by TEXT verbosity or quiet mode.

JSON shape for config dump, config defaults, config init

{
  "meta": { /* MetaPayload */ },
  "config": { /* ConfigPayload */ }
}

When config dump is invoked with --show-layers, the JSON envelope becomes:

{
  "meta": { /* MetaPayload */ },
  "config_provenance": {
    "discovery_anchor": "/repo",
    "config_layers": [
      {
        "origin": "<defaults>",
        "kind": "default",
        "precedence": 0,
        "toml": {
          "config": { "strict": false },
          "writer": { "strategy": "atomic" }
        }
      }
    ]
  },
  "config": { /* ConfigPayload */ }
}

config_provenance is an inspection-oriented payload. It contains:

  • discovery_anchor - resolved project/local discovery anchor used to find discovered config sources, when available
  • config_layers - ordered provenance layers

Each layer contains:

  • origin - provenance origin label
  • kind - resolved config layer kind
  • precedence - numeric layer precedence
  • scope_root - optional scope root for discovered layers
  • toml - the source-local TopMark TOML fragment contributed by that layer

NDJSON shape for config dump, config defaults, config init

Default mode emits a single record:

{"kind": "config", "meta": { /* MetaPayload */ }, "config": { /* ConfigPayload */ }}

When config dump is invoked with --show-layers, NDJSON emits two records in order:

{"kind": "config_provenance", "meta": { /* MetaPayload */ }, "config_provenance": { /* TomlProvenancePayload */ }}
{"kind": "config", "meta": { /* MetaPayload */ }, "config": { /* ConfigPayload */ }}

TomlProvenancePayload

TomlProvenancePayload is a machine-readable layered provenance export used by topmark config dump --show-layers.

JSON shape:

{
  "discovery_anchor": "/repo",
  "config_layers": [
    {
      "origin": "<defaults>",
      "kind": "default",
      "precedence": 0,
      "toml": {
        "config": { "strict": false },
        "header": { "fields": ["file", "file_relpath"] },
        "writer": { "strategy": "atomic" }
      }
    },
    {
      "origin": "/repo/pyproject.toml",
      "kind": "discovered",
      "precedence": 1,
      "scope_root": "/repo",
      "toml": {
        "fields": { "project": "TopMark" },
        "writer": { "strategy": "atomic" }
      }
    }
  ]
}

The payload is inspection-oriented rather than a loadable topmark.toml document. It mirrors the human-facing layered TOML export by preserving ordered layers and the corresponding source-local TopMark TOML fragments.

Real filesystem discovery_anchor, origin, and scope_root values use POSIX / separators on all platforms. Synthetic layer origins such as built-in defaults are stable labels rather than filesystem paths.

File-backed configuration provenance uses configuration-source identity based on the resolved configuration-file target. If a configuration file is loaded through a symlink, origin and scope_root describe the resolved configuration target and its scope rather than the symlink spelling.

Workspace-root discovery is evaluated earlier. Project-chain discovery determines which configuration files participate in layered configuration construction before provenance identities are assigned. discovery_anchor records the resolved project/local starting directory for that discovery. It is not a configuration source, scope root, processing target, or filesystem identity. Provenance payloads then serialize discovered configuration-source origins and, when available, resolved scope roots for those layers.

The outer config_layers container key belongs to the config machine-readable output domain, while the inner provenance-layer fragment keys (origin, kind, precedence, toml, scope_root) are owned by topmark.toml.machine.schemas.TomlKey.
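A minimal sketch of inspecting such a payload, assuming it has already been extracted from the JSON or NDJSON output. The function name summarize_layers is hypothetical, and the sample is a trimmed copy of the example payload above. Because scope_root is optional, the sketch reads it with a default instead of indexing it directly.

```python
import json

# Trimmed copy of the example TomlProvenancePayload shown above.
PROVENANCE_JSON = """
{
  "discovery_anchor": "/repo",
  "config_layers": [
    {"origin": "<defaults>", "kind": "default", "precedence": 0,
     "toml": {"config": {"strict": false}}},
    {"origin": "/repo/pyproject.toml", "kind": "discovered", "precedence": 1,
     "scope_root": "/repo", "toml": {"fields": {"project": "TopMark"}}}
  ]
}
"""


def summarize_layers(provenance):
    """Return (precedence, kind, origin, scope_root) tuples in emitted layer order."""
    return [
        (layer["precedence"], layer["kind"], layer["origin"], layer.get("scope_root"))
        for layer in provenance.get("config_layers", [])
    ]


layer_summary = summarize_layers(json.loads(PROVENANCE_JSON))
for row in layer_summary:
    print(row)
```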


topmark config check

This command validates configuration and emits the resolved runtime configuration snapshot, configuration diagnostics, and a config_check status payload.

JSON shape for config check

{
  "meta": { /* MetaPayload */ },
  "config": { /* RuntimeConfigPayload */ },
  "config_diagnostics": { /* ConfigDiagnosticsPayload */ },
  "config_check": {
    "ok": true,
    "strict": false,
    "diagnostic_counts": { "info": 0, "warning": 1, "error": 0 },
    "config_files": ["..."]
  }
}
  • config: resolved runtime configuration snapshot.
  • config_diagnostics: full diagnostics payload, including counts and the list of individual config diagnostics.
  • config_check: command-status payload containing:
      • ok - whether validation succeeded
      • strict - whether strict config-checking mode was enabled
      • diagnostic_counts - counts by diagnostic level
      • config_files - config files that contributed to the resolved config, serialized with POSIX / separators for real filesystem paths and stable labels for synthetic sources

The strict field reflects the effective validation strictness used for the run. It is derived from TOML source configuration ([config].strict) and may be overridden by CLI or API inputs. This strictness is evaluated across staged config-loading validation, while config_diagnostics remains the flattened compatibility view exposed in machine-readable output.

For the stable 1.x line, this is the explicit contract decision: staged config-loading validation remains internal, and machine-readable output serializes only the flattened compatibility view.

Machine-readable output for config check is unaffected by TEXT verbosity or quiet mode.

NDJSON shape for config check

Stream prefix:

{"kind":"config","meta":{...},"config":{...}}
{"kind":"config_diagnostics","meta":{...},"config_diagnostics":{"diagnostic_counts":{...}}}
{"kind":"diagnostic","meta":{...},"diagnostic":{"domain":"config","level":"warning","message":"..."}}
{"kind":"config_check","meta":{...},"config_check":{...}}

The NDJSON stream follows the same stable prefix pattern used by processing commands:

  1. config
  2. config_diagnostics (counts-only)
  3. zero or more diagnostic records
  4. one final config_check record

Notes:

  • NDJSON follows the same counts-only + one diagnostic per line model for the flattened compatibility view.

(See topmark.config.machine.* for canonical builders/serializers.)

Config-specific JSON payload keys and NDJSON kinds are defined in topmark.config.machine.schemas.ConfigKey and topmark.config.machine.schemas.ConfigKind. Shared config diagnostic entry/count keys are defined in topmark.diagnostic.machine.schemas.
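The config check stream order described above could be consumed along these lines. This is a sketch under stated assumptions: read_config_check is a hypothetical helper, and the sample records are hand-written stand-ins with empty meta objects and a made-up diagnostic message.

```python
import json


def read_config_check(lines):
    """Collect diagnostic payloads and the final config_check payload."""
    diagnostics = []
    status = None
    for line in lines:
        if not line.strip():
            continue
        record = json.loads(line)
        kind = record.get("kind")
        if kind == "diagnostic":
            diagnostics.append(record["diagnostic"])
        elif kind == "config_check":
            status = record["config_check"]
        # "config", "config_diagnostics", and unknown kinds are skipped here.
    return diagnostics, status


# Hypothetical stream following the documented record order.
sample = [
    '{"kind":"config","meta":{},"config":{}}',
    '{"kind":"config_diagnostics","meta":{},"config_diagnostics":'
    '{"diagnostic_counts":{"info":0,"warning":1,"error":0}}}',
    '{"kind":"diagnostic","meta":{},"diagnostic":'
    '{"domain":"config","level":"warning","message":"example warning"}}',
    '{"kind":"config_check","meta":{},"config_check":{"ok":true,"strict":false,'
    '"diagnostic_counts":{"info":0,"warning":1,"error":0},"config_files":[]}}',
]

diagnostics, status = read_config_check(sample)
warnings = [d["message"] for d in diagnostics if d["level"] == "warning"]
print(status["ok"], warnings)
```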


topmark version

JSON shape for version

{
  "meta": { /* MetaPayload */ },
  "version_info": {
    "version": "<package version>",
    "version_format": "pep440"
  }
}

NDJSON shape for version

{"kind":"version","meta":{ /* MetaPayload */ },"version_info":{ "version":"<package version>", "version_format":"pep440" }}

The NDJSON version record kind is defined in topmark.version.machine.schemas.VersionKind, while JSON payload keys such as version_info are defined in topmark.version.machine.schemas.VersionKey.

The version reported in machine-readable output is derived from the installed package metadata / generated version module, not from a manually maintained static field in pyproject.toml.

Machine-readable output for version is unaffected by TEXT verbosity or quiet mode.

Notes:

  • version_format may be "pep440" or "semver" depending on --semver.
  • PEP 440 output is the canonical packaging version form used by Python packaging tools.
  • If SemVer conversion is requested and fails, TopMark falls back to PEP 440 output.
  • The machine envelope kind for this command is version, while the JSON payload container key is version_info.
  • For development builds between release tags, the reported version may include SCM-derived dev/local segments such as commit identifiers.
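Because SemVer conversion can fall back to PEP 440, a consumer may want to check version_format rather than assume the requested form. The sketch below illustrates that check on a hand-written JSON document; the helper names and the sample version string are hypothetical.

```python
import json


def read_version(document):
    """Return (version, version_format) from a version JSON document."""
    info = document["version_info"]
    return info["version"], info["version_format"]


def fell_back_to_pep440(requested_semver, version_format):
    """True when SemVer was requested but the output is not in SemVer form."""
    return requested_semver and version_format != "semver"


# Hypothetical document; the version string is illustrative only.
sample_document = json.loads(
    '{"meta": {}, "version_info": '
    '{"version": "1.2.3.dev4+gabc1234", "version_format": "pep440"}}'
)

version, version_format = read_version(sample_document)
fallback = fell_back_to_pep440(True, version_format)
print(version, version_format, fallback)
```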

Registry commands

Registry machine-readable output uses --long to select brief vs detailed projections. Registry commands do not support --quiet, and TEXT verbosity does not affect machine-readable output.

topmark registry filetypes

JSON envelope:

{
  "meta": { /* MetaPayload */ },
  "filetypes": [ /* FileTypeEntry ... */ ]
}

qualified_key is the canonical file type identity. namespace and local_key are provided for inspection, grouping, and display.

Detailed filenames values are canonical filename rules, not filesystem paths. Exact-basename rules are emitted unchanged, while tail-subpath rules are emitted with POSIX-style / separators across platforms.

Brief entry (default):

{
  "local_key": "python",
  "namespace": "topmark",
  "qualified_key": "topmark:python",
  "description": "Python source file"
}

Detailed entry (--long):

{
  "local_key": "python",
  "namespace": "topmark",
  "qualified_key": "topmark:python",
  "description": "Python source file",
  "bound": true,
  "extensions": [".py"],
  "filenames": [],
  "patterns": [],
  "skip_processing": false,
  "has_content_matcher": false,
  "has_insert_checker": false,
  "policy": {
    "supports_shebang": true,
    "encoding_line_regex": null,
    "pre_header_blank_after_block": 1,
    "ensure_blank_after_header": true,
    "blank_collapse_mode": "strict",
    "blank_collapse_extra": ""
  }
}

NDJSON emits one record per file type:

{"kind":"filetype","meta":{...},"filetype":{ /* FileTypeEntry */ }}

Canonical schemas/builders live in topmark.registry.machine.*.

The corresponding NDJSON record kind is owned by topmark.registry.machine.schemas.RegistryKind.
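A small sketch of indexing filetype records by their canonical identity. The helper names are hypothetical and the sample records are abbreviated. Fields such as extensions appear only in the detailed (--long) projection, so the sketch reads them with a default.

```python
import json


def index_filetypes(lines):
    """Map qualified_key to FileTypeEntry for every filetype record."""
    by_key = {}
    for line in lines:
        if not line.strip():
            continue
        record = json.loads(line)
        if record.get("kind") != "filetype":
            continue  # skip kinds this helper does not handle
        entry = record["filetype"]
        by_key[entry["qualified_key"]] = entry
    return by_key


def extensions_of(entry):
    """Return the extensions list, or [] for brief entries that omit it."""
    return entry.get("extensions", [])


# Abbreviated sample record in the detailed projection.
sample = [
    '{"kind":"filetype","meta":{},"filetype":{"local_key":"python",'
    '"namespace":"topmark","qualified_key":"topmark:python",'
    '"description":"Python source file","extensions":[".py"]}}',
]

filetypes = index_filetypes(sample)
print(extensions_of(filetypes["topmark:python"]))
```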

topmark registry processors

JSON envelope:

{
  "meta": { /* MetaPayload */ },
  "processors": [ /* ProcessorEntry ... */ ]
}

Processor entry (brief):

{
  "local_key": "python",
  "namespace": "topmark",
  "qualified_key": "topmark:python",
  "description": "Python-style line comment processor"
}

Processor entry (detailed, --long):

{
  "local_key": "python",
  "namespace": "topmark",
  "qualified_key": "topmark:python",
  "description": "Python-style line comment processor",
  "bound": true,
  "line_indent": "",
  "line_prefix": "# ",
  "line_suffix": "",
  "block_prefix": "",
  "block_suffix": ""
}

NDJSON emits one record per processor:

{"kind":"processor","meta":{...},"processor":{ /* ProcessorEntry */ }}

Canonical schemas/builders live in topmark.registry.machine.*.

The corresponding NDJSON record kind is owned by topmark.registry.machine.schemas.RegistryKind.

topmark registry bindings

JSON envelope:

{
  "meta": { /* MetaPayload */ },
  "bindings": [ /* BindingEntry ... */ ],
  "unbound_filetypes": [ /* FileTypeRef ... */ ],
  "unused_processors": [ /* ProcessorRef ... */ ]
}

Binding entry (brief):

{
  "file_type_key": "topmark:python",
  "processor_key": "topmark:python"
}

Binding references use canonical qualified keys. file_type_key references a file type qualified_key, and processor_key references a processor qualified_key.
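For example, a consumer could build a lookup from file type to processor out of the bindings list. This is a sketch only: it assumes each file type appears in at most one effective binding, and the envelope below is a hand-written sample with an empty meta object.

```python
import json

# Hand-written sample envelope in the brief projection.
envelope = json.loads(
    """
    {
      "meta": {},
      "bindings": [
        {"file_type_key": "topmark:python", "processor_key": "topmark:python"}
      ],
      "unbound_filetypes": [],
      "unused_processors": []
    }
    """
)

# Both keys are canonical qualified keys, so they can be used directly as identities.
processor_for = {
    binding["file_type_key"]: binding["processor_key"]
    for binding in envelope.get("bindings", [])
}

print(processor_for.get("topmark:python"))
```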

Binding entry (detailed, --long):

{
  "file_type_key": "topmark:python",
  "file_type_local_key": "python",
  "file_type_namespace": "topmark",
  "processor_key": "topmark:python",
  "processor_local_key": "python",
  "processor_namespace": "topmark",
  "file_type_description": "Python source file",
  "processor_description": "Python-style line comment processor"
}

Auxiliary lists:

  • unbound_filetypes contains file types that currently have no effective processor binding.
      • brief mode: qualified file type keys as strings
      • long mode: expanded FileTypeRefEntry objects
  • unused_processors contains registered processors that do not currently participate in any effective binding.
      • brief mode: qualified processor keys as strings
      • long mode: expanded processor reference objects containing identity and description fields

NDJSON emits:

{"kind":"binding","meta":{...},"binding":{ /* BindingEntry */ }}
{"kind":"unbound_filetype","meta":{...},"unbound_filetype":"topmark:some_unbound_filetype"}
{"kind":"unused_processor","meta":{...},"unused_processor":"topmark:some_unused_processor"}

In brief mode, unbound_filetype and unused_processor NDJSON records carry qualified-key strings as their payloads. In --long mode, those same record kinds carry expanded reference objects.

Canonical schemas/builders live in topmark.registry.machine.*.

The corresponding JSON payload keys and NDJSON record kinds are owned by topmark.registry.machine.schemas.RegistryKind.
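Since unbound_filetype and unused_processor payloads are strings in brief mode and objects in --long mode, a consumer that accepts both may want to normalize them. The sketch below assumes that the expanded reference objects expose a qualified_key field; that field name is an assumption based on the identity fields used elsewhere on this page, not something the auxiliary-list description states explicitly.

```python
def reference_key(payload):
    """Return the qualified key from a brief (str) or long (dict) reference payload."""
    if isinstance(payload, str):
        return payload  # brief mode: the payload is the qualified key itself
    # Long mode: "qualified_key" is an assumed field name for the reference object.
    return payload["qualified_key"]


brief_payload = "topmark:some_unbound_filetype"
long_payload = {
    "qualified_key": "topmark:some_unbound_filetype",
    "description": "hypothetical description",
}

print(reference_key(brief_payload) == reference_key(long_payload))
```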


Backwards compatibility and evolution

TopMark's machine-readable output schema is part of its stable integration surface. For the stable 1.x line, documented JSON and NDJSON shapes are treated as machine-readable compatibility contracts.

Consumers should:

  • Treat machine-readable output as the authoritative contract for programmatic use; do not parse TEXT or Markdown output in automation.
  • Rely on the kind field to identify and dispatch NDJSON records.
  • Treat unknown fields as optional/ignorable.
  • Prefer parsing and schema-tolerant logic over strict string matching.
  • Assume additive fields may appear over time within the 1.x compatibility model.
  • Prefer canonical identity fields such as qualified_key, file_type_key, and processor_key over display-oriented names.
  • Treat filesystem path, discovery_anchor, origin, and scope_root fields as serialized processing/provenance paths, not as lossless echoes of the original invocation spelling.
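The guidance above can be sketched as a tolerant NDJSON dispatcher that routes records by kind, ignores fields it does not know, and reports kinds it has no handler for instead of failing. The dispatch helper, the extra_field key, and the some_future_kind record are hypothetical and exist only to exercise the tolerant paths.

```python
import json


def dispatch(lines, handlers):
    """Route each NDJSON record to a handler chosen by its kind.

    Returns the kinds that had no handler so callers can log them.
    """
    unhandled = []
    for line in lines:
        if not line.strip():
            continue
        record = json.loads(line)
        kind = record.get("kind")
        handler = handlers.get(kind)
        if handler is None:
            unhandled.append(kind)
            continue
        handler(record)
    return unhandled


seen_versions = []
handlers = {
    # The version record keeps its payload under "version_info", not "version".
    "version": lambda record: seen_versions.append(record["version_info"]["version"]),
}

# Hypothetical stream: one known kind with an extra field, one unknown kind.
sample = [
    '{"kind":"version","meta":{},"version_info":'
    '{"version":"1.0.0","version_format":"pep440","extra_field":1}}',
    '{"kind":"some_future_kind","meta":{},"some_future_kind":{}}',
]

unhandled = dispatch(sample, handlers)
print(seen_versions, unhandled)
```

Registering a handler per kind, rather than deriving the payload key from the kind, keeps the consumer correct for records such as version, whose payload container key differs from its kind.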

Consumers should not infer the original workspace-root discovery-anchor spelling from machine-readable payloads. The stable 1.x contract serializes the resolved discovery anchor in config_provenance.discovery_anchor when provenance is requested, plus discovered configuration sources and scope roots where applicable.

Breaking machine-readable output changes should be signaled through Conventional Commits using the ! marker and documented in the changelog.