`topmark check`¶

Purpose: Verify TopMark headers and optionally insert or update them with --apply.

The check command verifies the presence and correctness of TopMark headers in targeted files. By default it runs as a dry-run: it does not modify files but reports which ones would need updates, and summaries end with - previewed. When run with --apply, files are actually modified and summaries end with - inserted, - replaced, or other terminal statuses.

Note

The canonical vocabulary used throughout the documentation is defined in Terminology and Canonical Vocabulary.

Note

Path representation

TopMark serializes machine-readable filesystem path fields with POSIX / separators on all platforms.

Path serialization is a presentation contract and is distinct from filesystem identity.

TopMark first determines the selected processing path for the filesystem target being processed and then serializes that processing path according to the machine-output contract.

This contract applies to:

header metadata path fields;
processing machine-output payloads;
probe machine-output payloads;
configuration machine-output payloads; and
TOML/config provenance payloads.

Examples:

real/file.py
./real/file.py
link-to-file.py

may refer to the same filesystem identity and therefore produce the same serialized processing path.

TopMark's machine-readable path fields remain path-based and are derived from the selected processing path for each processing target.

Filesystem identity policy is a separate concern from path serialization. TopMark may apply additional filesystem-identity rules when determining whether a processing target is eligible for processing. For example, selected hard-linked files are detected using device/inode identity and are reported as unsupported processing targets. Such checks do not alter the serialized path values emitted in machine-readable output.

Human-facing output follows display-path policy instead:

CLI and Markdown reports may use the host platform's native path representation;
STDIN-backed processing displays the logical --stdin-filename when available; and
unified diff file labels are human-facing display labels, not machine-readable path fields.

Synthetic configuration-source identifiers (for example built-in defaults) are serialized as stable labels rather than filesystem paths.
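
The serialization rule can be illustrated with Python's standard pathlib. This is a minimal sketch of the presentation contract only, not TopMark's implementation; the helper name serialize_path is hypothetical.

```python
from pathlib import PurePath, PurePosixPath, PureWindowsPath


def serialize_path(path: PurePath) -> str:
    """Render a path with POSIX '/' separators regardless of its flavour."""
    return path.as_posix()


# A Windows-style spelling and a POSIX-style spelling serialize identically.
print(serialize_path(PureWindowsPath(r"real\file.py")))  # real/file.py
print(serialize_path(PurePosixPath("real/file.py")))     # real/file.py
```

The sketch covers separators only. Deciding which path is the selected processing path (for example when a symlink is involved) is a separate step that happens before serialization.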

Quick start¶

# Dry-run: show which files would get a TopMark header or be updated
topmark check src/

# Apply in place
topmark check --apply src/

# Show unified diffs (human output)
topmark check --diff src/

# Summary-only view (CI-friendly)
topmark check --summary src/

# Suppress TEXT rendering and rely on the exit code
topmark check --quiet src/

# Render document-oriented Markdown output
topmark check --output-format markdown src/

# Treat staged configuration-loading validation warnings as errors for this run
topmark check --strict src/

# Read targets from stdin (one path per line) and generate unified diff output
git ls-files | topmark check --files-from - --diff

Input applicability¶

Dry-run by default; exit code WOULD_CHANGE (3) when changes would occur.
Preserves the file's original newline style (LF/CRLF/CR).
Preserves a leading UTF-8 BOM if present.
Places headers according to file-type policy (shebang and PEP 263 in Python; XML declaration/DOCTYPE in XML/HTML; no insertion inside Markdown fenced code).
Idempotent: re-running on already-correct files results in no changes.

STDIN modes¶

check supports both list STDIN mode (--files-from -, --include-from -, or --exclude-from -) and content STDIN mode (- plus --stdin-filename NAME). These modes are mutually exclusive.

With --apply in content mode, transformed content is written to STDOUT and diagnostics are routed to STDERR.

See shared input modes for the full STDIN contract, including why TopMark does not provide a --stdin option flag.
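
For scripted use, the two STDIN modes could be wrapped as follows. This is a sketch under stated assumptions: build_check_argv and run_check are hypothetical helper names, only flags documented on this page are used, and the option ordering is assumed to be accepted by the CLI parser.

```python
import subprocess
from typing import List, Optional, Tuple


def build_check_argv(
    *,
    list_from_stdin: bool = False,
    stdin_filename: Optional[str] = None,
    apply: bool = False,
) -> List[str]:
    """Build a `topmark check` command line for exactly one STDIN mode."""
    # The two modes are mutually exclusive, so require exactly one of them.
    if list_from_stdin == (stdin_filename is not None):
        raise ValueError("choose exactly one of list mode or content mode")
    argv = ["topmark", "check"]
    if apply:
        argv.append("--apply")
    if list_from_stdin:
        argv += ["--files-from", "-"]  # STDIN carries one path per line
    else:
        argv += ["-", "--stdin-filename", str(stdin_filename)]  # STDIN carries content
    return argv


def run_check(argv: List[str], stdin_text: str) -> Tuple[int, str, str]:
    """Run the command and return (exit code, STDOUT, STDERR)."""
    proc = subprocess.run(argv, input=stdin_text, capture_output=True, text=True)
    return proc.returncode, proc.stdout, proc.stderr
```

In content mode with --apply, the second element of the returned tuple would hold the transformed content and the third the diagnostics, matching the STDOUT/STDERR routing described above.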

Configuration and validation¶

check supports --strict / --no-strict to override the effective strict value for the run.

Before any file processing begins, TopMark performs whole-source TOML schema validation during configuration loading. TOML-source diagnostics (including missing-section INFO diagnostics) are evaluated together with merged-config and runtime applicability diagnostics during staged configuration-loading validation for the run.

Note

[config].strict is a TOML-source-local strictness preference controlling staged configuration-loading validation for the current TOML source.

Effective strictness is evaluated across:

TOML-source diagnostics;
merged-config diagnostics;
runtime applicability diagnostics.

When strict validation fails, TopMark exits with CONFIG_ERROR. The diagnostics that triggered the failure remain visible in human-readable and machine-readable output formats.

strict is resolved during TOML loading and does not become a layered configuration field.

TopMark resolves configuration from defaults, user config, the project chain discovered from the resolved discovery anchor, explicit --config files, and CLI overrides before staged validation produces the effective runtime configuration. For path-processing commands such as check, the discovery anchor is derived from the first selected input path when one is available, or from the current working directory otherwise.

Configuration discovery is evaluated before runtime filesystem-identity evaluation selects processing paths. Symlinked discovery anchors therefore affect which project configuration files are found before selected processing paths, header metadata, or machine-readable result.path fields are produced. See Configuration discovery, precedence, and policy for the full configuration-loading and validation contract.
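
The interaction between the CLI override and the TOML preference can be pictured with a small sketch. It simplifies heavily: a single TOML strict value is assumed, errors are assumed to fail regardless of strictness, and the function names are hypothetical. Only the info/warning/error count keys come from the machine-readable output described on this page.

```python
from typing import Mapping, Optional


def effective_strict(cli_strict: Optional[bool], toml_strict: Optional[bool]) -> bool:
    """--strict / --no-strict wins when given; otherwise use the TOML preference."""
    if cli_strict is not None:
        return cli_strict
    return bool(toml_strict)


def validation_fails(counts: Mapping[str, int], strict: bool) -> bool:
    """Errors always fail (assumed); warnings fail only under strict validation."""
    if counts.get("error", 0) > 0:
        return True
    return strict and counts.get("warning", 0) > 0


counts = {"info": 1, "warning": 2, "error": 0}
print(validation_fails(counts, effective_strict(None, False)))  # False
print(validation_fails(counts, effective_strict(True, False)))  # True
```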

Filtering and file discovery¶

TopMark determines which files to process using a combination of path-based filters and file-type filters.

Path arguments, include/exclude patterns, --files-from, and file-type filters follow the shared TopMark filtering pipeline. Positional paths and relative patterns are resolved from the current working directory; path-based filters run before file-type filters, and exclude rules take precedence. See Filtering for the full path discovery contract.

During discovery, TopMark performs filesystem-identity evaluation and selects processing paths. If multiple path spellings resolve to the same filesystem target (for example a symlink and its target), check processes the resolved target once. Downstream filtering, probing, header generation, and machine-readable output operate on the selected processing path rather than the original spelling. Hard-link policy is evaluated as a processing-target eligibility check: if multiple selected processing paths are hard links to the same filesystem object, each affected path is reported as an unsupported, policy-blocked processing target.

This runtime discovery stage is separate from configuration discovery. Project-chain configuration files have already been selected from the resolved discovery anchor before check evaluates file filters and processing-target identity.
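
The two identity rules above (collapse spellings of one target, block hard-link groups) can be sketched with the standard library. This is an illustration, not TopMark's code: select_processing_paths is a hypothetical name, and all inputs are assumed to exist.

```python
import os
from collections import defaultdict
from pathlib import Path
from typing import Iterable, List, Tuple


def select_processing_paths(inputs: Iterable[os.PathLike]) -> Tuple[List[Path], List[Path]]:
    """Return (selected, blocked) processing paths for existing inputs."""
    # Step 1: collapse spellings such as ./x or a symlink onto one resolved path.
    resolved: List[Path] = []
    seen = set()
    for raw in inputs:
        path = Path(raw).resolve()
        if path not in seen:
            seen.add(path)
            resolved.append(path)

    # Step 2: group the distinct resolved paths by (device, inode) identity.
    groups = defaultdict(list)
    for path in resolved:
        st = path.stat()
        groups[(st.st_dev, st.st_ino)].append(path)

    # Every member of a hard-link group is blocked; no preferred path is chosen.
    blocked = {p for group in groups.values() if len(group) > 1 for p in group}
    selected = [p for p in resolved if p not in blocked]
    return selected, sorted(blocked)
```

A symlink and its target yield one selected path, whereas two hard links to the same file yield no selected path and two blocked ones.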

File type filters¶

--include-file-types / -t Restrict processing to the given file type identifiers. May be repeated and/or provided as a comma-separated list.
--exclude-file-types / -T Exclude the given file type identifiers. May be repeated and/or provided as a comma-separated list.

Exclude rules take precedence over include rules.

TopMark accepts file type identifiers in local form, such as python, or qualified form, such as topmark:python.

Local identifiers are accepted only when unambiguous. Internally, TopMark normalizes identifiers to canonical qualified file type identities before filtering, runtime resolution, policy evaluation, diagnostics, and registry lookup.

See file-type filtering for the full identifier contract.

Examples:

topmark check --include-file-types python src/
topmark check --include-file-types topmark:python src/
topmark check --exclude-file-types topmark:markdown docs/
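
The local-versus-qualified rule can be sketched as a lookup against a list of qualified identifiers. The function name, the registry argument, and the acme namespace below are hypothetical; only the namespace:name form is taken from this page.

```python
from typing import Iterable


def normalize_file_type(identifier: str, registry: Iterable[str]) -> str:
    """Map a local or qualified identifier to one qualified identifier."""
    known = list(registry)
    if ":" in identifier:
        # Qualified form: must match a registered identity exactly.
        if identifier not in known:
            raise KeyError(f"unknown file type: {identifier}")
        return identifier
    # Local form: accepted only when exactly one namespace provides it.
    matches = [q for q in known if q.split(":", 1)[1] == identifier]
    if len(matches) != 1:
        raise ValueError(f"{identifier!r} matches {len(matches)} file types")
    return matches[0]


registry = ["topmark:python", "topmark:markdown"]
print(normalize_file_type("python", registry))  # topmark:python
```

If a second namespace also registered a python type, the local form would raise and the qualified form would be required.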

Path-based filters¶

--include, --exclude Include or exclude glob patterns.
--include-from, --exclude-from Load patterns from files (one per line).
--files-from Provide an explicit list of files to process.

Notes:

Positional arguments are parsed by Click and resolved relative to the current working directory (CWD).
Unknown option-like tokens before the standard -- delimiter are parser errors. Use -- before literal path names that begin with a dash, for example topmark check -- --generated.py.
Patterns in --include, --exclude, and the files passed to --include-from / --exclude-from are also resolved relative to CWD. Absolute patterns are not supported.
Path-based filters are evaluated before file-type filters.
Existing filesystem inputs are normalized to selected processing paths before runtime processing.
Symlink spellings are not preserved for runtime identity, generated filesystem-related header metadata, or machine-readable result.path fields.
Hard-linked selected paths are handled as processing-target eligibility failures. Each affected path is reported independently and blocked from processing; TopMark does not select a preferred source, target, winner, or loser path.
Exclude rules win over include rules when both match a path.
File-type filters are applied after path-based include/exclude filtering.
File-type filters are normalized to canonical qualified file type identities before filtering, runtime resolution, policy evaluation, diagnostics, and registry lookup.
Explicit missing literal paths (for example fubar.py) are reported as FILE_NOT_FOUND (66).
Unmatched glob patterns (for example missing/**/*.py) are treated as soft discovery diagnostics and do not fail check.

--report controls the scope of human per-file TEXT rendering only. It does not affect processing, mutation behavior, summaries, machine-readable output, or exit-code selection.

Values:

actionable: show files that would change, were changed, failed, or otherwise require attention; unsupported entries are hidden from the per-file listing, although summaries may still count them.
noncompliant: show actionable entries plus unsupported entries.
all: show every processed result, including unchanged/compliant entries.
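
The three scopes nest, which a small lookup table makes explicit. The category labels used here (actionable, unsupported, compliant) are hypothetical shorthand for the groups described above, not TopMark identifiers.

```python
# Which result categories each --report scope renders per file (sketch).
VISIBLE = {
    "actionable": {"actionable"},
    "noncompliant": {"actionable", "unsupported"},
    "all": {"actionable", "unsupported", "compliant"},
}


def is_rendered(category: str, scope: str) -> bool:
    """Return True when a per-file entry of this category is listed."""
    return category in VISIBLE[scope]


print(is_rendered("unsupported", "actionable"))    # False
print(is_rendered("unsupported", "noncompliant"))  # True
```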

Example¶

# Use include/exclude files with relative patterns
printf "*.py\n" > inc.txt
printf "tests/*\n# ignored\n" > exc.txt

topmark check --include-from inc.txt --exclude-from exc.txt --diff
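
The effect of the two pattern files can be approximated in a few lines. This sketch uses fnmatchcase from the standard library as a stand-in matcher, so its glob semantics may differ from TopMark's; the helper names are hypothetical, and treating an empty include list as "include everything" is an assumption.

```python
from fnmatch import fnmatchcase
from typing import List


def load_patterns(text: str) -> List[str]:
    """Parse a pattern file: one pattern per line, '#' comments and blanks skipped."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def is_selected(path: str, includes: List[str], excludes: List[str]) -> bool:
    """Apply include/exclude patterns; an exclude match always wins."""
    if any(fnmatchcase(path, pat) for pat in excludes):
        return False
    return not includes or any(fnmatchcase(path, pat) for pat in includes)


includes = load_patterns("*.py\n")
excludes = load_patterns("tests/*\n# ignored\n")
print(is_selected("src/app.py", includes, excludes))         # True
print(is_selected("tests/test_app.py", includes, excludes))  # False
```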

Command-specific policy options¶

The check command supports policy overrides that control how headers are inserted or updated.

Empty file behavior¶

--allow-header-in-empty-files / --no-allow-header-in-empty-files
--empty-insert-mode

These options control how check classifies empty files and whether headers may be inserted.

--empty-insert-mode defines which empty or empty-like files are eligible for insertion:

bytes_empty: only true 0-byte files
logical_empty: true 0-byte files plus logically empty placeholders
whitespace_empty: any decoded content containing only whitespace or newlines

This policy affects dry-run reporting, --apply behavior, API result views, and semantic runtime outcome bucketing.

This classification is evaluated together with --allow-header-in-empty-files:

when disabled (default), empty-like files are treated as unchanged and compliant
when enabled, eligible empty-like files may receive headers, subject to safety gates

--render-empty-header-when-no-fields is separate and controls whether an otherwise empty header may be rendered when no fields are configured.

Safety gates still take precedence. Unreadable files, unsupported files, malformed headers, blocked filesystem states, and other non-mutable conditions are not made mutable by these options.
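
How the two options combine can be sketched as follows. This page does not define what counts as a logically empty placeholder, so the sketch takes that test as a caller-supplied predicate; the function names are hypothetical and safety gates are not modelled.

```python
from typing import Callable


def is_empty_like(data: bytes, mode: str, is_placeholder: Callable[[bytes], bool]) -> bool:
    """Classify file content according to an --empty-insert-mode value."""
    if mode == "bytes_empty":
        return len(data) == 0
    if mode == "logical_empty":
        return len(data) == 0 or is_placeholder(data)
    if mode == "whitespace_empty":
        # utf-8-sig drops a leading BOM before the whitespace test.
        return data.decode("utf-8-sig").strip() == ""
    raise ValueError(f"unknown empty-insert mode: {mode!r}")


def empty_file_may_receive_header(
    data: bytes, mode: str, allow: bool, is_placeholder: Callable[[bytes], bool]
) -> bool:
    """Eligible only when insertion is allowed and the mode classifies the file as empty."""
    return allow and is_empty_like(data, mode, is_placeholder)
```

With allow set to False the function always returns False, which corresponds to the default of treating empty-like files as unchanged and compliant.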

Formatting and safety¶

--allow-reflow / --no-allow-reflow
--render-empty-header-when-no-fields / --no-render-empty-header-when-no-fields

These options influence rendering behavior and idempotence.

Shared policy¶

--allow-content-probe / --no-allow-content-probe

Controls whether file-type detection may inspect file contents when needed.

Behavior details¶

Placement rules (processor-aware):
Pound-style processors (for example Python, Shell, Ruby, Makefile): after the shebang (and optional encoding line), else at the top; keep exactly one blank line around the block, as per policy.
Slash-style processors (for example C, C++, TypeScript): at top with consistent spacing.
XML/HTML processors: after the XML declaration and DOCTYPE; maintain a single intentional blank line; never break the declaration.
Markdown processor: uses HTML comments for the header; fenced code blocks are ignored for detection.
Newline/BOM preservation: preserved for both insert and replace operations. The reader normalizes content in memory; the updater reattaches the BOM and keeps the original line endings.
Header metadata path fields: generated from the selected processing target. If a file is reached through a symlink, file_relpath, file_abspath, relpath, and abspath describe the resolved target TopMark reads and writes rather than the symlink spelling.
Hard-link safety: if multiple selected paths refer to the same filesystem object through hard links, check blocks every affected path. No header is inserted or updated for those paths, and no source, target, winner, or loser path is selected.
Idempotency: running topmark check again on a file that already has a correct header produces no diff and exit code 0 (unless other files would change).
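
The pound-style placement and the newline/BOM preservation rules can be combined into one sketch. It is an illustration under assumptions, not TopMark's updater: insert_pound_header is a hypothetical name, the header text is a placeholder, and existing-header detection (needed for replacement and idempotency) is left out.

```python
import codecs
import re
from typing import List

BOM = codecs.BOM_UTF8
# PEP 263 encoding declaration, valid only on the first or second line.
ENCODING_RE = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*[-_.a-zA-Z0-9]+")


def detect_newline(text: str) -> str:
    """Pick CRLF if present, else CR, else LF (also used for one-line files)."""
    if "\r\n" in text:
        return "\r\n"
    return "\r" if "\r" in text else "\n"


def insert_pound_header(data: bytes, header_lines: List[str]) -> bytes:
    """Insert header lines after the shebang/encoding preamble, keeping BOM and newlines."""
    has_bom = data.startswith(BOM)
    text = data[len(BOM):].decode("utf-8") if has_bom else data.decode("utf-8")
    nl = detect_newline(text)
    lines = text.splitlines(keepends=True)

    # Skip the shebang and an optional encoding line so they stay first.
    idx = 1 if lines and lines[0].startswith("#!") else 0
    if idx < len(lines) and ENCODING_RE.match(lines[idx]):
        idx += 1

    preamble = "".join(lines[:idx])
    if preamble and not preamble.endswith(("\n", "\r")):
        preamble += nl  # terminate a preamble line that lacked a newline
    body = "".join(lines[idx:])

    block = "".join(line + nl for line in header_lines)
    if preamble:
        block = nl + block  # one blank line after the preamble
    if body:
        block += nl  # one blank line before the original content
    return (BOM if has_bom else b"") + (preamble + block + body).encode("utf-8")
```

For a CRLF file that starts with a shebang, the header lands on the third line, separated by blank lines on both sides, and every line ending in the result is still CRLF.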

Output behavior¶

Output format, TEXT verbosity, quiet mode, color output, and shared exit-code behavior are documented in shared options and exit codes.

Shared output controls¶

TEXT verbosity is separate from internal logging:

-v, --verbose increases TEXT output detail for check, such as per-line diagnostics and additional hints.
-q, --quiet suppresses TEXT rendering while preserving the command's exit status.
Markdown output is document-oriented and renders diagnostics and hints when present without requiring -v.
Machine-readable JSON and NDJSON output ignore TEXT-oriented verbosity and quiet controls.

Notes:

Summary mode aggregates outcomes and suppresses per-file guidance lines.
In TEXT rendering, per-line diagnostics are shown with -v and above.
Primary/headline hint selection is presentation-level guidance and is not part of the stable CLI contract; rely on exit codes and machine-readable output for automation.
The --diff option is human-readable only and rejected for JSON or NDJSON output. Machine-readable result payloads may expose reduced detail fields, but normal CLI JSON and NDJSON output does not render unified diff blocks.

Machine-readable output¶

Use --output-format json or --output-format ndjson to emit output suitable for tooling:

JSON: a single machine-readable JSON document containing meta, the effective runtime configuration snapshot, config_diagnostics, and then either results (detail mode) or summary (summary mode).
NDJSON: one machine-readable NDJSON record per line. Every record includes kind and meta, and the payload is stored under a container key that matches kind.

For the canonical schema, stable kind values, and shared conventions, see:

Note

Verbosity (-v / --verbose) affects only TEXT rendering.
Quiet mode (-q / --quiet) suppresses TEXT rendering for commands that support it.
Markdown and machine-readable output are not affected by TEXT verbosity controls.

Machine-readable output emits selected processing paths with POSIX / separators and resolved file type identities using canonical qualified identity strings when available. If a checked file is reached through a symlink, per-file result.path describes the resolved processing target rather than the symlink spelling. If selected paths are hard links to the same filesystem object, check emits one result per selected path and reports each affected path as a policy-blocked unsupported processing target. Configuration payloads also emit normalized file type filters and policy_by_type keys.

Notes:

The config payload in JSON and NDJSON is the resolved runtime configuration snapshot after per-source TOML validation, layered configuration merge, staged configuration-loading validation, and CLI override application.
Per-file result.path values are selected processing paths serialized with POSIX / separators on all platforms. This path serialization contract applies to processing result payloads; human TEXT output remains display-oriented.

JSON schema (detail mode)¶

When --summary is not set, topmark check emits a single JSON object:

{
  "meta": { /* MetaPayload */ },
  "config": { /* RuntimeConfigPayload */ },
  "config_diagnostics": { /* ConfigDiagnosticsPayload */ },
  "results": [
    { /* per-file result payload */ }
  ]
}

The per-file result payload mirrors the one emitted by topmark strip but reflects the check intent (for example, outcome.check.* fields instead of outcome.strip.*).

JSON schema (summary mode)¶

In summary mode (--summary), results is omitted and replaced by a flat summary list of rows:

{
  "meta": { /* MetaPayload */ },
  "config": { /* RuntimeConfigPayload */ },
  "config_diagnostics": { /* ConfigDiagnosticsPayload */ },
  "summary": [
    { "outcome": "unchanged", "reason": "up-to-date", "count": 30 },
    { "outcome": "would insert", "reason": "header missing, changes found", "count": 1 }
  ]
}
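
A consumer of the summary document might aggregate the rows like this. The sketch relies only on the outcome, reason, and count fields shown above; count_by_outcome is a hypothetical helper, and the meta, config, and config_diagnostics payloads are left empty in the sample.

```python
import json
from collections import Counter


def count_by_outcome(document: str) -> Counter:
    """Sum the `count` of every summary row, grouped by `outcome`."""
    payload = json.loads(document)
    totals: Counter = Counter()
    for row in payload["summary"]:
        totals[row["outcome"]] += row["count"]
    return totals


sample = json.dumps({
    "meta": {},
    "config": {},
    "config_diagnostics": {},
    "summary": [
        {"outcome": "unchanged", "reason": "up-to-date", "count": 30},
        {"outcome": "would insert", "reason": "header missing, changes found", "count": 1},
    ],
})
print(count_by_outcome(sample)["would insert"])  # 1
```

For pass/fail decisions in automation, the exit code remains the simpler signal; parsing the summary is useful mainly for reporting.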

NDJSON schema (detail vs summary)¶

NDJSON is a stream with a stable prefix followed by either per-file result records (detail mode) or per-bucket summary records (summary mode):

Prefix records:
kind="config" (effective runtime configuration snapshot)
kind="config_diagnostics" (counts-only)
zero or more kind="diagnostic" records (each with domain="config"; these may originate from TOML-source, merged-config, or runtime applicability diagnostics)
Then:
detail mode (no --summary): one kind="result" record per file
summary mode (--summary): one kind="summary" record per (outcome, reason) bucket

Example (summary mode):

{"kind":"config","meta":{...},"config":{...}}
{"kind":"config_diagnostics","meta":{...},"config_diagnostics":{"diagnostic_counts":{"info":0,"warning":0,"error":0}}}
{"kind":"summary","meta":{...},"summary":{"outcome":"would insert","reason":"header missing, changes found","count":1}}
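
Because every record stores its payload under the key named by kind, a reader can dispatch on kind without knowing the record types in advance. The following is a minimal sketch; iter_records is a hypothetical helper and the sample stream abbreviates meta and config to empty objects.

```python
import json
from typing import Any, Iterable, Iterator, Tuple


def iter_records(lines: Iterable[str]) -> Iterator[Tuple[str, Any]]:
    """Yield (kind, payload) for each non-empty NDJSON line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        kind = record["kind"]
        yield kind, record[kind]  # payload lives under the key matching `kind`


stream = [
    '{"kind":"config","meta":{},"config":{}}',
    '{"kind":"config_diagnostics","meta":{},"config_diagnostics":'
    '{"diagnostic_counts":{"info":0,"warning":0,"error":0}}}',
    '{"kind":"summary","meta":{},"summary":'
    '{"outcome":"would insert","reason":"header missing, changes found","count":1}}',
]
for kind, payload in iter_records(stream):
    if kind == "summary":
        print(payload["outcome"], payload["count"])  # would insert 1
```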

Command-specific options¶

Option	Description
`--apply`	Write changes to files (off by default).
`--diff`	Show unified diffs (human output only).
`--summary`	Show outcome counts instead of per-file details.
`-q`, `--quiet`	Suppress TEXT rendering while preserving the command's exit status.
`--files-from`	Read newline-delimited paths from file (use '-' for STDIN).
`-` (PATH)	Read one virtual file from STDIN content (requires `--stdin-filename`).
`--include`	Add paths by glob (can be used multiple times).
`--include-from`	File of patterns to include (one per line, `#` comments allowed).
`--exclude`	Exclude paths by glob (can be used multiple times).
`--exclude-from`	File of patterns to exclude.
`--include-file-types` / `-t`	Restrict to local or qualified TopMark file type identifiers.
`--exclude-file-types` / `-T`	Exclude local or qualified TopMark file type identifiers.
`--report`	Control reporting scope: actionable, noncompliant, or all.
`--header-mutation-mode`	Check-only policy override: `all`, `add-only`, or `update-only`.
`--empty-insert-mode`	Check-only policy override controlling empty-file classification.
`--strict` / `--no-strict`	Override effective configuration-loading validation strictness for this run.
`--stdin-filename`	Assumed filename when PATH is '-' (content from STDIN).

Run topmark check -h for the full list of options and help text.

Exit codes¶

topmark check uses exit code WOULD_CHANGE (3) as a stable dry-run signal when changes would be needed. Successful clean runs and successful --apply runs exit with SUCCESS (0).

Common check exit codes:

Scenario	Exit code
Clean run / successful apply	`SUCCESS (0)`
Dry-run would add or update	`WOULD_CHANGE (3)`
Missing explicit input path	`FILE_NOT_FOUND (66)`
Write/apply failure	`IO_ERROR (74)`
Permission failure	`PERMISSION_DENIED (77)`
Configuration error	`CONFIG_ERROR (78)`
Invalid CLI usage	`USAGE_ERROR (64)`

Notes:

Click parser-level usage errors (for example, unknown commands, unknown options, or invalid option values) may exit with code 2 before command logic runs.
Explicit missing literal paths are hard input errors and produce FILE_NOT_FOUND (66).
Unmatched glob patterns are soft discovery diagnostics and do not fail check.
In mixed-result runs, hard input and filesystem errors take precedence over WOULD_CHANGE (3).

See Exit codes for the complete CLI-wide exit-code contract.
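
A wrapper script might interpret the exit status as follows. The numeric values are taken from the table above; the ExitCode enum and ci_verdict function are hypothetical names for this sketch and are not part of TopMark's API.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    SUCCESS = 0
    WOULD_CHANGE = 3
    USAGE_ERROR = 64
    FILE_NOT_FOUND = 66
    IO_ERROR = 74
    PERMISSION_DENIED = 77
    CONFIG_ERROR = 78


def ci_verdict(returncode: int) -> str:
    """Translate a `topmark check` exit status into a short message."""
    if returncode == 2:
        return "usage error reported by the argument parser"
    try:
        code = ExitCode(returncode)
    except ValueError:
        return f"unexpected exit code {returncode}"
    if code is ExitCode.SUCCESS:
        return "ok"
    if code is ExitCode.WOULD_CHANGE:
        return "headers need updating"
    return f"error: {code.name}"


print(ci_verdict(3))   # headers need updating
print(ci_verdict(66))  # error: FILE_NOT_FOUND
```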

Typical workflows¶

1) Add headers to a project¶

# Start with a dry-run to see impact
topmark check src/
# Then apply
topmark check --apply src/

2) Review a change set¶

git ls-files -m -o --exclude-standard | topmark check --files-from - --diff

3) CI: summarize and fail when changes are needed¶

# Print summary only. Exit 3 signals "would change" to fail the job.
topmark check --summary

4) Run with strict config checking¶

# Fail when staged configuration-loading validation warnings are present
# (for example TOML-source, merged-config, or runtime applicability warnings)
topmark check --strict src/

Pre-commit integration¶

topmark check is the command used by the non-destructive topmark-check pre-commit hook.

The hook runs topmark check against files selected by pre-commit and follows the same resolution, filtering, policy, configuration, output, and exit-code behavior documented on this page.

For general pre-commit integration guidance, CI workflows, and repository hook configuration, see Pre-commit integration.

Related commands¶

topmark strip - remove detected TopMark headers instead of inserting or updating them.
topmark probe - explain file-type and processor resolution.
topmark config check - validate the effective runtime configuration and report diagnostics.
topmark config dump - inspect the effective runtime configuration, including normalized file type identifiers.

Troubleshooting¶

No files to process: Ensure you passed positional paths, or selected the correct STDIN mode (--files-from - for list mode, or - with --stdin-filename for content mode). Use -vv for detailed TEXT rendering; use logging options for internal debug logs.
Patterns do not match: Remember that include/exclude patterns are relative to CWD. cd into the project root before running.
Symlink path not shown in output: check processes selected processing paths. If a symlink and its target resolve to the same file, machine-readable output and generated header metadata describe the resolved target rather than the symlink spelling.
Hard-linked files are reported as unsupported: check blocks processing when multiple selected paths refer to the same filesystem object through hard links. Each affected path is reported independently; no preferred path is selected from the hard-link group.
File type filter does not match: use topmark probe to inspect resolution decisions, and prefer qualified identifiers such as topmark:python when local identifiers may be ambiguous.
Missing file error: A literal path such as fubar.py is treated as an explicit input and fails with FILE_NOT_FOUND (66) when it does not exist. Use a glob pattern when an empty match set should be non-fatal.
Unexpected placement: For pound/slash formats, check for leading banners or shebang/encoding lines. For XML/HTML, verify declaration/doctype positions.
