Configuration schema summary
This page summarizes TopMark's stable external configuration schema as consumed from topmark.toml
and [tool.topmark] in pyproject.toml.
Note
- This page is a schema summary, not a full JSON Schema.
- The ordering mirrors src/topmark/toml/topmark-example.toml.
- Keys are defined authoritatively in src/topmark/toml/keys.py.
TopMark internally maintains staged validation diagnostics, but public reporting, machine-readable output, and API surfaces expose a flattened compatibility view.
For the stable 1.x line, this flattened compatibility view is the machine-readable and API compatibility contract.
Note
Human-facing TEXT verbosity (-v) and quiet mode (--quiet) are presentation-layer concerns.
They do not affect:
- configuration schema validation
- staged diagnostics
- machine-readable output
- API surfaces
Markdown and machine-readable output always expose the full flattened compatibility view.
See also:
See Terminology and Canonical Vocabulary for the normative definitions of machine-readable output, canonical identity, applicability, and staged diagnostics terminology.
Note
Internal helper types such as PolicyOverrides and
ConfigOverrides are not part of the stable public
API surface. They are internal runtime orchestration helpers used by the CLI and public API
wrappers.
Public callers should pass plain mapping-based inputs through config=..., policy=..., and
policy_by_type=... instead of constructing these objects directly.
At the configuration-schema layer, override handling is represented as plain mapping data.
Internal typed runtime override objects are introduced later during CLI/API orchestration and are not part of the stable external configuration schema.
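As an illustration of what "plain mapping data" can look like, the sketch below builds three ordinary dictionaries. It assumes that the mappings passed through config=..., policy=..., and policy_by_type=... reuse the key names summarized on this page; the receiving call is deliberately omitted because its name is not given here.

```python
import json

# Hypothetical inputs. Key names are taken from the schema summary on this
# page; the assumption is that the public API accepts the same shapes.
config = {
    "header": {"fields": ["file", "file_relpath"]},
    "formatting": {"align_fields": True},
}
policy = {
    "header_mutation_mode": "add_only",
    "allow_reflow": False,
}
policy_by_type = {
    # Canonical qualified key; a local identifier such as "python" is also
    # described as acceptable when it is unambiguous.
    "topmark:python": {"allow_header_in_empty_files": True},
}

# Plain mappings survive a JSON round trip unchanged, which typed helper
# objects generally would not.
round_trip_ok = all(
    json.loads(json.dumps(mapping)) == mapping
    for mapping in (config, policy, policy_by_type)
)
```

Nothing in this sketch constructs PolicyOverrides or ConfigOverrides; those stay internal.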
File type identifiers in TOML configuration may use either:
- local identifiers such as python
- canonical qualified file type identities such as topmark:python
TopMark normalizes identifiers to canonical qualified keys during configuration normalization before resolver, filtering, policy, and binding evaluation.
Local identifiers are accepted only when unambiguous in the effective composed registry.
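The normalization rule can be sketched as follows. This is a minimal illustration, not TopMark's implementation: normalize_file_type_id, the set-based registry, and the acme namespace are hypothetical stand-ins for the effective composed registry.

```python
from __future__ import annotations


def normalize_file_type_id(identifier: str, registry: set[str]) -> str:
    """Return the canonical qualified key for ``identifier``.

    ``registry`` is a hypothetical stand-in for the effective composed
    registry: a set of canonical ``namespace:name`` keys.
    """
    if ":" in identifier:
        # Already qualified: accept it only if the registry knows it.
        if identifier not in registry:
            raise ValueError(f"unknown file type: {identifier!r}")
        return identifier

    # Local identifier: collect every qualified key sharing this local name.
    matches = sorted(
        key for key in registry if key.split(":", 1)[1] == identifier
    )
    if not matches:
        raise ValueError(f"unknown file type: {identifier!r}")
    if len(matches) > 1:
        # No silent resolution: the caller must use the qualified form.
        raise ValueError(
            f"ambiguous file type {identifier!r}; use one of {matches}"
        )
    return matches[0]


# Example registry; "acme:toml" is an invented entry used to force ambiguity.
registry = {"topmark:python", "topmark:toml", "acme:toml"}
python_key = normalize_file_type_id("python", registry)
qualified_key = normalize_file_type_id("topmark:toml", registry)
```

Here python resolves to topmark:python, while the bare toml would be rejected because two registry entries share that local name.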
strict is a TOML-source-local config-loading option, not a layered configuration field. It is
resolved from [config] / [tool.topmark.config] during TOML source resolution and applied after
layered configuration merging.
Its effective value governs staged config-loading validation across TOML-source, merged-config, and runtime-applicability diagnostics.
This distinction matters for topmark config dump --show-layers:
- the human-facing layered TOML export exposes source-local TOML fragments under [[layers]].toml.*
- the machine-readable layered export exposes the same source-local TOML fragments under config_provenance.layers[].toml
For the canonical user-facing discovery, precedence, path-resolution, and staged validation contract, see Configuration discovery, precedence, and policy.
Schema validation model
TopMark performs whole-source TOML schema validation before any layered configuration is deserialized:
- unknown top-level sections (e.g. [foo]) are reported as TOML validation issues
- missing known sections are reported as INFO diagnostics
- unknown keys within known sections (e.g. [config].bogus) are also reported
- validation is source-local and happens per TOML file during loading
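The checks listed above can be sketched for a single, already parsed TOML source. This is an illustration under stated assumptions: KNOWN_SECTIONS is an abbreviated, hypothetical schema table, validate_toml_source is not a TopMark function, and the WARNING level for unknown sections and keys is assumed (this page only states INFO for missing sections and warning-and-ignore for malformed ones).

```python
from __future__ import annotations

# Hypothetical, abbreviated schema: section name -> allowed keys.
KNOWN_SECTIONS: dict[str, set[str]] = {
    "config": {"root", "strict"},
    "writer": {"strategy"},
    "formatting": {"align_fields"},
}


def validate_toml_source(data: dict[str, object]) -> list[tuple[str, str]]:
    """Return (level, message) diagnostics for one parsed TOML source."""
    diagnostics: list[tuple[str, str]] = []

    # Unknown top-level sections.
    for section in data:
        if section not in KNOWN_SECTIONS:
            diagnostics.append(("WARNING", f"unknown section [{section}]"))

    for section, allowed in KNOWN_SECTIONS.items():
        if section not in data:
            # Absent sections are informational only.
            diagnostics.append(("INFO", f"missing section [{section}]"))
            continue
        table = data[section]
        if not isinstance(table, dict):
            # Present but malformed: warn and ignore the section.
            diagnostics.append(
                ("WARNING", f"malformed section [{section}] ignored")
            )
            continue
        # Unknown keys inside a known section.
        for key in table:
            if key not in allowed:
                diagnostics.append(
                    ("WARNING", f"unknown key [{section}].{key}")
                )
    return diagnostics


diags = validate_toml_source(
    {"foo": {}, "config": {"bogus": 1, "strict": True}, "writer": "oops"}
)
```

Because the function takes one source at a time, each TOML file yields its own diagnostics, matching the source-local model.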
After this step, only the layered configuration fragment is passed to
MutableConfig for parsing and normalization before
freezing into the immutable layered configuration snapshot.
At this boundary, diagnostics remain staged; flattening into the public compatibility view is performed only at reporting, exception, machine-readable output, and API boundaries.
This reporting boundary is independent of human presentation controls: TEXT verbosity (-v) and
quiet mode (--quiet) only influence how diagnostics are rendered in console output, not how they
are produced, staged, or exposed through machine/API interfaces.
For the stable 1.x line, staged validation remains internal, while public reporting and machine/API surfaces expose only the flattened compatibility diagnostics contract.
At the TOML layer, malformed known sections are handled as warning-and-ignore diagnostics, while missing known sections are emitted as INFO diagnostics.
This allows callers to distinguish absent sections from malformed-present sections before staged config-validation semantics are applied.
These TOML-source diagnostics then participate together with merged-config and runtime-applicability diagnostics during staged config-loading validation.
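One way to picture the relationship between staged diagnostics, the flattened compatibility view, and strict is the sketch below. The stage labels, the Diagnostic class, and both functions are hypothetical; only the three stage concepts and the rule that warnings become failures under strict checking come from this page.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical stage labels, in the order the page lists the stages.
STAGES = ("toml_source", "merged_config", "runtime_applicability")


@dataclass(frozen=True)
class Diagnostic:
    level: str  # "INFO", "WARNING", or "ERROR"
    message: str


def flatten(staged: dict[str, list[Diagnostic]]) -> list[Diagnostic]:
    """Collapse per-stage diagnostics into one list, in stage order."""
    flat: list[Diagnostic] = []
    for stage in STAGES:
        flat.extend(staged.get(stage, []))
    return flat


def is_failure(staged: dict[str, list[Diagnostic]], *, strict: bool) -> bool:
    """Errors always fail; warnings fail only when strict is effective."""
    for diag in flatten(staged):
        if diag.level == "ERROR":
            return True
        if strict and diag.level == "WARNING":
            return True
    return False


staged = {
    "runtime_applicability": [Diagnostic("WARNING", "ambiguous identifier")],
    "toml_source": [Diagnostic("INFO", "missing section [writer]")],
}
flat_messages = [d.message for d in flatten(staged)]
```

In this sketch the stage boundaries exist only before flatten is called, which mirrors the idea that external consumers see the flattened view alone.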
Note
- TOML schema validation is handled in topmark.toml
- file type identifier normalization and ambiguity evaluation are performed during configuration normalization and runtime-applicability validation
- configuration value/type validation is handled in topmark.config as staged validation logs (merged-config and runtime-applicability stages)
- layered config deserialization (mutable_config_from_layered_toml_table) assumes schema validation already happened, but still performs defensive parsing for API and test inputs
The following summary uses a YAML-like notation for readability and is not itself a machine-readable schema definition.
topmark:
# In layered provenance exports, source-local TOML fragments preserve their
# original TOML grouping, including `[config]` and `[writer]`, rather than
# collapsing everything into the final flattened Config payload.
config:
type: table
description: TOML-source-local options resolved during TOML loading, not part of layered
configuration merging.
root:
type: bool
default: false
description: Stop upward config discovery when set in a discovered config.
strict:
type: bool
default: false
description: Source-local strictness preference applied to staged config-loading validation;
warnings become failures when effective strict config checking is enabled across TOML-source,
merged-config, and runtime-applicability diagnostics.
header:
fields:
type: list[str]
default: ["file", "file_relpath"]
description: Header metadata fields to render (order preserved).
fields:
type: table
default: {}
description: User-defined header field values (e.g., project/license/copyright).
formatting:
align_fields:
type: bool
default: true
description: Align header field labels/colons.
relative_to:
type: path
default: "."
description: Affects header metadata (file_relpath), not discovery.
writer:
type: table
description: TOML-source-local writer options (not part of layered configuration state).
strategy:
type: str
default: "atomic"
enum: ["atomic", "inplace"]
description: How file writes are performed when writing back to files.
policy:
header_mutation_mode:
type: str
default: "all"
enum: ["all", "add_only", "update_only"]
description: Controls check mutation intent: insert and update headers, insert missing headers only, or update existing headers only. Safety gates still take precedence.
allow_header_in_empty_files:
type: bool
default: false
description: Allow inserting headers into files considered empty under the effective empty insertion policy.
empty_insert_mode:
type: str
default: "logical_empty"
enum: ["bytes_empty", "logical_empty", "whitespace_empty"]
description: Control how TopMark classifies files as empty for header insertion.
render_empty_header_when_no_fields:
type: bool
default: false
description: Allow inserting an empty header when no header fields are configured.
allow_reflow:
type: bool
default: false
description: Allow content reflow during header insertion or update.
allow_content_probe:
type: bool
default: true
description: Allow file-type detection to inspect file contents when needed.
policy_by_type:
type: table
default: {}
description: Per-file-type policy overrides, keyed by local or canonical qualified file type identifiers.
additionalProperties:
# Examples:
#
# [policy_by_type.python]
# [policy_by_type."topmark:python"]
#
# Identifiers normalize to canonical qualified keys.
header_mutation_mode:
type: str
optional: true
enum: ["all", "add_only", "update_only"]
description: Per-file-type override for check mutation intent.
allow_header_in_empty_files:
type: bool
optional: true
empty_insert_mode:
type: str
optional: true
enum: ["bytes_empty", "logical_empty", "whitespace_empty"]
render_empty_header_when_no_fields:
type: bool
optional: true
allow_reflow:
type: bool
optional: true
allow_content_probe:
type: bool
optional: true
files:
# Filtering order:
# 1) Path filters (include/exclude patterns + *_from + files_from)
# 2) File type filters (include_file_types / exclude_file_types)
# 3) Eligibility (supported vs unsupported)
include_patterns:
type: list[str]
default: []
description: Glob patterns to include (relative to declaring config source).
exclude_patterns:
type: list[str]
default: []
description: Glob patterns to exclude (relative to declaring config source).
include_from:
type: list[path]
default: []
description: Files containing include patterns (one per line; comments allowed).
exclude_from:
type: list[path]
default: []
description: Files containing exclude patterns (one per line; comments allowed).
files_from:
type: list[path]
default: []
description: Files containing explicit file lists (one path per line; comments allowed).
include_file_types:
type: list[str]
default: []
description: Restrict processing to these local or canonical qualified file type identifiers.
exclude_file_types:
type: list[str]
default: []
description: Exclude these local or canonical qualified file type identifiers.
files:
type: list[path]
default: []
description: Input paths (files/directories) to scan; commonly provided via CLI.
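The filtering order noted under files can be sketched as a three-stage pass. This is a rough illustration, not TopMark's resolver: select_files and detect_by_suffix are hypothetical, glob matching uses Python's fnmatch as a stand-in, and treating an empty include list as "include everything" is an assumption.

```python
from __future__ import annotations

from fnmatch import fnmatch
from typing import Callable, Optional


def select_files(
    candidates: list[str],
    *,
    include_patterns: list[str],
    exclude_patterns: list[str],
    include_file_types: list[str],
    exclude_file_types: list[str],
    detect_type: Callable[[str], Optional[str]],
) -> list[str]:
    """Apply path filters, then file type filters, then eligibility."""
    selected: list[str] = []
    for path in candidates:
        # 1) Path filters (assumption: empty include list means "all").
        if include_patterns and not any(
            fnmatch(path, pattern) for pattern in include_patterns
        ):
            continue
        if any(fnmatch(path, pattern) for pattern in exclude_patterns):
            continue
        # 2) File type filters on canonical qualified keys.
        file_type = detect_type(path)
        if include_file_types and file_type not in include_file_types:
            continue
        if file_type in exclude_file_types:
            continue
        # 3) Eligibility: unsupported files have no known type.
        if file_type is None:
            continue
        selected.append(path)
    return selected


def detect_by_suffix(path: str) -> Optional[str]:
    """Hypothetical detector mapping suffixes to canonical keys."""
    suffixes = {".py": "topmark:python", ".toml": "topmark:toml"}
    for suffix, key in suffixes.items():
        if path.endswith(suffix):
            return key
    return None


result = select_files(
    ["src/a.py", "src/b.toml", "docs/c.py", "data.bin"],
    include_patterns=[],
    exclude_patterns=["docs/*"],
    include_file_types=[],
    exclude_file_types=["topmark:toml"],
    detect_type=detect_by_suffix,
)
```

In this run only src/a.py survives: docs/c.py is dropped by a path filter, src/b.toml by a file type filter, and data.bin by the eligibility check.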
At runtime, file type identifiers normalize to canonical qualified keys before:
- resolution and filtering
- policy lookup
- processor binding lookup
- probe evaluation
- API overlay application
This normalization behavior is shared consistently across:
- TOML configuration
- CLI options
- API overlays
- effective runtime policy resolution
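Because keys are canonical by the time policy lookup happens, per-type overrides can be found with a plain dictionary lookup. The sketch below assumes that a policy_by_type entry overlays the global policy key by key; effective_policy is a hypothetical helper, not a TopMark function.

```python
from __future__ import annotations


def effective_policy(
    policy: dict[str, object],
    policy_by_type: dict[str, dict[str, object]],
    file_type: str,
) -> dict[str, object]:
    """Overlay per-type overrides on the global policy.

    ``file_type`` and the keys of ``policy_by_type`` are assumed to be
    canonical qualified keys already (for example "topmark:python").
    """
    merged = dict(policy)  # copy so the global policy is not mutated
    merged.update(policy_by_type.get(file_type, {}))
    return merged


global_policy = {"header_mutation_mode": "all", "allow_reflow": False}
overrides = {"topmark:python": {"header_mutation_mode": "add_only"}}

python_policy = effective_policy(global_policy, overrides, "topmark:python")
toml_policy = effective_policy(global_policy, overrides, "topmark:toml")
```

Keys that a per-type table leaves out fall back to the global value, which is why every policy_by_type field is marked optional in the summary.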
Identifier ambiguity
Local identifiers such as python are accepted only when they remain unambiguous in the effective composed registry.
If multiple file types share the same local identifier, callers must use the canonical qualified form, such as topmark:python.
Malformed identifiers participate in staged config-loading validation diagnostics.
Policy token notes
header_mutation_mode
header_mutation_mode uses TOML/API tokens with underscores:
- all: insert missing headers and update existing headers
- add_only: insert missing headers only; existing headers are not updated
- update_only: update existing headers only; missing headers are not inserted
The equivalent CLI values use hyphens for the non-default modes: add-only and update-only.
This policy affects only the check pipeline.
Within that pipeline, it affects dry-run reporting, apply behavior, API result views, and outcome bucketing.
It does not apply to strip or probe,
and safety gates still take precedence: malformed headers, unreadable files, unsupported files,
blocked filesystem states, and other non-mutable conditions are not made mutable by this policy.
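The token semantics above can be condensed into a small decision function. This is a sketch only: cli_value_to_token and check_may_mutate are hypothetical names, and the single gates_pass flag stands in for all safety gates.

```python
from __future__ import annotations

VALID_MODES = ("all", "add_only", "update_only")


def cli_value_to_token(value: str) -> str:
    """Map a CLI spelling such as "add-only" to its TOML/API token.

    Underscore tokens pass through unchanged.
    """
    token = value.replace("-", "_")
    if token not in VALID_MODES:
        raise ValueError(f"unknown header mutation mode: {value!r}")
    return token


def check_may_mutate(
    mode: str, *, header_exists: bool, gates_pass: bool = True
) -> bool:
    """Decide whether check may insert or update a header."""
    if not gates_pass:
        # Safety gates take precedence over every mode.
        return False
    if mode == "all":
        return True
    if mode == "add_only":
        return not header_exists
    if mode == "update_only":
        return header_exists
    raise ValueError(f"unknown header mutation mode: {mode!r}")


token = cli_value_to_token("update-only")
may_insert = check_may_mutate(token, header_exists=False)
may_update = check_may_mutate(token, header_exists=True)
```

With update_only, a file without a header is left alone while an existing header may still be refreshed, and a failed gate blocks both cases regardless of mode.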
Non-goals
The configuration schema intentionally does not support:
- fuzzy matching for file type identifiers
- implicit namespace fallback
- automatic alias expansion
- silent ambiguity resolution
- plugin-specific schema mutation during config loading
Identifier handling intentionally remains explicit, deterministic, and ambiguity-aware.