Common filtering recipes¶
Filtering controls determine stable runtime behavior such as:
- which paths participate in discovery
- which file types are eligible for processing
- how explicit inputs participate in semantic runtime outcomes
- how probe diagnostics are reported
TopMark determines which files to process using a combination of path-based filters and file type filters.
Note
The canonical vocabulary used throughout the documentation is defined in Terminology and Canonical Vocabulary.
Filtering overview¶
Filtering and discovery semantics are shared consistently across:
- topmark check
- topmark strip
- topmark probe
- TOML configuration
- API overlays
- runtime-resolution and probe filtering
TopMark applies filtering in a deterministic order:
1. Path-based discovery and filtering
2. File-type filtering
3. Runtime applicability evaluation
4. Runtime processor resolution
Exclude rules take precedence over include rules.
For canonical file-type identifier semantics, see File-type filtering. For layered configuration behavior, see Configuration.
Note
For topmark probe, paths excluded during step 1 or 2 may still be reported
as filtered semantic outcomes when they were explicitly requested inputs.
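The ordering and precedence rules above can be modeled with a small sketch. This is a conceptual illustration, not TopMark's implementation: the function names are hypothetical, Python's fnmatch stands in for TopMark's glob matching (whose exact semantics may differ), and treating an empty include list as "admit everything" is an assumption.

```python
from fnmatch import fnmatchcase


def path_allowed(path, include, exclude):
    """Return True if a path survives path-based filtering.

    Exclude rules are checked first, so they always win over include rules.
    Assumption: an empty include list admits every path.
    """
    if any(fnmatchcase(path, pat) for pat in exclude):
        return False
    return not include or any(fnmatchcase(path, pat) for pat in include)


def select(paths, include, exclude, file_type_of, include_types, exclude_types):
    """Apply path filtering first, then file-type filtering on the survivors."""
    # Stage 1: path-based filtering.
    survivors = [p for p in paths if path_allowed(p, include, exclude)]
    # Stage 2: file-type filtering consumes the finalized stage 1 result.
    selected = []
    for p in survivors:
        file_type = file_type_of(p)
        if file_type in exclude_types:
            continue
        if include_types and file_type not in include_types:
            continue
        selected.append(p)
    return selected


def by_suffix(path):
    """Hypothetical file-type lookup keyed on the file suffix."""
    if path.endswith(".py"):
        return "python"
    if path.endswith(".md"):
        return "markdown"
    return None


paths = ["src/a.py", "src/gen/b.py", "docs/readme.md", "src/c.txt"]
result = select(
    paths,
    include=["src/*", "docs/*"],
    exclude=["src/gen/*"],
    file_type_of=by_suffix,
    include_types={"python", "markdown"},
    exclude_types=set(),
)
print(result)  # ['src/a.py', 'docs/readme.md']
```

Here src/gen/b.py matches both an include and an exclude pattern and is dropped, while src/c.txt passes path filtering but is removed by the file-type stage.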
Runtime filtering boundaries¶
TopMark intentionally separates:
- path discovery
- path filtering
- file-type filtering
- runtime applicability evaluation
- runtime probing and processor resolution
Each stage consumes the finalized results of the previous stage.
This layered filtering model keeps runtime behavior deterministic while preserving stable probe diagnostics and machine-readable filtering semantics.
Missing vs unmatched inputs¶
TopMark distinguishes between explicit literal paths and glob patterns:
- Explicit missing literal paths (e.g., fubar.py) are treated as hard input errors and result in FILE_NOT_FOUND (66).
- Unmatched glob patterns (e.g., missing/**/*.py) are treated as soft runtime-discovery diagnostics and do not cause a failure for processing commands (check, strip), which exit with SUCCESS (0).
This distinction ensures that typos in explicit inputs are surfaced, while flexible patterns that match nothing do not cause runtime processing-command failures.
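As a rough illustration of this distinction for processing commands, the sketch below classifies each input as a literal path or a glob pattern and derives an exit code. The helper names and the glob-detection heuristic are assumptions made for the example; only the exit-code values come from the text above.

```python
import glob
import os

SUCCESS = 0
FILE_NOT_FOUND = 66

GLOB_CHARS = set("*?[")


def is_glob(arg):
    """Heuristic (assumption): an input is a pattern if it has glob metacharacters."""
    return any(ch in GLOB_CHARS for ch in arg)


def resolve_inputs(args, root="."):
    """Resolve inputs relative to root; return (files, diagnostics, exit_code)."""
    files, diagnostics = [], []
    exit_code = SUCCESS
    for arg in args:
        full = os.path.join(root, arg)
        if is_glob(arg):
            matches = sorted(glob.glob(full, recursive=True))
            if matches:
                files.extend(matches)
            else:
                # Soft diagnostic: an unmatched pattern does not fail the run.
                diagnostics.append(f"no match for pattern: {arg}")
        elif os.path.exists(full):
            files.append(full)
        else:
            # Hard input error: a missing literal path fails the run.
            diagnostics.append(f"missing path: {arg}")
            exit_code = FILE_NOT_FOUND
    return files, diagnostics, exit_code
```

Calling resolve_inputs(["fubar.py"]) in a directory without that file yields exit code 66, whereas resolve_inputs(["missing/**/*.py"]) yields 0 with one diagnostic.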
Path-based filtering¶
TopMark supports the following path-based filtering controls:
- --include, --exclude: Include or exclude glob patterns.
- --include-from, --exclude-from: Load patterns from files (one per line).
- --files-from: Provide an explicit list of files to process.
Stable path-filtering semantics:
- Positional arguments are resolved relative to the current working directory (CWD), Black-style.
- Patterns in --include, --exclude, and files referenced by --include-from / --exclude-from are also resolved relative to CWD.
- Absolute patterns are not supported.
- Exclude rules take precedence over include rules.
- Path-based filtering occurs before file-type filtering.
STDIN support¶
File-processing commands support two STDIN modes when supplying file lists or content:
- List mode: provide newline-delimited paths or patterns via:
  - --files-from -
  - --include-from -
  - --exclude-from -
- Content mode: process a single virtual runtime file from STDIN content by passing - as the sole PATH together with --stdin-filename NAME
See shared input modes for the full STDIN contract,
including why TopMark does not provide a --stdin option flag.
Interaction with topmark probe¶
The topmark probe command uses the same runtime filtering pipeline and
discovery semantics described above.
This includes:
- path filtering
- file-type filtering
- canonical file-type identifier normalization and resolution
- ambiguity handling
However, unlike processing commands (check, strip),
probe also reports explicit inputs that were filtered out before runtime
file-type probing.
Additionally, probe treats unmatched glob patterns as filtered semantic
outcomes rather than silent runtime no-ops. As a result:
- Unmatched glob patterns are reported as filtered probe results (e.g., filtered: excluded_by_discovery_filter).
- The command exits with UNSUPPORTED_FILE_TYPE (69), reflecting incomplete runtime semantic resolution.
This differs from processing commands, which treat unmatched patterns as non-fatal diagnostics.
probe is read-only and diagnostic-only. It shares discovery and filtering
behavior with check and strip, but rejects mutation,
diff, reporting, and header-generation options that do not apply.
For example, when a path is excluded via --exclude or exclude_patterns,
topmark probe will still show it in the output as a filtered result.
In machine-readable JSON and NDJSON output, these are represented as structured probe results with:
{
  "status": "filtered",
  "reason": "excluded_by_path_filter",
  "selected_file_type": null,
  "selected_processor": null,
  "candidates": []
}
Filtered probe results may use one of the following reasons:
- excluded_by_path_filter: excluded by path-based include/exclude rules
- excluded_by_file_type_filter: excluded by file-type include/exclude rules
- excluded_by_discovery_filter: excluded before runtime probing, but exact category not identified
- no_candidates: no file-type candidates were found (e.g., unsupported extension)
Only explicitly requested runtime inputs (CLI paths or --files-from) are reported this way. Files
excluded implicitly during recursive discovery are not enumerated.
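Machine-readable probe output can be summarized with a few lines of Python. The sketch below tallies filtered results by reason from NDJSON text. It assumes each line is a JSON object carrying the status and reason fields shown above; the real records may be wrapped in an envelope or carry additional fields, and the non-filtered status value in the sample is a placeholder, not a documented TopMark status.

```python
import json
from collections import Counter


def tally_filtered(ndjson_text):
    """Count filtered probe results per reason, ignoring all other records."""
    counts = Counter()
    for line in ndjson_text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        if record.get("status") == "filtered":
            counts[record.get("reason", "unknown")] += 1
    return counts


# Sample input built from the fields documented above.
# "placeholder-status" is NOT a real TopMark status; it stands in for any
# record that is not a filtered result.
sample = "\n".join(
    [
        '{"status": "filtered", "reason": "excluded_by_path_filter"}',
        '{"status": "filtered", "reason": "no_candidates"}',
        '{"status": "filtered", "reason": "excluded_by_path_filter"}',
        '{"status": "placeholder-status"}',
    ]
)

summary = tally_filtered(sample)
print(summary["excluded_by_path_filter"])  # 2
print(summary["no_candidates"])  # 1
```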
Filtering recipes¶
Recipe: Process only Python and Markdown¶
CLI:
Equivalent canonical form:
TOML:
Recipe: Exclude generated/virtualenv folders¶
TOML:
[files]
exclude_patterns = [
  ".venv/**",
  "**/__pycache__/**",
  "**/.mypy_cache/**",
  "**/.pytest_cache/**",
  "dist/**",
  "build/**",
]
Recipe: Include only src/ and tests/¶
TOML:
Recipe: Use include/exclude pattern files (portable across repos)¶
These files may also be provided via STDIN by using - as the file path.
Example include.txt:
Example exclude.txt:
Recipe: Exclude a specific file type after path filtering¶
Equivalent canonical form:
Recipe: Process only an explicit file list (from Git)¶
Generate a file list:
Then:
You can also stream the file list via STDIN:
Recipe: Show only actionable files (would change)¶
Recipe: Include unsupported files in reporting¶
File-type filtering¶
TopMark supports file-type include/exclude filtering via:
- --include-file-types / -t
- --exclude-file-types / -T
- include_file_types
- exclude_file_types
File-type filters are evaluated after path-based filtering.
TopMark accepts file type identifiers in local form, such as python, or qualified form, such as
topmark:python.
Internally, TopMark normalizes identifiers to canonical qualified file type identities before filtering, runtime resolution, policy evaluation, diagnostics, and registry lookup.
Plugins and integrations may declare file types in their own namespace, such as acme:python. This
allows independent ecosystems to define custom file types and register independent runtime header
processors without colliding with built-in TopMark identifiers.
Local identifiers are accepted only when they are unambiguous. If more than one registered file type has the same local identifier, the local form is considered ambiguous and TopMark requires the qualified form.
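The normalization rule can be sketched as follows. This is a minimal model under stated assumptions, not TopMark's registry API: the registry is represented as a plain set of qualified identifiers, and the function and exception names are hypothetical.

```python
class AmbiguousFileTypeError(ValueError):
    """Raised when a local identifier matches more than one qualified identity."""


def resolve_identifier(identifier, registry):
    """Normalize a local or qualified identifier to its qualified form.

    registry is a collection of qualified identifiers such as "topmark:python".
    """
    if ":" in identifier:
        # Qualified form: accept only if it is registered as-is.
        if identifier in registry:
            return identifier
        raise KeyError(f"unknown file type: {identifier}")
    # Local form: collect every qualified identity with this local name.
    matches = sorted(q for q in registry if q.split(":", 1)[1] == identifier)
    if not matches:
        raise KeyError(f"unknown file type: {identifier}")
    if len(matches) > 1:
        raise AmbiguousFileTypeError(
            f"{identifier!r} is ambiguous; use one of: {', '.join(matches)}"
        )
    return matches[0]


registry = {"topmark:python", "topmark:markdown", "acme:python"}
print(resolve_identifier("markdown", registry))  # topmark:markdown
print(resolve_identifier("acme:python", registry))  # acme:python
```

With this registry, resolve_identifier("python", registry) raises AmbiguousFileTypeError because both topmark:python and acme:python share the local identifier.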
Exit-code interaction¶
Filtering decisions can influence exit codes indirectly:
- Missing explicit inputs → FILE_NOT_FOUND (66)
- Unmatched glob patterns → no failure for check / strip (SUCCESS (0)), or UNSUPPORTED_FILE_TYPE (69) in probe
Missing explicit inputs take precedence over semantic runtime probe outcomes.
When multiple conditions occur, TopMark applies a deterministic exit-code priority model (see Exit Codes documentation), where hard input and filesystem errors take precedence.
Invalid CLI usage (for example, unsupported options or inappropriate STDIN modes) is reported as a usage error and takes precedence over filtering outcomes.
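The two filtering-related conditions listed above can be combined into a small decision function. This sketch covers only those conditions and is not the full priority model described in the Exit Codes documentation. Usage errors are left out because their exit code is not stated here, and the function name and parameters are hypothetical.

```python
SUCCESS = 0
FILE_NOT_FOUND = 66
UNSUPPORTED_FILE_TYPE = 69


def filtering_exit_code(command, missing_literals, unmatched_globs):
    """Pick an exit code from filtering outcomes alone.

    command is one of "check", "strip", or "probe".
    """
    # Hard input errors take precedence over semantic probe outcomes.
    if missing_literals:
        return FILE_NOT_FOUND
    # Only probe treats unmatched patterns as incomplete resolution.
    if command == "probe" and unmatched_globs:
        return UNSUPPORTED_FILE_TYPE
    return SUCCESS
```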
Notes on configuration strictness¶
Filtering determines which runtime files participate in processing, while staged config-loading validation determines whether a run is allowed to proceed.
Note
[config].strict is a TOML-source-local strictness preference controlling staged
configuration-loading validation for the current TOML source.
Effective strictness is evaluated across:
- TOML-source diagnostics;
- merged-config diagnostics;
- runtime applicability diagnostics.
strict is resolved during TOML loading and does not become a layered configuration field.
Effective strictness is controlled by:
- CLI override (--strict / --no-strict)
- TOML setting (strict)
- default non-strict behavior
When strict config checking is enabled, configuration-loading validation warnings are treated as errors and may cause the command to fail before processing files.
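Read as a precedence chain, the three controls listed above resolve roughly as in the sketch below. It assumes that the list order is the precedence order (CLI override first, then the TOML setting, then the default); the function name is hypothetical, and None stands for "not specified".

```python
from typing import Optional


def effective_strict(cli_override: Optional[bool], toml_strict: Optional[bool]) -> bool:
    """Resolve effective strictness from the CLI override and the TOML setting."""
    if cli_override is not None:
        # --strict / --no-strict wins whenever it is given.
        return cli_override
    if toml_strict is not None:
        # Otherwise fall back to strict from the TOML source.
        return toml_strict
    # Default: non-strict behavior.
    return False
```

For example, effective_strict(False, True) returns False because --no-strict overrides a TOML source that sets strict to true.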