topmark.config.model

Configuration model and merge policy.

This module defines
  • FrozenConfig: an immutable layered configuration snapshot used by processing steps.
  • MutableConfig: a mutable builder used during discovery/merge; it can be frozen into an immutable FrozenConfig and thawed back for edits.
Scope
Immutability
Path semantics
  • Path-to-file options declared in config are normalized against that config file's directory.
  • CLI path-to-file options are normalized against the invocation CWD.
  • relative_to influences header metadata (e.g., file_relpath) only, not glob expansion.
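The two normalization rules above can be illustrated with a minimal sketch. The helper `normalize_declared_path`, the file name `topmark.toml`, and the `/repo` paths are hypothetical and chosen only for illustration; TopMark's actual normalization code is not shown on this page.

```python
from pathlib import PurePosixPath


def normalize_declared_path(raw: str, base_dir: PurePosixPath) -> PurePosixPath:
    """Anchor a relative path-to-file option at its declaring source's directory."""
    path = PurePosixPath(raw)
    # Absolute paths are kept as-is; relative ones are joined onto the base.
    return path if path.is_absolute() else base_dir / path


# Declared in a config file: the base is that config file's directory.
config_file = PurePosixPath("/repo/sub/topmark.toml")  # hypothetical location
from_config = normalize_declared_path("patterns.txt", config_file.parent)

# Passed on the command line: the base is the invocation CWD (assumed /repo).
from_cli = normalize_declared_path("patterns.txt", PurePosixPath("/repo"))

print(from_config)  # /repo/sub/patterns.txt
print(from_cli)     # /repo/patterns.txt
```

The same relative string therefore resolves to different files depending on where it was declared, which is why each source keeps its own base directory.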
Testing guidance
  • Unit-test merge behavior with synthetic builders (no I/O).
  • Exercise TOML/discovery paths in loader/discovery tests.
Validation semantics
  • Config validity is evaluated across staged config-loading validation logs: TOML-source diagnostics, merged-config diagnostics, and runtime-applicability diagnostics.
  • In non-strict mode, validation fails only when at least one stage contains an error diagnostic.
  • In strict mode, validation fails when any stage contains either a warning or an error diagnostic.
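As a rough illustration of these rules, the sketch below models each validation stage with a hypothetical `StageLog` stand-in and a hypothetical `logs_are_valid` function. It is not the module's private `_is_validation_logs_valid` helper, whose implementation is not shown here.

```python
from dataclasses import dataclass, field


@dataclass
class StageLog:
    """Hypothetical stand-in for one validation stage's diagnostics."""

    errors: list[str] = field(default_factory=list)
    warnings: list[str] = field(default_factory=list)


def logs_are_valid(stages: list[StageLog], *, strict: bool) -> bool:
    """Apply the non-strict and strict rules to every stage."""
    for stage in stages:
        if stage.errors:
            return False  # an error fails validation in both modes
        if strict and stage.warnings:
            return False  # warnings fail validation only in strict mode
    return True


toml_source = StageLog()
merged_config = StageLog(warnings=["unknown key ignored"])
runtime_applicability = StageLog()
stages = [toml_source, merged_config, runtime_applicability]

print(logs_are_valid(stages, strict=False))  # True
print(logs_are_valid(stages, strict=True))   # False
```

Note that the `is_valid` methods documented below pass `strict=bool(strict)`, so `strict=None` behaves like non-strict mode.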

FrozenConfig dataclass

FrozenConfig(
    *,
    policy,
    policy_by_type,
    config_files,
    header_fields,
    field_values,
    align_fields,
    relative_to_raw,
    relative_to,
    files,
    include_from,
    exclude_from,
    files_from,
    include_pattern_groups,
    exclude_pattern_groups,
    include_file_types,
    exclude_file_types,
    validation_logs,
)

Immutable layered configuration for TopMark.

This snapshot is produced by MutableConfig.freeze() after merging defaults, project files, extra config files, and config-like API overrides. Collections are immutable (tuple/frozenset) to prevent accidental mutation during processing. Use FrozenConfig.thaw() to obtain a mutable builder for edits, and MutableConfig.freeze() to return to an immutable layered snapshot.

Layered merging with clear precedence is provided by the config-resolution helpers in topmark.config.resolution.

Attributes:

Name Type Description
policy FrozenPolicy

Global, resolved, immutable runtime policy (plain booleans), applied after discovery.

policy_by_type Mapping[str, FrozenPolicy]

Per-file-type resolved policy overrides (plain booleans), applied after discovery.

config_files tuple[Path | SyntheticConfigSource, ...]

List of paths or identifiers for config sources used.

header_fields tuple[str, ...]

List of header fields from the [header] section.

field_values Mapping[str, str]

Mapping of field names to their string values from [fields].

align_fields bool | None

Whether to align fields, from [formatting].

relative_to_raw str | None

Original string from config, API or CLI.

relative_to Path | None

Base path used only for header metadata (e.g., file_relpath). Note: Glob expansion and filtering are resolved relative to their declaring source (config file dir or CWD for CLI), not relative_to.

files tuple[str, ...]

List of files to process.

include_from tuple[PatternSource, ...]

Files containing include patterns.

exclude_from tuple[PatternSource, ...]

Files containing exclude patterns.

files_from tuple[PatternSource, ...]

Paths to files that list newline-delimited candidate file paths to add before filtering.

include_pattern_groups tuple[PatternGroup, ...]

Glob patterns to include.

exclude_pattern_groups tuple[PatternGroup, ...]

Glob patterns to exclude.

include_file_types frozenset[str]

Whitelist of file type identifiers to restrict file discovery.

exclude_file_types frozenset[str]

Blacklist of file type identifiers to exclude from file discovery.

validation_logs FrozenValidationLogs

Stage-aware diagnostics collected during config loading and preflight validation.

Policy resolution

is_valid

is_valid(*, strict=None)

Return whether this config is valid.

Validity follows the staged config-loading validation semantics described in this module.

Here, strict is the effective resolved strictness used for config/preflight validation. Callers typically derive it from strict after applying TOML resolution and any CLI/API override precedence.

A similar helper exists on MutableConfig.

Parameters:

Name Type Description Default
strict bool | None

Effective strictness for config/preflight validation.

None

Returns:

Type Description
bool

True if the config is valid, else False.

Source code in src/topmark/config/model.py
def is_valid(
    self,
    *,
    strict: bool | None = None,
) -> bool:
    """Return whether this config is valid.

    Validity follows the staged config-loading validation semantics
    described in this module.

    Here, `strict` is the effective resolved strictness used for
    config/preflight validation. Callers typically derive it from
    `strict` after applying TOML resolution and any CLI/API
    override precedence.

    A similar helper exists on [`MutableConfig`][topmark.config.model.MutableConfig].

    Args:
        strict: Effective strictness for config/preflight validation.

    Returns:
        `True` if the config is valid, else `False`.
    """
    return _is_validation_logs_valid(
        self.validation_logs,
        strict=bool(strict),
    )

ensure_valid

ensure_valid(*, strict=None)

Raise ConfigValidationError if this config is not valid.

Validity follows the staged config-loading validation semantics described in this module.

Here, strict is the effective resolved strictness used for config/preflight validation. Callers typically derive it from strict after applying TOML resolution and any CLI/API override precedence.

A similar helper exists on MutableConfig.

Parameters:

Name Type Description Default
strict bool | None

Effective strictness for config/preflight validation.

None

Raises:

Type Description
ConfigValidationError

If the config is invalid.

Source code in src/topmark/config/model.py
def ensure_valid(
    self,
    *,
    strict: bool | None = None,
) -> None:
    """Raise `ConfigValidationError` if this config is not valid.

    Validity follows the staged config-loading validation semantics
    described in this module.

    Here, `strict` is the effective resolved strictness used for
    config/preflight validation. Callers typically derive it from
    `strict` after applying TOML resolution and any CLI/API
    override precedence.

    A similar helper exists on [`MutableConfig`][topmark.config.model.MutableConfig].

    Args:
        strict: Effective strictness for config/preflight validation.

    Raises:
        ConfigValidationError: If the config is invalid.
    """
    if not self.is_valid(strict=strict):
        raise ConfigValidationError(
            validation_logs=self.validation_logs,
            strict=bool(strict),
        )

thaw

thaw()

Return a mutable copy of this immutable config.

Symmetry

Mirrors MutableConfig.freeze(). Prefer thaw→edit→freeze rather than mutating a runtime FrozenConfig.

Returns:

Type Description
MutableConfig

A mutable builder initialized from this snapshot.

Source code in src/topmark/config/model.py
def thaw(self) -> MutableConfig:
    """Return a mutable copy of this immutable config.

    Symmetry:
        Mirrors [`MutableConfig.freeze()`][topmark.config.model.MutableConfig.freeze].
        Prefer thaw→edit→freeze rather than mutating a runtime
        [`FrozenConfig`][topmark.config.model.FrozenConfig].

    Returns:
        A mutable builder initialized from this snapshot.
    """
    validation_logs: MutableValidationLogs = self.validation_logs.thaw()
    return MutableConfig(
        policy=self.policy.thaw(),
        policy_by_type={k: v.thaw() for k, v in self.policy_by_type.items()},
        config_files=list(self.config_files),
        header_fields=list(self.header_fields),
        field_values=dict(self.field_values),
        align_fields=self.align_fields,
        relative_to_raw=self.relative_to_raw,
        relative_to=self.relative_to,
        files=list(self.files),
        include_from=list(self.include_from),
        exclude_from=list(self.exclude_from),
        include_pattern_groups=list(self.include_pattern_groups),
        exclude_pattern_groups=list(self.exclude_pattern_groups),
        files_from=list(self.files_from),
        include_file_types=set(self.include_file_types),
        exclude_file_types=set(self.exclude_file_types),
        validation_logs=validation_logs,
    )
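The thaw→edit→freeze round trip can be sketched with two small stand-in dataclasses. `DraftSketch` and `SnapshotSketch` are hypothetical and carry only two of the real fields; they illustrate the container conversions (tuple/frozenset versus list/set), not the actual classes.

```python
from dataclasses import dataclass, field


@dataclass
class DraftSketch:
    """Hypothetical mutable builder with list/set containers."""

    header_fields: list[str] = field(default_factory=list)
    include_file_types: set[str] = field(default_factory=set)

    def freeze(self) -> "SnapshotSketch":
        # Convert mutable containers into immutable ones.
        return SnapshotSketch(
            header_fields=tuple(self.header_fields),
            include_file_types=frozenset(self.include_file_types),
        )


@dataclass(frozen=True)
class SnapshotSketch:
    """Hypothetical immutable snapshot with tuple/frozenset containers."""

    header_fields: tuple[str, ...]
    include_file_types: frozenset[str]

    def thaw(self) -> DraftSketch:
        # Copy into fresh mutable containers so edits never touch the snapshot.
        return DraftSketch(
            header_fields=list(self.header_fields),
            include_file_types=set(self.include_file_types),
        )


snapshot = SnapshotSketch(header_fields=("file",), include_file_types=frozenset({"python"}))
draft = snapshot.thaw()
draft.header_fields.append("license")
updated = draft.freeze()

print(snapshot.header_fields)  # ('file',)
print(updated.header_fields)   # ('file', 'license')
```

The original snapshot is unchanged after the edit, which is the property that makes the round trip safer than mutating a runtime config in place.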

MutableConfig dataclass

MutableConfig(
    *,
    policy=MutablePolicy(),
    policy_by_type=(lambda: {})(),
    config_files=(lambda: [])(),
    header_fields=(lambda: [])(),
    field_values=(lambda: {})(),
    align_fields=None,
    relative_to_raw=None,
    relative_to=None,
    files=(lambda: [])(),
    include_from=(lambda: [])(),
    exclude_from=(lambda: [])(),
    files_from=(lambda: [])(),
    include_pattern_groups=(lambda: [])(),
    exclude_pattern_groups=(lambda: [])(),
    include_file_types=(lambda: set[str]())(),
    exclude_file_types=(lambda: set[str]())(),
    validation_logs=MutableValidationLogs(),
)

Mutable configuration used during discovery and merging.

This builder collects layered config from defaults, project files, extra files, and config-like API overrides. It remains convenient to mutate (list/set), then produces an immutable FrozenConfig via freeze. TOML I/O is delegated to topmark.config.io to keep this class focused on merge policy.

Attributes:

Name Type Description
policy MutablePolicy

Optional global policy overrides (public shape).

policy_by_type dict[str, MutablePolicy]

Optional per-type policy.

config_files list[Path | SyntheticConfigSource]

List of paths or identifiers for config sources used.

header_fields list[str]

List of header fields from the [header] section.

field_values dict[str, str]

Mapping of field names to their string values from [fields].

align_fields bool | None

Whether to align fields, from [formatting].

relative_to_raw str | None

Original string from config or CLI.

relative_to Path | None

Base path used only for resolving header metadata (e.g., file_relpath).

files list[str]

List of files to process.

include_from list[PatternSource]

Files containing include patterns.

exclude_from list[PatternSource]

Files containing exclude patterns.

files_from list[PatternSource]

Paths to files that list newline-delimited candidate file paths to add before filtering.

include_pattern_groups list[PatternGroup]

Glob patterns to include.

exclude_pattern_groups list[PatternGroup]

Glob patterns to exclude.

include_file_types set[str]

File type identifiers to process.

exclude_file_types set[str]

File type identifiers to exclude.

validation_logs MutableValidationLogs

Stage-aware diagnostics collected during config loading and preflight validation.

is_valid

is_valid(*, strict=None)

Return whether this mutable config is valid.

Validity follows the staged config-loading validation semantics described in this module.

Here, strict is the effective resolved strictness used for config/preflight validation. Callers typically derive it from strict after applying TOML resolution and any CLI/API override precedence.

A similar helper exists on FrozenConfig.

Parameters:

Name Type Description Default
strict bool | None

Effective strictness for config/preflight validation.

None

Returns:

Type Description
bool

True if the mutable config is valid, else False.

Source code in src/topmark/config/model.py
def is_valid(
    self,
    *,
    strict: bool | None = None,
) -> bool:
    """Return whether this mutable config is valid.

    Validity follows the staged config-loading validation semantics
    described in this module.

    Here, `strict` is the effective resolved strictness used for
    config/preflight validation. Callers typically derive it from
    `strict` after applying TOML resolution and any CLI/API
    override precedence.

    A similar helper exists on [`FrozenConfig`][topmark.config.model.FrozenConfig].

    Args:
        strict: Effective strictness for config/preflight validation.

    Returns:
        `True` if the mutable config is valid, else `False`.
    """
    return _is_validation_logs_valid(
        self.validation_logs,
        strict=bool(strict),
    )

ensure_valid

ensure_valid(*, strict=None)

Raise ConfigValidationError if this mutable config is not valid.

Validity follows the staged config-loading validation semantics described in this module.

Here, strict is the effective resolved strictness used for config/preflight validation. Callers typically derive it from strict after applying TOML resolution and any CLI/API override precedence.

A similar helper exists on FrozenConfig.

Parameters:

Name Type Description Default
strict bool | None

Effective strictness for config/preflight validation.

None

Raises:

Type Description
ConfigValidationError

If the mutable config is invalid.

Source code in src/topmark/config/model.py
def ensure_valid(
    self,
    *,
    strict: bool | None = None,
) -> None:
    """Raise `ConfigValidationError` if this mutable config is not valid.

    Validity follows the staged config-loading validation semantics
    described in this module.

    Here, `strict` is the effective resolved strictness used for
    config/preflight validation. Callers typically derive it from
    `strict` after applying TOML resolution and any CLI/API
    override precedence.

    A similar helper exists on [`FrozenConfig`][topmark.config.model.FrozenConfig].

    Args:
        strict: Effective strictness for config/preflight validation.

    Raises:
        ConfigValidationError: If the mutable config is invalid.
    """
    if not self.is_valid(strict=strict):
        raise ConfigValidationError(
            validation_logs=self.validation_logs,
            strict=bool(strict),
        )

freeze

freeze()

Freeze this mutable builder into an immutable FrozenConfig.

This method applies final sanitation and normalizes internal container types before constructing the immutable FrozenConfig snapshot.

Source code in src/topmark/config/model.py
def freeze(self) -> FrozenConfig:
    """Freeze this mutable builder into an immutable `FrozenConfig`.

    This method applies final sanitation and normalizes internal container
    types before constructing the immutable [`FrozenConfig`][topmark.config.model.FrozenConfig]
    snapshot.
    """
    self.sanitize()

    # Resolve global policy against an all-false base
    global_policy_frozen: FrozenPolicy = self.policy.resolve(FrozenPolicy())

    # Resolve per-type policies against the resolved global policy
    frozen_by_type: dict[str, FrozenPolicy] = {}
    for ft, mp in self.policy_by_type.items():
        resolved: FrozenPolicy = mp.resolve(global_policy_frozen)
        frozen_by_type[ft] = resolved

    return FrozenConfig(
        policy=global_policy_frozen,
        policy_by_type=frozen_by_type,
        config_files=tuple(self.config_files),
        header_fields=tuple(self.header_fields),
        field_values=dict(self.field_values),
        align_fields=self.align_fields,
        relative_to_raw=self.relative_to_raw,
        relative_to=self.relative_to,
        files=tuple(self.files),
        include_from=tuple(self.include_from),
        exclude_from=tuple(self.exclude_from),
        files_from=tuple(self.files_from),
        include_pattern_groups=tuple(self.include_pattern_groups),
        exclude_pattern_groups=tuple(self.exclude_pattern_groups),
        include_file_types=frozenset(self.include_file_types),
        exclude_file_types=frozenset(self.exclude_file_types),
        validation_logs=self.validation_logs.freeze(),
    )
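The two-step policy resolution in `freeze()` (global policy against an all-false base, then each per-type policy against the resolved global policy) might look like the sketch below. The classes and the `flag_a` / `flag_b` fields are hypothetical, and the sketch assumes that an unset (`None`) field inherits from the base, which this page does not show explicitly.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ResolvedPolicySketch:
    """Hypothetical resolved policy: plain booleans, all False by default."""

    flag_a: bool = False
    flag_b: bool = False


@dataclass
class DraftPolicySketch:
    """Hypothetical tri-state policy: None means 'not set at this level'."""

    flag_a: Optional[bool] = None
    flag_b: Optional[bool] = None

    def resolve(self, base: ResolvedPolicySketch) -> ResolvedPolicySketch:
        # Assumption: unset fields inherit from the base; explicit values win.
        return ResolvedPolicySketch(
            flag_a=base.flag_a if self.flag_a is None else self.flag_a,
            flag_b=base.flag_b if self.flag_b is None else self.flag_b,
        )


# Step 1: the global policy resolves against an all-false base.
global_policy = DraftPolicySketch(flag_a=True).resolve(ResolvedPolicySketch())

# Step 2: per-type policies resolve against the resolved global policy.
per_type = DraftPolicySketch(flag_a=False, flag_b=True).resolve(global_policy)
inherits = DraftPolicySketch().resolve(global_policy)

print(global_policy)  # ResolvedPolicySketch(flag_a=True, flag_b=False)
print(per_type)       # ResolvedPolicySketch(flag_a=False, flag_b=True)
print(inherits)       # ResolvedPolicySketch(flag_a=True, flag_b=False)
```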

merge_with

merge_with(other)

Return a new draft that merges self with a higher-precedence other draft.

Merge behavior is field-specific rather than uniformly "last wins". The current policy follows TopMark's layered-config mental model:

  • provenance and diagnostics accumulate
  • behavioral/configuration fields usually use nearest-wins semantics
  • mapping fields usually overlay keys
  • discovery pattern groups accumulate across layers
  • runtime/execution intent is out of scope for layered config merging
Current merge groups

Provenance and diagnostics:
  • config_files: append
  • validation_logs: append within each validation stage

Behavioral config:
  • header_fields: replace when other provides a non-empty list
  • align_fields: replace only when explicitly set in other
  • relative_to_raw, relative_to: replace only when explicitly set in other

Policy:
  • policy: tri-state field merge via MutablePolicy.merge_with()
  • policy_by_type: key-wise merge, then tri-state merge per key

Field values:
  • field_values: key-wise overlay; other wins on overlapping keys

Discovery inputs:
  • include_pattern_groups, exclude_pattern_groups: append
  • include_from, exclude_from, files_from: append
  • files: replace when other provides a non-empty list

Discovery filters:
  • include_file_types, exclude_file_types: replace when other provides a non-empty set

Parameters:

Name Type Description Default
other MutableConfig

Higher-precedence config whose values should be merged on top of this draft.

required

Returns:

Type Description
MutableConfig

A new mutable configuration representing the merged result.
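To make the field-specific rules concrete, here is a minimal sketch that applies a subset of them to plain dictionaries. `merge_layers`, the file names, and the string pattern groups are hypothetical stand-ins; the real method operates on MutableConfig instances and PatternGroup objects.

```python
from typing import Any


def merge_layers(base: dict[str, Any], other: dict[str, Any]) -> dict[str, Any]:
    """Merge a higher-precedence `other` layer on top of `base` (illustrative subset)."""
    return {
        # Provenance accumulates.
        "config_files": [*base["config_files"], *other["config_files"]],
        # Replace only when `other` provides a non-empty list.
        "header_fields": other["header_fields"] or base["header_fields"],
        # Replace only when explicitly set; False counts as an explicit value.
        "align_fields": (
            other["align_fields"] if other["align_fields"] is not None else base["align_fields"]
        ),
        # Key-wise overlay; `other` wins on shared keys.
        "field_values": {**base["field_values"], **other["field_values"]},
        # Pattern groups accumulate across layers.
        "include_pattern_groups": [
            *base["include_pattern_groups"],
            *other["include_pattern_groups"],
        ],
    }


root = {
    "config_files": ["root.toml"],
    "header_fields": ["file", "license"],
    "align_fields": True,
    "field_values": {"license": "MIT", "project": "demo"},
    "include_pattern_groups": ["src/**/*.py"],
}
child = {
    "config_files": ["pkg/child.toml"],
    "header_fields": [],  # empty: the parent's list is inherited
    "align_fields": False,  # explicit False overrides the parent's True
    "field_values": {"license": "Apache-2.0"},
    "include_pattern_groups": ["pkg/**/*.py"],
}

merged = merge_layers(root, child)
print(merged["header_fields"])  # ['file', 'license']
print(merged["align_fields"])   # False
print(merged["field_values"])   # {'license': 'Apache-2.0', 'project': 'demo'}
```

One consequence worth noting: because list and set fields are replaced only by non-empty values, a higher-precedence layer cannot clear an inherited `header_fields` list by supplying an empty one.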

Source code in src/topmark/config/model.py
def merge_with(self, other: MutableConfig) -> MutableConfig:
    """Return a new draft that merges ``self`` with a higher-precedence ``other`` draft.

    Merge behavior is field-specific rather than uniformly "last wins". The
    current policy follows TopMark's layered-config mental model:

    - provenance and diagnostics **accumulate**
    - behavioral/configuration fields usually use **nearest-wins** semantics
    - mapping fields usually **overlay keys**
    - discovery pattern groups **accumulate** across layers
    - runtime/execution intent is out of scope for layered config merging

    Current merge groups:
        Provenance and diagnostics:
            - `config_files`: append
            - `validation_logs`: append within each validation stage

        Behavioral config:
            - `header_fields`: replace when `other` provides a non-empty list
            - `align_fields`: replace only when explicitly set in `other`
            - `relative_to_raw`, `relative_to`: replace only when explicitly set in `other`

        Policy:
            - `policy`: tri-state field merge via `MutablePolicy.merge_with()`
            - `policy_by_type`: key-wise merge, then tri-state merge per key

        Field values:
            - `field_values`: key-wise overlay; `other` wins on overlapping keys

        Discovery inputs:
            - `include_pattern_groups`, `exclude_pattern_groups`: append
            - `include_from`, `exclude_from`, `files_from`: append
            - `files`: replace when `other` provides a non-empty list

        Discovery filters:
            - `include_file_types`, `exclude_file_types`: replace when `other`
              provides a non-empty set

    Args:
        other: Higher-precedence config whose values should be merged on top
            of this draft.

    Returns:
        A new mutable configuration representing the merged result.
    """
    # --------------------------- Provenance and policies ---------------------------
    # Merge global policy using tri-state semantics so explicit child values override
    # matching parent values without collapsing `None` too early.
    merged_global: MutablePolicy = self.policy.merge_with(other.policy)

    # Merge per-type policies key-wise, then tri-state merge per shared key.
    merged_by_type: dict[str, MutablePolicy] = {}
    all_policy_keys: set[str] = set(self.policy_by_type.keys()) | set(
        other.policy_by_type.keys()
    )
    for key in all_policy_keys:
        base: MutablePolicy | None = self.policy_by_type.get(key)
        override: MutablePolicy | None = other.policy_by_type.get(key)
        if base is None:
            if override is not None:
                merged_by_type[key] = override
        elif override is None:
            merged_by_type[key] = base
        else:
            merged_by_type[key] = base.merge_with(override)

    # Provenance accumulates across layers. Validation diagnostics also
    # accumulate, but remain separated by stage so flattened diagnostics can
    # be derived later at reporting or output boundaries.
    merged_config_files: list[Path | SyntheticConfigSource] = [
        *self.config_files,
        *other.config_files,
    ]

    merged_validation_logs: MutableValidationLogs = self.validation_logs.merge_with(
        other.validation_logs
    )

    # ------------------------ Behavioral config ------------------------
    merged_header_fields: list[str] = other.header_fields or self.header_fields
    merged_align_fields: bool | None = (
        other.align_fields if other.align_fields is not None else self.align_fields
    )
    merged_relative_to_raw: str | None = (
        other.relative_to_raw if other.relative_to_raw is not None else self.relative_to_raw
    )
    merged_relative_to: Path | None = (
        other.relative_to if other.relative_to is not None else self.relative_to
    )

    # ----------------------------- Mapping-style overlays ----------------------------
    # Field values use key-wise overlay semantics: unrelated parent keys remain
    # inherited while matching child keys override.
    merged_field_values: dict[str, str] = {
        **self.field_values,
        **other.field_values,
    }

    # -------------------------------- Discovery inputs -------------------------------
    # Discovery pattern groups always accumulate across applicable layers.
    merged_include_pattern_groups: list[PatternGroup] = [
        *self.include_pattern_groups,
        *other.include_pattern_groups,
    ]
    merged_exclude_pattern_groups: list[PatternGroup] = [
        *self.exclude_pattern_groups,
        *other.exclude_pattern_groups,
    ]

    # Path-to-file discovery sources now accumulate across layers as well.
    merged_include_from: list[PatternSource] = [*self.include_from, *other.include_from]
    merged_exclude_from: list[PatternSource] = [*self.exclude_from, *other.exclude_from]
    merged_files_from: list[PatternSource] = [*self.files_from, *other.files_from]

    # Explicit file lists remain authoritative: nearest applicable non-empty list wins.
    merged_files: list[str] = other.files or self.files

    # File-type filters express a nearest-scope decision rather than a union.
    merged_include_file_types: set[str] = other.include_file_types or self.include_file_types
    merged_exclude_file_types: set[str] = other.exclude_file_types or self.exclude_file_types

    logger.info(
        "Merging config layers: adding %r to existing config_files %r",
        other.config_files,
        self.config_files,
    )

    merged = MutableConfig(
        config_files=merged_config_files,
        validation_logs=merged_validation_logs,
        header_fields=merged_header_fields,
        field_values=merged_field_values,
        align_fields=merged_align_fields,
        relative_to_raw=merged_relative_to_raw,
        relative_to=merged_relative_to,
        files=merged_files,
        include_from=merged_include_from,
        exclude_from=merged_exclude_from,
        files_from=merged_files_from,
        include_pattern_groups=merged_include_pattern_groups,
        exclude_pattern_groups=merged_exclude_pattern_groups,
        include_file_types=merged_include_file_types,
        exclude_file_types=merged_exclude_file_types,
    )

    merged.policy = merged_global
    merged.policy_by_type = merged_by_type
    return merged
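The key-wise, tri-state merge of `policy_by_type` can be sketched as follows. `DraftPolicySketch`, its `flag_a` / `flag_b` fields, and `merge_by_type` are hypothetical; the sketch assumes that `MutablePolicy.merge_with()` lets explicit values in the override win while `None` keeps the base value, as the comments in the source above suggest.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DraftPolicySketch:
    """Hypothetical tri-state policy: None means 'not set in this layer'."""

    flag_a: Optional[bool] = None
    flag_b: Optional[bool] = None

    def merge_with(self, other: "DraftPolicySketch") -> "DraftPolicySketch":
        # Explicit values in `other` win; None keeps the value from `self`.
        return DraftPolicySketch(
            flag_a=self.flag_a if other.flag_a is None else other.flag_a,
            flag_b=self.flag_b if other.flag_b is None else other.flag_b,
        )


def merge_by_type(
    base: dict[str, DraftPolicySketch],
    other: dict[str, DraftPolicySketch],
) -> dict[str, DraftPolicySketch]:
    """Keep keys present in only one layer; merge keys present in both."""
    merged = dict(base)
    for key, override in other.items():
        existing = merged.get(key)
        merged[key] = override if existing is None else existing.merge_with(override)
    return merged


base = {"python": DraftPolicySketch(flag_a=True), "toml": DraftPolicySketch(flag_b=True)}
other = {"python": DraftPolicySketch(flag_b=False), "markdown": DraftPolicySketch(flag_a=False)}

merged = merge_by_type(base, other)
print(sorted(merged))    # ['markdown', 'python', 'toml']
print(merged["python"])  # DraftPolicySketch(flag_a=True, flag_b=False)
```

Unset fields stay `None` after merging (for example `merged["toml"].flag_a`), so a later layer or the final resolution step can still supply a value.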

sanitize

sanitize()

Normalize and sanitize draft config in-place.

This step enforces downstream invariants expected by config resolution, runtime processing, and related components such as the file resolver, pipeline, and CLI. It is intended to be called just before freezing into an immutable FrozenConfig.

Sanitization may drop or rewrite invalid entries and record diagnostics describing those recoveries. These diagnostics are part of the runtime-applicability validation stage and continue to participate in config/preflight validity checks, including strict config checking.

Current rules
  • include_from / exclude_from / files_from entries must refer to concrete files, not glob-style paths. Any PatternSource.path containing glob metacharacters (*, ?, [, ]) is ignored with a warning.
  • include_file_types / exclude_file_types entries are resolved from public local-or-qualified identifiers to canonical qualified keys.
  • policy_by_type entries are resolved from public local-or-qualified identifiers to canonical qualified keys.
Future extensions may
  • validate relative_to vs. config_files,
  • check existence of pattern files,
  • normalize duplicate patterns or sources.
Source code in src/topmark/config/model.py
def sanitize(self) -> None:
    """Normalize and sanitize draft config in-place.

    This step enforces downstream invariants expected by config resolution,
    runtime processing, and related components such as the file resolver,
    pipeline, and CLI. It is intended to be called just before freezing into
    an immutable [`FrozenConfig`][topmark.config.model.FrozenConfig].

    Sanitization may drop or rewrite invalid entries and record diagnostics
    describing those recoveries. These diagnostics are part of the
    runtime-applicability validation stage and continue to participate in
    config/preflight validity checks, including strict config checking.

    Current rules:
        - include_from / exclude_from / files_from entries must refer to
          concrete files, not glob-style paths. Any `PatternSource.path`
          containing glob metacharacters (*, ?, [, ]) is ignored with a warning.
        - include_file_types / exclude_file_types entries are resolved from
          public local-or-qualified identifiers to canonical qualified keys.
        - policy_by_type entries are resolved from public local-or-qualified
          identifiers to canonical qualified keys.

    Future extensions may:
        - validate relative_to vs. config_files,
        - check existence of pattern files,
        - normalize duplicate patterns or sources.
    """

    def _has_glob_chars(p: Path) -> bool:
        s: str = str(p)
        return any(ch in s for ch in "*?[]")

    def _sanitize_sources(name: str, sources: list[PatternSource]) -> None:
        if not sources:
            return
        kept: list[PatternSource] = []
        for ps in sources:
            if _has_glob_chars(ps.path):
                msg: str = (
                    f"Ignoring {name} entry with glob characters in path: {ps.path} "
                    "(these options expect concrete files; use "
                    "include_patterns / exclude_patterns for globs)."
                )
                self.validation_logs.runtime_applicability.add_warning(msg)
                continue
            kept.append(ps)

        if len(kept) != len(sources):
            msg = (
                f"Sanitized {name}: kept {len(kept)} source(s), "
                f"dropped {len(sources) - len(kept)} invalid source(s)"
            )
            self.validation_logs.runtime_applicability.add_warning(msg)

        sources[:] = kept

    _sanitize_sources(Toml.KEY_INCLUDE_FROM, self.include_from)
    _sanitize_sources(Toml.KEY_EXCLUDE_FROM, self.exclude_from)
    _sanitize_sources(Toml.KEY_FILES_FROM, self.files_from)

    def _resolve_file_type_id(file_type_id: str) -> str | None:
        """Return the canonical qualified key for a public file type identifier.

        Public configuration accepts either a qualified identifier such as
        `topmark:python` or an unqualified local identifier such as `python`.
        Local identifiers are accepted only when they are unambiguous in the
        effective file type registry.

        Args:
            file_type_id: Public file type identifier from config, CLI, or API
                overrides.

        Returns:
            The canonical qualified file type key, or `None` when the
            identifier is empty or unknown.

        Raises:
            AmbiguousFileTypeIdentifierError: If an unqualified local
                identifier matches more than one registered file type.
            InvalidRegistryIdentityError: If a qualified identifier is
                malformed.
        """  # noqa: DOC502 - documents propagated exceptions from delegated registry helpers
        if not file_type_id:
            return None

        # Local import to keep config import-safe and avoid incidental cycles.
        from topmark.registry.filetypes import FileTypeRegistry

        file_type: FileType | None = FileTypeRegistry.resolve_filetype_id(file_type_id)
        if file_type is None:
            return None
        return file_type.qualified_key

    def _sanitize_file_type_ids(
        name: str,
        ids: set[str],
        *,
        is_exclusion: bool,
    ) -> None:
        """Resolve file type filters to canonical qualified keys.

        Unknown, malformed, or ambiguous identifiers are ignored and recorded
        as runtime-applicability validation diagnostics. Valid identifiers
        are rewritten in-place to their canonical qualified keys.

        Args:
            name: Human-readable name for diagnostics, e.g.
                `include_file_types`.
            ids: Mutable set of public identifiers to normalize in-place.
            is_exclusion: Whether this selector is an exclusion filter.
        """
        if not ids:
            return

        normalized: set[str] = set()
        ignored: list[str] = []

        for file_type_id in sorted(ids):
            try:
                qualified_key: str | None = _resolve_file_type_id(file_type_id)
            except AmbiguousFileTypeIdentifierError as exc:
                candidates: str = ", ".join(exc.candidates)
                self.validation_logs.runtime_applicability.add_warning(
                    f"Ambiguous {name} file type identifier ignored: "
                    f"{file_type_id} (candidates: {candidates})"
                )
                ignored.append(file_type_id)
                continue
            except InvalidRegistryIdentityError:
                self.validation_logs.runtime_applicability.add_warning(
                    f"Malformed {name} file type identifier ignored: {file_type_id}"
                )
                ignored.append(file_type_id)
                continue

            if qualified_key is None:
                ignored.append(file_type_id)
                continue

            normalized.add(qualified_key)

        if ignored:
            unknown_str: str = ", ".join(ignored)
            if is_exclusion:
                msg: str = f"Unknown excluded file types specified (ignored): {unknown_str}"
            else:
                msg = f"Unknown included file types specified (ignored): {unknown_str}"
            self.validation_logs.runtime_applicability.add_warning(msg)

        if ids != normalized:
            ids.clear()
            ids.update(normalized)

    _sanitize_file_type_ids(
        Toml.KEY_INCLUDE_FILE_TYPES,
        self.include_file_types,
        is_exclusion=False,
    )
    _sanitize_file_type_ids(
        Toml.KEY_EXCLUDE_FILE_TYPES,
        self.exclude_file_types,
        is_exclusion=True,
    )

    # If a type appears in both include and exclude, prefer exclusion.
    overlap: set[str] = self.include_file_types & self.exclude_file_types
    if overlap:
        overlap_str: str = ", ".join(sorted(overlap))
        msg: str = (
            "File types specified in both include and exclude filters; "
            f"exclusion wins (removed from include): {overlap_str}"
        )
        self.validation_logs.runtime_applicability.add_warning(msg)
        # Remove overlaps from the include set (exclusion wins over inclusion):
        self.include_file_types.difference_update(overlap)

    def _sanitize_policy_by_type() -> None:
        """Resolve per-type policy keys to canonical qualified file type keys.

        Public `policy_by_type` keys accept the same identifier forms as file
        type filters: qualified keys such as `topmark:python`, or local keys
        such as `python` when unambiguous. The immutable
        [`FrozenConfig`][topmark.config.model.FrozenConfig] stores only
        canonical qualified keys.
        """
        if not self.policy_by_type:
            return

        normalized: dict[str, MutablePolicy] = {}

        for file_type_id, policy in self.policy_by_type.items():
            try:
                qualified_key: str | None = _resolve_file_type_id(file_type_id)
            except AmbiguousFileTypeIdentifierError as exc:
                candidates: str = ", ".join(exc.candidates)
                self.validation_logs.runtime_applicability.add_warning(
                    "Ambiguous policy_by_type file type identifier ignored: "
                    f"{file_type_id} (candidates: {candidates})"
                )
                continue
            except InvalidRegistryIdentityError:
                self.validation_logs.runtime_applicability.add_warning(
                    f"Malformed policy_by_type file type identifier ignored: {file_type_id}"
                )
                continue

            if qualified_key is None:
                self.validation_logs.runtime_applicability.add_warning(
                    f"Unknown policy_by_type file type identifier ignored: {file_type_id}"
                )
                continue

            existing: MutablePolicy | None = normalized.get(qualified_key)
            if existing is None:
                normalized[qualified_key] = policy
            else:
                normalized[qualified_key] = existing.merge_with(policy)

        self.policy_by_type = normalized

    _sanitize_policy_by_type()
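Two of the sanitization rules, dropping glob-style source paths and letting exclusion win on overlapping file types, are simple enough to sketch in isolation. The functions below are hypothetical stand-ins that collect warnings in a plain list instead of staged validation logs, and the `topmark:toml` key is an assumed example identifier.

```python
from pathlib import PurePosixPath


def has_glob_chars(path: PurePosixPath) -> bool:
    """Return True when the path contains any glob metacharacter."""
    return any(ch in str(path) for ch in "*?[]")


def drop_glob_sources(paths: list[PurePosixPath], warnings: list[str]) -> list[PurePosixPath]:
    """Keep only concrete file paths; record a warning for each dropped entry."""
    kept: list[PurePosixPath] = []
    for path in paths:
        if has_glob_chars(path):
            warnings.append(f"Ignoring entry with glob characters in path: {path}")
            continue
        kept.append(path)
    return kept


def prefer_exclusion(include: set[str], exclude: set[str], warnings: list[str]) -> None:
    """Remove from `include` (in place) every type that is also excluded."""
    overlap = include & exclude
    if overlap:
        warnings.append(f"exclusion wins (removed from include): {', '.join(sorted(overlap))}")
        include.difference_update(overlap)


warnings: list[str] = []
kept = drop_glob_sources(
    [PurePosixPath("patterns.txt"), PurePosixPath("lists/*.txt")], warnings
)

include = {"topmark:python", "topmark:toml"}
exclude = {"topmark:toml"}
prefer_exclusion(include, exclude, warnings)

print(kept)             # [PurePosixPath('patterns.txt')]
print(sorted(include))  # ['topmark:python']
print(len(warnings))    # 2
```

Because both recoveries are recorded as warnings, a config that is repaired this way still passes non-strict validation but fails under strict checking.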

sanitized_config

sanitized_config(config)

Sanitize a FrozenConfig object.

Thaws the FrozenConfig into a MutableConfig, sanitizes and freezes again.

Sanitization may add staged validation diagnostics, and those diagnostics participate in later config/preflight validity checks.

Parameters:

Name Type Description Default
config FrozenConfig

The FrozenConfig to sanitize.

required

Returns:

Type Description
FrozenConfig

The sanitized FrozenConfig instance.

Source code in src/topmark/config/model.py
def sanitized_config(config: FrozenConfig) -> FrozenConfig:
    """Sanitize a `FrozenConfig` object.

    Thaws the [`FrozenConfig`][topmark.config.model.FrozenConfig] into a
    [`MutableConfig`][topmark.config.model.MutableConfig], sanitizes and freezes again.

    Sanitization may add staged validation diagnostics, and those diagnostics
    participate in later config/preflight validity checks.

    Args:
        config: The [`FrozenConfig`][topmark.config.model.FrozenConfig] to sanitize.

    Returns:
        The sanitized [`FrozenConfig`][topmark.config.model.FrozenConfig] instance.
    """
    m: MutableConfig = config.thaw()
    m.sanitize()
    return m.freeze()