Registry Model¶
TopMark uses a layered registry architecture to manage:
- file type identities
- header processor identities
- bindings between file types and processors
- runtime overlays and extensions
- resolver and probe integration
The registry model is explicit, deterministic, overlay-based, and composition-oriented. Identity registration and processor binding are separate operations.
This page owns the detailed registry model. The broader system architecture is documented in Architecture overview.
Note
The canonical vocabulary used throughout the documentation is defined in Terminology and Canonical Vocabulary.
Runtime model overview¶
The runtime registry model is primarily composed of:
- FileType
- HeaderProcessor
- binding relationships managed through BindingRegistry
These runtime registry objects participate in stable runtime behavior such as:
- file type resolution
- processor dispatch
- policy lookup
- pipeline execution
- CLI introspection
- machine-readable output rendering
Note
User-facing documentation intentionally focuses on stable runtime behavior and public CLI
contracts rather than internal implementation objects.
Advanced registry behavior and overlay mutation semantics are documented here for maintainers, plugin authors, and advanced integrators.
Design goals¶
The registry model exists to make TopMark extensible without making runtime behavior implicit or order-dependent.
Earlier process-global mutable registries made tests order-dependent and blurred the distinction between introspection and mutation.
The current model keeps base registry data immutable and confines mutation to explicit overlay state.
The main goals are:
- deterministic behavior across CLI, API, tests, and documentation generation;
- safe extensibility for plugins and tests;
- clear separation between introspection and mutation;
- efficient composition of effective runtime registry views;
- test isolation for registry overlays;
- a single effective registry view for resolver, pipeline, API, and CLI behavior.
Base registries and overlays¶
TopMark composes effective runtime registries from immutable base registry data plus mutable overlay state.
Base registries contain:
- built-in file types;
- discovered file type plugins;
- built-in processor definitions;
- built-in file-type-to-processor bindings.
Overlay state contains process-local additions and removals requested by tests, plugins, runtime extensions, or advanced integrations.
The effective composed runtime registry view is the base registry data with overlay additions and removals applied.
Base registries are not mutated by overlay operations.
This allows TopMark to keep built-in registry state immutable while still supporting runtime extension and isolated tests.
flowchart TB
subgraph BASE[Base registries]
BFT["Base FileTypes<br/>(built-ins + plugins)"]
BPR["Base Processors<br/>(built-ins)"]
BBD["Base Bindings<br/>(built-ins)"]
end
subgraph OVER[Overlay state]
OFT["FileType overlays"]
OPR["Processor overlays"]
OBD["Binding overlays"]
end
BFT --> EFT["Effective FileType view"]
OFT --> EFT
BPR --> EPR["Effective Processor view"]
OPR --> EPR
BBD --> EBD["Effective Binding view"]
OBD --> EBD
EFT --> RES["File type resolution"]
EBD --> BIND["Processor binding lookup"]
EPR --> BIND
RES --> BIND
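The composition described above can be sketched in a few lines. This is a minimal illustration, not TopMark's implementation: the function name `compose_effective`, the plain string values, and the rule that removals are applied after additions are all assumptions made for the example.

```python
from __future__ import annotations

from types import MappingProxyType
from typing import Mapping


def compose_effective(
    base: Mapping[str, str],
    additions: Mapping[str, str],
    removals: frozenset[str],
) -> Mapping[str, str]:
    """Return a read-only effective view without mutating ``base``."""
    merged = dict(base)        # copy, so the base mapping stays untouched
    merged.update(additions)   # assumption: overlay additions win over base entries
    for key in removals:
        merged.pop(key, None)  # assumption: removals are applied last
    return MappingProxyType(merged)


# Hypothetical registry contents, keyed by qualified identifier.
base = MappingProxyType({"topmark:python": "Python", "topmark:toml": "TOML"})
effective = compose_effective(
    base,
    additions={"acme:django_html": "Django HTML"},
    removals=frozenset({"topmark:toml"}),
)

print(sorted(effective))  # ['acme:django_html', 'topmark:python']
print(sorted(base))       # ['topmark:python', 'topmark:toml']
```

The base mapping still contains `topmark:toml` after composition, which is the property that lets overlays be discarded without any repair step.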
Registry layers¶
TopMark separates identity registries from relationship registries.
This separation is part of the stable 1.x registry architecture contract.
FileTypeRegistry¶
FileTypeRegistry manages file type identities.
Each file type has:
- namespace
- local key
- qualified key
- extensions
- resolver and matching metadata
Example of a local identifier: python
Example of a canonical qualified identifier: topmark:python
TopMark normalizes file type identifiers to canonical qualified keys.
Local identifiers are accepted only when unambiguous.
HeaderProcessorRegistry¶
HeaderProcessorRegistry manages header
processor identities.
Processors remain independent from file types. This allows:
- multiple file types to share a processor;
- processor bindings to change without redefining file types;
- runtime overlays and plugin integration.
BindingRegistry¶
BindingRegistry manages relationships between file
types and processors.
Bindings define:
- which processor is selected for a file type;
- whether a recognized file type is supported;
- which processor participates in header operations.
This separation prevents implicit side effects between identity registration and processor binding.
Registry facade¶
Registry provides the stable read-only facade over the effective composed runtime registries. The views it exposes are immutable.
The stable public-facing runtime facade is topmark.registry.registry.Registry.
Most integrations should prefer the facade rather than interacting directly with advanced registry mutation APIs.
Examples:
from topmark.registry.registry import Registry

# List the qualified key of every effective file type.
for ft in Registry.filetypes().values():
    print(ft.qualified_key)

from topmark.registry.registry import Registry

# List every effective file-type-to-processor binding.
for binding in Registry.bindings():
    print(binding.file_type_key, binding.processor_key)
Public facade vs advanced registries¶
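The stable public-facing runtime registry entry point is topmark.registry.registry.Registry.
It exposes read-only effective views and is suitable for introspection.
The advanced registries are:
- topmark.registry.filetypes.FileTypeRegistry
- topmark.registry.processors.HeaderProcessorRegistry
- topmark.registry.bindings.BindingRegistry
These registries provide overlay mutation helpers such as registration, unregistration, binding, and unbinding. They are intended for: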
- tests;
- plugins;
- advanced integrations.
Overlay mutation helpers affect overlay state only. They do not mutate immutable built-in or plugin-discovered base registry entries.
Qualified vs local identifiers¶
TopMark accepts file type identifiers in either:
- local form (python);
- qualified form (topmark:python).
Identifiers normalize to canonical qualified keys.
Local identifiers are accepted only when unambiguous.
If multiple registered file types share the same local identifier, callers must use the qualified form.
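Example:
If two namespaces each register a file type with the local key python, the local identifier python is ambiguous.
Use a qualified identifier such as topmark:python instead.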
Advanced registry-facing APIs normalize and resolve identifiers through
FileTypeRegistry.resolve_filetype_id(...),
which returns the matching FileType instance from the
effective composed runtime registry.
flowchart LR
INPUT["Public identifier<br/>python or topmark:python"]
RESOLVE["FileTypeRegistry.resolve_filetype_id(...)"]
FT["FileType<br/>qualified_key = topmark:python"]
RUNTIME["Resolver, filters,<br/>policy lookup, bindings"]
INPUT --> RESOLVE --> FT --> RUNTIME
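The normalization rule can be illustrated with a small stand-alone sketch. It is not the real `FileTypeRegistry.resolve_filetype_id(...)`, which returns a FileType instance; here `resolve_identifier` and `AmbiguousIdentifierError` are hypothetical names, the function returns the qualified key as a string, and the `acme` namespace is invented for the example.

```python
from __future__ import annotations


class AmbiguousIdentifierError(LookupError):
    """Hypothetical error for a local key shared by several namespaces."""


def resolve_identifier(identifier: str, qualified_keys: set[str]) -> str:
    """Normalize a local or qualified identifier to a qualified key."""
    if ":" in identifier:
        # Qualified form: must match a registered key exactly.
        if identifier not in qualified_keys:
            raise KeyError(identifier)
        return identifier
    # Local form: collect every registered key whose local part matches.
    matches = sorted(k for k in qualified_keys if k.split(":", 1)[1] == identifier)
    if not matches:
        raise KeyError(identifier)
    if len(matches) > 1:
        raise AmbiguousIdentifierError(
            f"{identifier!r} matches {matches}; use a qualified identifier"
        )
    return matches[0]


# Hypothetical registered keys; two namespaces share the local key "python".
keys = {"topmark:python", "topmark:toml", "acme:python"}

print(resolve_identifier("toml", keys))         # topmark:toml
print(resolve_identifier("acme:python", keys))  # acme:python
try:
    resolve_identifier("python", keys)
except AmbiguousIdentifierError as exc:
    print(exc)
```

Note that the sketch never guesses a namespace: an ambiguous local identifier is an error rather than a silent fallback.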
Recognized vs supported file types¶
A file type is recognized if its file type identifier exists in
FileTypeRegistry.
A file type is supported if it is recognized and has an effective binding through
BindingRegistry to a registered processor
definition in HeaderProcessorRegistry.
A file type may be recognized but still unbound. In that case:
- it participates in discovery and filtering;
- it may appear in results depending on the selected report scope;
- no header insertion or removal is attempted.
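The distinction between recognized and supported can be expressed as two predicates over the three registries. The sketch below uses plain sets and a dict as stand-ins; every key shown (including the processor keys) is hypothetical and chosen only for illustration.

```python
from __future__ import annotations

# Stand-ins for the three effective registry views (hypothetical contents).
file_types = {"topmark:python", "topmark:markdown", "acme:binary_blob"}
processors = {"topmark:pound", "topmark:html"}
bindings = {  # file type key -> processor key
    "topmark:python": "topmark:pound",
    "topmark:markdown": "topmark:html",
}


def is_recognized(key: str) -> bool:
    """A file type is recognized when its identity is registered."""
    return key in file_types


def is_supported(key: str) -> bool:
    """A file type is supported when it is bound to a registered processor."""
    return is_recognized(key) and bindings.get(key) in processors


for key in sorted(file_types):
    print(key, is_recognized(key), is_supported(key))
```

In this sketch `acme:binary_blob` is recognized but unsupported because no binding exists for it, matching the unbound case described above.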
Resolver integration¶
The resolver and probe system operate on canonical qualified file type identities.
This affects:
- include/exclude file-type filters;
- policy lookup;
- runtime bindings;
- probe diagnostics;
- CLI filtering;
- API overlays.
Resolver and probe APIs:
Plugin integration¶
File type plugins are discovered through the topmark.filetypes entry point
group.
Plugin-defined file types participate in the same composed runtime registry and identifier semantics as built-in file types.
Plugin authors should:
- use a stable namespace such as acme or my_plugin;
- choose clear local keys such as django_html or my_lang;
- document and use qualified identifiers such as acme:django_html in shared examples;
- avoid relying on local identifiers remaining unambiguous as ecosystems grow.
Header processor plugins currently use advanced runtime-overlay integration semantics. They should bind processor definitions to canonical qualified file type identifiers.
For a plugin-focused guide, see Plugins and extensibility.
Registry composition¶
The effective runtime registry is always derived from immutable base registry data plus overlay state.
Overlay mutations never mutate built-in or plugin-discovered base entries directly. Instead, TopMark recomposes effective runtime registry views from base registry data and overlay state.
This composition-oriented architecture keeps runtime behavior deterministic while still supporting tests, plugins, runtime extensions, and advanced integrations.
Runtime overlays¶
Advanced integrations may register runtime overlay mutations. Overlay mutations invalidate composed effective-view caches as described in Caching and invalidation.
Examples include:
- plugins;
- tests;
- temporary runtime bindings;
- integration-specific file types.
Overlay mutations affect only overlay state layered on top of immutable base registry data.
Overlay operations may:
- register or unregister file types;
- register or unregister processors;
- bind or unbind processors to file types.
Overlay mutations are:
- process-local;
- overlay-only;
- thread-safe;
- cache-invalidating.
They do not mutate built-in or plugin-discovered base registry entries.
Overlay state exists specifically to support:
- isolated tests;
- temporary runtime extensions;
- advanced integration scenarios;
- plugin composition without mutating built-ins.
After overlay mutation, the next effective registry read recomposes the effective runtime view from base registry data and overlay state.
Most integrations should prefer the stable Registry facade
and avoid direct overlay mutation unless runtime extension behavior is explicitly required.
Caching and invalidation¶
Base registries are cached because construction and plugin discovery should happen once per process.
Composed effective views are also cached for fast repeated access.
Any overlay mutation invalidates the composed effective-view cache. The next call to an effective
view, such as as_mapping() or the Registry facade,
recomposes the view from base registry data and overlay state.
Practical consequences:
- overlay mutations remain lightweight;
- registry reads remain fast;
- tests that mutate overlays must clean them up;
- callers do not need to manage composed cache invalidation manually.
sequenceDiagram
autonumber
participant Caller
participant FTR as FileTypeRegistry
participant HPR as HeaderProcessorRegistry
participant BR as BindingRegistry
Caller->>FTR: register()/unregister()
FTR->>FTR: update overlays
FTR->>FTR: invalidate composed cache
Caller->>HPR: register()/unregister()
HPR->>HPR: update overlays
HPR->>HPR: invalidate composed cache
Caller->>BR: bind()/unbind()
BR->>BR: update overlays
BR->>BR: invalidate composed cache
Note over Caller,BR: Later...
Caller->>FTR: as_mapping()
FTR->>FTR: compose effective view
FTR-->>Caller: cached mapping
Caller->>HPR: as_mapping()
HPR->>HPR: compose effective view
HPR-->>Caller: cached mapping
Caller->>BR: as_mapping()
BR->>BR: compose effective view
BR-->>Caller: cached mapping
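The invalidate-then-recompose cycle in the diagram can be sketched with a small class. This is an illustrative model only: `OverlayRegistry`, its attributes, and the `compose_count` counter are hypothetical, and the use of a re-entrant lock is one possible way to get the thread safety described earlier, not a statement about how TopMark does it.

```python
from __future__ import annotations

import threading
from types import MappingProxyType
from typing import Mapping


class OverlayRegistry:
    """Immutable base data, mutable overlay, lazily cached effective view."""

    def __init__(self, base: Mapping[str, str]) -> None:
        self._base = MappingProxyType(dict(base))
        self._added: dict[str, str] = {}
        self._removed: set[str] = set()
        self._cache: Mapping[str, str] | None = None
        self._lock = threading.RLock()
        self.compose_count = 0  # for demonstration only

    def register(self, key: str, value: str) -> None:
        with self._lock:
            self._removed.discard(key)
            self._added[key] = value
            self._cache = None  # any overlay mutation invalidates the cache

    def unregister(self, key: str) -> None:
        with self._lock:
            self._added.pop(key, None)
            self._removed.add(key)
            self._cache = None

    def as_mapping(self) -> Mapping[str, str]:
        with self._lock:
            if self._cache is None:  # recompose only after an invalidation
                merged = {**self._base, **self._added}
                for key in self._removed:
                    merged.pop(key, None)
                self._cache = MappingProxyType(merged)
                self.compose_count += 1
            return self._cache


reg = OverlayRegistry({"topmark:python": "Python"})
reg.as_mapping()
reg.as_mapping()
print(reg.compose_count)  # 1

reg.register("acme:django_html", "Django HTML")
print(sorted(reg.as_mapping()))  # ['acme:django_html', 'topmark:python']
print(reg.compose_count)  # 2
```

Mutations only clear the cache, so they stay cheap; the cost of composition is paid once by the next read.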
Runtime extension example¶
from topmark.registry.bindings import BindingRegistry
from topmark.registry.filetypes import FileTypeRegistry
from topmark.registry.processors import HeaderProcessorRegistry

# `ft` (a FileType instance) and `MyProcessor` (a header processor class)
# are assumed to be defined elsewhere.

# Register file type identity.
FileTypeRegistry.register(ft)

# Register processor identity.
proc_def = HeaderProcessorRegistry.register(
    processor_class=MyProcessor,
)

# Bind file type to processor.
BindingRegistry.bind(
    file_type_key=ft.qualified_key,
    processor_key=proc_def.qualified_key,
)
Cleanup should reverse the same steps explicitly:
BindingRegistry.unbind(ft.qualified_key)
HeaderProcessorRegistry.unregister(proc_def.qualified_key)
FileTypeRegistry.unregister(ft.qualified_key)
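In tests, one way to make the reverse-order cleanup hard to forget is `contextlib.ExitStack`, which runs registered callbacks in last-in, first-out order even when the body raises. The sketch below uses hypothetical `register` and `unregister` stand-ins that only record what they were asked to do; in real code they would be replaced by the registry calls shown above.

```python
from __future__ import annotations

from contextlib import ExitStack

log: list[str] = []


def register(kind: str, key: str) -> None:
    """Stand-in for a registry mutation; records the call."""
    log.append(f"register {kind} {key}")


def unregister(kind: str, key: str) -> None:
    """Stand-in for the matching cleanup call; records the call."""
    log.append(f"unregister {kind} {key}")


key = "acme:django_html"  # hypothetical qualified key

with ExitStack() as stack:
    for kind in ("filetype", "processor", "binding"):
        register(kind, key)
        # Schedule the matching cleanup immediately after each step.
        stack.callback(unregister, kind, key)
    log.append("run test body")

print("\n".join(log))
```

The cleanup entries appear in the order binding, processor, filetype, which mirrors the explicit unbind and unregister sequence.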
When registering processors against file type identities, prefer qualified file type identifiers
such as topmark:python or my_plugin:django_html once multiple namespaces are in play. Local
identifiers remain supported when unambiguous, but may become ambiguous as extensions are added.
For long-term or redistributable extensions, prefer publishing a plugin using the
topmark.filetypes entry point group.
Registry CLI commands¶
TopMark provides registry inspection commands.
These commands expose the effective composed runtime registry view.
See the CLI help output for available subcommands and output options.
Why not per-run registries?¶
Registries are intentionally process-global rather than threaded through every runtime layer as per-run registry objects.
Reasons include:
- registry contents affect discovery, resolution, bindings, and pipeline execution;
- threading registry objects through every API would significantly complicate the runtime model;
- most users do not need per-run registry customization;
- overlay mutation already provides explicit runtime-extension behavior when required.
Configuration controls which file types participate in a run.
Registries control which file types, processors, and bindings exist in the effective runtime environment.
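The split between "what exists" and "what is selected" can be shown with a short sketch. The function name `select_file_types` and the include/exclude semantics used here (an empty include list means "everything", excludes are applied last) are assumptions for illustration and are not taken from TopMark's configuration reference.

```python
from __future__ import annotations

from typing import Iterable


def select_file_types(
    existing: Iterable[str],
    include: Iterable[str] = (),
    exclude: Iterable[str] = (),
) -> set[str]:
    """Pick the file types taking part in a run from those that exist."""
    existing_set = set(existing)
    include_set = set(include)
    # Assumption: an empty include list selects every existing file type.
    selected = existing_set & include_set if include_set else existing_set
    # Assumption: excludes are applied after includes.
    return selected - set(exclude)


# The registry decides what exists (hypothetical keys).
existing = {"topmark:python", "topmark:toml", "acme:django_html"}

# Configuration only narrows that set for one run.
print(sorted(select_file_types(existing, exclude=["topmark:toml"])))
print(sorted(select_file_types(existing, include=["topmark:python", "acme:missing"])))
```

Selecting an identifier that does not exist in the registry has no effect in this sketch, because configuration cannot create file types.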
Non-goals¶
The registry model is not designed to provide:
- transactional registry mutation in production code;
- fuzzy matching for file type identifiers;
- implicit namespace fallback or fuzzy namespace resolution;
- silent mutation of built-in or plugin-provided base entries;
- per-run registry objects passed through every runtime layer.
Configuration controls which file types are selected for a run. Registries control what file types, processors, and bindings exist in the effective runtime environment.
Stability model¶
The stable public API surface is defined by:
- topmark.api
- the CLI contract
- documented DTOs and result views
Registry internals are documented for maintainers and advanced integrators, but registry overlay
mutation behavior intentionally remains more flexible than the stable topmark.api
execution API.
Most integrations should prefer:
- topmark.api
- Registry
- probe APIs
rather than mutating advanced registries directly.