Documentation pipeline and reference hygiene
This page documents how TopMark's stable documentation is generated, validated, and kept consistent
using the tooling under tools/docs/.
Note
The canonical vocabulary used throughout the documentation is defined in Terminology and Canonical Vocabulary.
This page is intended for contributors and maintainers working on:
- API documentation
- Internal architecture docs
- Docstring quality and reference hygiene
- MkDocs and mkdocs-gen-files integration
This page focuses on the documentation-generation and validation pipeline itself rather than general documentation authoring conventions. Detailed writing conventions, workflow-page structure, heading policy, and snippet usage rules are documented in Documentation Conventions.
Scope of this document
This page documents:
- generated documentation architecture;
- MkDocs integration and generation hooks;
- API and docstring scanning behavior;
- documentation hygiene validation;
- snippet and draft handling;
- strict-mode and validation behavior;
- reference-hygiene enforcement.
It intentionally does not redefine authoring conventions already covered by:
Overview
This pipeline supports both:
- the stable public API documentation (topmark.api)
- internal module documentation (topmark.*)
and is aligned with TopMark's layered runtime and configuration architecture:
- TOML → FrozenConfig → runtime → pipeline
See Architecture for the conceptual overview.
TopMark's documentation build consists of three coordinated layers:
- Handwritten Markdown
  - Located under docs/
  - Includes DEV documentation, guides, and architecture notes
- Generated Markdown
  - Produced at build time by mkdocs-gen-files
  - Includes API internals, public API reference pages, and CLI reference output
- Build-time validation and hygiene
  - Enforced via MkDocs hooks, custom tooling, and shared helpers
  - Ensures symbol references, snippet includes, and generated pages remain consistent, deterministic, and maintainable
All tooling lives under tools/docs/ and is executed only during documentation builds.
Documentation validation is also integrated into local contributor workflows, CI verification, and
stable-release validation through make verify, nox, and GitHub Actions.
Relationship to CI and validation tooling
Documentation validation is intentionally layered and deterministic:
- MkDocs performs rendering-time validation;
- tools/docs/ performs deterministic repository hygiene and prose-hygiene checks;
- make verify and nox integrate documentation validation into contributor workflows;
- GitHub Actions enforce documentation validation in CI.
See also:
Generated documentation
Internals pages
Generated by gen_api_pages.py under:
Characteristics:
- One page per importable module under src/topmark/
- Breadcrumb navigation reflecting the package hierarchy
- Per-package index.md pages listing immediate children
- A grouped index under api/internals/topmark/index.md
Exact public API surfaces (defined in PUBLIC_API_PREFIXES) are not generated as internals pages,
because generating both a public reference page and an internals page for the same module would
create duplicate mkdocs-autorefs anchors.
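The exclusion rule can be illustrated with a short sketch that maps source files to internals page paths and skips exact public API modules. This is not the code in gen_api_pages.py: the function names, the example value of PUBLIC_API_PREFIXES, and the exact output layout are assumptions made for illustration.

```python
from pathlib import Path, PurePosixPath

# Hypothetical value; the real PUBLIC_API_PREFIXES is defined by the docs tooling.
PUBLIC_API_PREFIXES = ("topmark.api",)


def module_name(src_root: Path, py_file: Path) -> str:
    """Convert <src_root>/topmark/pkg/mod.py into the dotted name topmark.pkg.mod."""
    parts = list(py_file.relative_to(src_root).with_suffix("").parts)
    if parts[-1] == "__init__":
        parts.pop()  # a package is named after its directory
    return ".".join(parts)


def internals_page(name: str, is_package: bool) -> PurePosixPath:
    """Packages get an index.md page; plain modules get <module>.md."""
    base = PurePosixPath("api/internals", *name.split("."))
    return base / "index.md" if is_package else base.with_suffix(".md")


def plan_internals_pages(src_root: Path) -> dict[str, PurePosixPath]:
    """Return {module name: page path}, skipping exact public API modules."""
    pages: dict[str, PurePosixPath] = {}
    for py_file in sorted(src_root.rglob("*.py")):  # sorted for determinism
        name = module_name(src_root, py_file)
        if name in PUBLIC_API_PREFIXES:  # exact match only, so no duplicate anchors
            continue
        pages[name] = internals_page(name, py_file.name == "__init__.py")
    return pages
```

Skipping by exact name means a public module is documented in one place only, while its non-public siblings still receive internals pages.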
Public reference pages
These pages correspond to the stable public API surface defined by topmark.api.__all__ and are
covered by the API snapshot stability contract.
Generated under:
For modules listed in:
These pages represent stable supported public API surfaces.
Public API stability expectations and snapshot validation are documented in:
CLI reference pages
Generated from live TopMark output:
via:
Generated CLI reference pages are therefore treated as derived release artifacts rather than handwritten documentation.
Relationship to documentation conventions
The documentation pipeline enforces generated-page consistency and validation behavior, while stable writing and structure conventions are documented separately.
Authoring conventions include:
- heading structure;
- snippet conventions;
- workflow-page templates;
- related-pages conventions;
- heading-style policy;
- Markdown organization rules.
See:
Docstring scanning and reference hygiene
Both handwritten Markdown and Python module docstrings are scanned for unlinked backticked symbol references, such as:
Docstring scanning is performed on raw Python source files before mkdocstrings renders them into
Markdown, which ensures reported line numbers always refer to the original src/... files.
The shared enforcement logic lives in:
and is used identically by:
- hooks.py (Markdown scanning)
- gen_api_pages.py (docstring scanning)
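To show why scanning raw source preserves line numbers, here is a minimal sketch that walks a module with the standard ast module and reports unlinked backticked topmark.* references at their original source lines. The regular expression and the function name are assumptions, not the shared enforcement logic; filename and whitelist filtering are omitted.

```python
import ast
import re

# Backticked topmark.* path that is not wrapped in [...] link syntax.
UNLINKED = re.compile(r"(?<!\[)`(topmark(?:\.\w+)+)`(?!\])")

DOCSTRING_OWNERS = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)


def scan_docstrings(source: str, filename: str = "<source>") -> list[tuple[str, int, str]]:
    """Return (filename, line number, symbol) for each unlinked reference."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, DOCSTRING_OWNERS):
            continue
        body = node.body
        if not body or not isinstance(body[0], ast.Expr):
            continue
        value = body[0].value
        if not (isinstance(value, ast.Constant) and isinstance(value.value, str)):
            continue
        # value.lineno is the source line on which the docstring literal starts,
        # so adding the line offset points back into the original file.
        # (Escape sequences that produce newlines would skew this simple mapping.)
        for offset, line in enumerate(value.value.splitlines()):
            for match in UNLINKED.finditer(line):
                findings.append((filename, value.lineno + offset, match.group(1)))
    return sorted(findings)
```

Because the scan never touches rendered Markdown, a reported location can be opened directly in an editor.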
Why this matters
This helps ensure that documentation remains navigable and that symbol references stay valid even as internal modules evolve.
- mkdocs-autorefs can only resolve symbols that are properly linked;
- backticked-but-unlinked symbols silently break cross-references;
- docstrings are rendered into the generated documentation and must follow the same hygiene rules as handwritten Markdown.
What is considered a symbol
A candidate is enforced when it:
- looks like a dotted Python path;
- starts with topmark.;
- is not a filename (.toml, .yaml, ...);
- is not explicitly whitelisted.
This logic lives in:
Whitelisting non-linkable symbols
Some backticked identifiers are intentional and should not be linked.
These are allowed through an explicit exact-match whitelist:
Rules:
- Exact matches only (no prefixes)
- Applies to Markdown and docstrings
- Logged in debug mode for transparency
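The symbol criteria and the exact-match whitelist can be combined into one predicate. The sketch below is an illustration under stated assumptions: the names, the whitelist entry, and the suffix list (only the two suffixes named on this page) are hypothetical and may not match the real helper.

```python
import re

# Dotted Python path: identifiers separated by dots, with at least one dot.
DOTTED_PATH = re.compile(r"^[A-Za-z_]\w*(?:\.[A-Za-z_]\w*)+$")

# Only the two suffixes named in the documentation; the real list is longer.
FILENAME_SUFFIXES = (".toml", ".yaml")

# Hypothetical whitelist entry; entries match exactly, never as prefixes.
WHITELIST = frozenset({"topmark.example"})


def is_enforced_symbol(candidate: str) -> bool:
    """Return True when a backticked candidate must be linked."""
    if not candidate.startswith("topmark."):
        return False
    if candidate.endswith(FILENAME_SUFFIXES):
        return False  # looks like a filename, for example topmark.toml
    if not DOTTED_PATH.match(candidate):
        return False
    return candidate not in WHITELIST  # exact match only
```

With this shape, whitelisting topmark.example does not exempt topmark.example.sub, which mirrors the "no prefixes" rule.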
Logging, debug, and strict modes
Two environment variables control documentation-validation behavior:
TOPMARK_DOCS_DEBUG
When enabled:
- Emits detailed DEBUG/INFO logs
- Shows:
  - Rendered-on context
  - Edit URLs (for Markdown)
  - Alternate inline-link suggestions (Alt:)
  - Full symbol lists (no truncation)
TOPMARK_DOCS_STRICT_REFS
When enabled:
- the build fails if any unlinked symbols are found;
- failures are aggregated and reported after processing all pages;
- mkdocs.exceptions.Abort is used for clean termination.
Severity behavior remains intentionally consistent between:
- hooks.py (Markdown scanning)
- gen_api_pages.py (docstring scanning)
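The fail-late, report-all behavior of strict mode can be sketched as follows. How the tooling actually parses the environment variables is not documented here, so the accepted truthy values, the logger name, and the function names are assumptions; the fallback class exists only so the sketch runs without MkDocs installed.

```python
import logging
import os

try:
    from mkdocs.exceptions import Abort
except ImportError:  # fallback so the sketch runs without MkDocs installed

    class Abort(SystemExit):
        """Stand-in for mkdocs.exceptions.Abort."""


logger = logging.getLogger("topmark.docs")  # hypothetical logger name


def env_flag(name: str) -> bool:
    """Assumed convention: 1/true/yes/on (case-insensitive) enables a flag."""
    return os.environ.get(name, "").strip().lower() in {"1", "true", "yes", "on"}


def finalize(findings: dict[str, list[str]]) -> None:
    """Report every page's findings first, then fail once if strict mode is on."""
    if env_flag("TOPMARK_DOCS_DEBUG"):
        logger.setLevel(logging.DEBUG)
    for page, symbols in sorted(findings.items()):
        logger.warning("%s: %d unlinked symbol(s): %s", page, len(symbols), ", ".join(symbols))
    total = sum(len(symbols) for symbols in findings.values())
    if total and env_flag("TOPMARK_DOCS_STRICT_REFS"):
        raise Abort(f"{total} unlinked symbol reference(s) found")
```

Raising only after the loop means a single build surfaces every offending page instead of stopping at the first one.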
Contextual logging
All diagnostics aim to be actionable.
Depending on origin, logs include:
- Local repo paths (docs/... or src/...)
- Line numbers
- Rendered-on pages
- Edit URLs (when available)
Context lines are built centrally via:
Drafts and snippets
Draft files
Files under:
are:
- Ignored by MkDocs navigation
- Ignored by version control
- Safe for work-in-progress documentation
- Optionally visible when serving documentation locally (marked as draft)
Markdown snippets
Files under:
are:
- intended for inclusion via plugins such as include-markdown;
- not standalone pages;
- explicitly excluded via exclude_docs in mkdocs.yml;
- intended only for stable reusable documentation fragments.
Markdown documentation hygiene is validated through:
which runs:
Python code-prose hygiene is validated separately through:
The Markdown hygiene validation performs repository-hygiene checks for:
- broken include paths;
- malformed docs-root-relative include paths;
- include targets resolving outside docs/;
- nested snippet includes;
- accidental macOS ._* files under documentation sources;
- Markdown files under docs/ missing from mkdocs.yml navigation;
- emoji in Markdown headings;
- missing section separators between level-2 headings.
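Three of the checks listed above (broken include paths, include targets outside docs/, and stray macOS ._* files) can be sketched in a few lines. This is an illustrative approximation, not the repository's checker: the function name and message wording are hypothetical, and the sketch assumes include paths are written relative to the docs root.

```python
import re
from pathlib import Path

# Matches directives such as: {% include-markdown "_snippets/intro.md" %}
INCLUDE = re.compile(r'\{%\s*include-markdown\s+"([^"]+)"')


def check_docs_tree(docs_root: Path) -> list[str]:
    """Return human-readable problems found under docs_root."""
    problems = []
    root = docs_root.resolve()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root).as_posix()
        if path.name.startswith("._"):
            # Binary metadata file; report it and do not try to read it.
            problems.append(f"{rel}: macOS metadata file")
            continue
        if path.suffix != ".md":
            continue
        text = path.read_text(encoding="utf-8")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for match in INCLUDE.finditer(line):
                # Assumption: include paths are relative to the docs root.
                target = (root / match.group(1)).resolve()
                if not target.is_relative_to(root):
                    problems.append(f"{rel}:{lineno}: include escapes docs/")
                elif not target.is_file():
                    problems.append(f"{rel}:{lineno}: broken include path")
    return problems
```

Collecting messages in a list rather than raising immediately keeps the check consistent with the report-all approach used elsewhere in the pipeline.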
The checker also reports maintainability warnings for:
- orphaned snippets;
- headings inside snippets;
- smart punctuation in Markdown prose;
- relative links inside reusable snippets unless include-markdown link rewriting is intentional;
- snippet include paths that do not use the formatter-stable _snippets/ prefix.
Shared navigation snippets such as related-pages*.md are intentionally allowed to contain relative
links because they centralize reusable documentation navigation behavior.
check_code_hygiene.py complements the Markdown-focused checks by scanning Python comments,
docstrings, and prose-oriented string literals under src/topmark/, tests/, and tools/. It
currently enforces ASCII-oriented punctuation hygiene for terminal-safe, deterministic, and
copy/paste-friendly generated documentation and CLI output.
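As a rough idea of what an ASCII-oriented punctuation check might look like, the sketch below uses the standard tokenize module to inspect comments and string literals. The character table and function name are assumptions and are not taken from check_code_hygiene.py.

```python
import io
import tokenize

# Assumed set of "smart" characters with suggested ASCII replacements.
SMART_PUNCTUATION = {
    "\u2018": "'", "\u2019": "'",  # curly single quotes
    "\u201c": '"', "\u201d": '"',  # curly double quotes
    "\u2013": "-", "\u2014": "-",  # en and em dashes
    "\u2026": "...",               # horizontal ellipsis
}


def find_smart_punctuation(source: str) -> list[tuple[int, int, str, str]]:
    """Return (line, column, character, suggestion) for comments and strings."""
    findings = []
    # Note: on Python 3.12+ f-strings use separate token types and are not covered.
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type not in (tokenize.COMMENT, tokenize.STRING):
            continue
        line, col = tok.start
        for ch in tok.string:
            if ch in SMART_PUNCTUATION:
                findings.append((line, col, ch, SMART_PUNCTUATION[ch]))
            if ch == "\n":  # multi-line strings: move to the next source line
                line, col = line + 1, 0
            else:
                col += 1
    return findings
```

Restricting the scan to comment and string tokens leaves identifiers and operators untouched, so only prose-like content is checked.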
These checks intentionally remain lightweight and repository-focused. They reinforce repository-wide documentation consistency without turning every style preference into a hard release blocker.
Design principles
The documentation tooling follows a few strict principles:
- Deterministic - no hidden state and no reliance on import order.
- Fail-late, report-all - especially in strict mode.
- Shared logic, single source of truth - no duplicated include semantics, prose-hygiene rules, or validation heuristics.
- Documentation is code - docstrings, Markdown, and generated reference material are held to the same standard.
Summary
- documentation is generated, validated, and enforced as part of the build;
- tools/docs/ is the authoritative location for documentation tooling;
- reference hygiene, Markdown hygiene, and Python prose hygiene are enforced consistently across documentation sources, comments, and docstrings;
- debug and strict modes provide both flexibility and CI-grade guarantees.
If you change how TopMark is structured, update the documentation pipeline accordingly - it is a stable and intentionally maintained part of the project architecture.