# CI/CD Pipeline: Documentation Validation Setup Guide
Keeping your documentation up to date, valid, and consistent is essential. A CI/CD (Continuous Integration/Continuous Delivery) pipeline for documentation validation automates this process, preventing broken links, invalid specifications, syntax errors, and spelling mistakes from making their way into your main branch. Think of it as a quality gatekeeper for your docs!
## Why is a Documentation Validation Pipeline Critical?
A documentation validation pipeline acts as the enforcement mechanism that maintains documentation quality. It's critical for a few key reasons:
- Preventing Documentation Drift: Over time, documentation can become outdated if not actively maintained. By automating validation, you ensure that your documentation stays aligned with your code.
- Maintaining Quality: Automated checks catch errors and inconsistencies, ensuring that your documentation is reliable and accurate. This helps avoid confusion and frustration for your users and developers alike.
- Enforcing Documentation Updates with Code Changes: By linking documentation validation to your CI/CD process, you make sure that any code changes are accompanied by corresponding updates to the documentation. This keeps everything in sync and reduces the risk of outdated or misleading information.
In short, integrating documentation validation into CI/CD catches issues early, reduces the manual review burden, and ensures documentation is updated alongside code changes, so the team can focus on creating value instead of fixing documentation errors after the fact.
## Acceptance Criteria

Before we dive into the technical details, let's define what success looks like. Here are the key criteria we need to meet:

- [ ] A `.github/workflows/docs-validation.yml` file is created and functional.
- [ ] OpenAPI specifications are validated using Spectral (zero errors).
- [ ] Broken links are detected using markdown-link-check (zero broken links).
- [ ] Mermaid diagram syntax is validated (all diagrams must render correctly).
- [ ] Spell check is performed using cspell (zero spelling errors in technical documentation).
- [ ] A check ensures that documentation accompanies code changes (if `src/` changes without corresponding `docs/` changes, the pipeline fails).
- [ ] The CI workflow runs on all pull requests.
- [ ] The pipeline fails fast (stops on the first error for quick feedback).
- [ ] All checks pass on the current documentation (validate before merging).
## Technical Details: Setting Up the Workflow

Okay, let's get technical! We'll be using GitHub Actions to create our CI/CD pipeline. Here's a breakdown of the steps involved.

### GitHub Actions Workflow

File: `.github/workflows/docs-validation.yml`
```yaml
name: Documentation Validation

on:
  pull_request:
    paths:
      - 'docs/**'
      - 'src/**'
      - '.github/workflows/docs-validation.yml'

jobs:
  validate-docs:
    name: Validate Documentation
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          fetch-depth: 0  # full history so Check 5 can diff against origin/main

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'

      - name: Install validation tools
        run: |
          npm install -g @stoplight/spectral-cli
          npm install -g markdown-link-check
          npm install -g @mermaid-js/mermaid-cli
          npm install -g cspell

      # Check 1: Validate OpenAPI Spec
      - name: Validate OpenAPI Specification
        if: always() # Run even if previous step fails
        run: |
          if [ -f docs/specifications/api/openapi.yaml ]; then
            spectral lint docs/specifications/api/openapi.yaml --fail-severity error
          else
            echo "OpenAPI spec not found, skipping validation"
          fi

      # Check 2: Detect Broken Links
      - name: Check for Broken Links
        if: always()
        run: |
          # xargs (unlike `find -exec ... \;`) exits non-zero if any file fails
          find docs -name "*.md" -print0 | xargs -0 -n1 markdown-link-check --config .markdown-link-check.json

      # Check 3: Validate Mermaid Diagrams
      - name: Validate Mermaid Diagrams
        if: always()
        run: |
          # Extract Mermaid blocks and validate syntax
          find docs -name "*.md" -print0 | while IFS= read -r -d '' file; do
            echo "Checking Mermaid diagrams in: $file"
            # Simple validation: check for ```mermaid blocks
            if grep -q '```mermaid' "$file"; then
              # Mermaid CLI validation would go here
              # For now, just report that diagrams were found
              echo "Mermaid diagrams found in $file"
            fi
          done

      # Check 4: Spell Check
      - name: Spell Check Documentation
        if: always()
        run: |
          cspell "docs/**/*.md" --config .cspell.json

      # Check 5: Require Docs for Code Changes
      - name: Require Docs Updates for Code Changes
        if: always()
        run: |
          # Get list of changed files
          git fetch origin main
          CHANGED_SRC=$(git diff --name-only origin/main HEAD | grep '^src/' || true)
          CHANGED_DOCS=$(git diff --name-only origin/main HEAD | grep '^docs/' || true)
          if [ -n "$CHANGED_SRC" ] && [ -z "$CHANGED_DOCS" ]; then
            echo "❌ Code changes detected but no documentation updates"
            echo "Please update relevant documentation in docs/"
            echo ""
            echo "Changed source files:"
            echo "$CHANGED_SRC"
            exit 1
          else
            echo "✅ Documentation check passed"
          fi

      # Check 6: Validate Markdown Format
      - name: Lint Markdown Files
        if: always()
        uses: DavidAnson/markdownlint-cli2-action@v14
        with:
          globs: 'docs/**/*.md'

      # Summary
      - name: Validation Summary
        if: always()
        run: |
          echo "📋 Documentation Validation Summary"
          echo "✅ OpenAPI spec validated"
          echo "✅ Links checked"
          echo "✅ Mermaid diagrams validated"
          echo "✅ Spelling checked"
          echo "✅ Code-docs consistency verified"
          echo "✅ Markdown linted"
```
This YAML file defines the workflow that will run on every pull request. Let's break down what's happening:

- `name`: A human-readable name for the workflow.
- `on`: Specifies when the workflow should run (in this case, on pull requests that affect the `docs/` or `src/` directories, or the workflow file itself).
- `jobs`: Defines the tasks to be executed.
- `validate-docs`: The main job for validating documentation.
- `runs-on`: Specifies the operating system to run the job on (Ubuntu in this case).
- `steps`: A sequence of actions to perform.
- `actions/checkout@v4`: Checks out the code from the repository.
- `actions/setup-node@v4`: Sets up Node.js with the specified version and cache.
- `Install validation tools`: Installs the necessary tools globally using `npm`.
- `Validate OpenAPI Specification`: Validates the OpenAPI specification using Spectral.
- `Check for Broken Links`: Detects broken links in Markdown files.
- `Validate Mermaid Diagrams`: Validates the syntax of Mermaid diagrams.
- `Spell Check Documentation`: Performs a spell check on the documentation.
- `Require Docs Updates for Code Changes`: Checks if documentation updates are needed for code changes.
- `Lint Markdown Files`: Lints Markdown files to ensure consistent formatting.
- `Validation Summary`: Prints a summary of the validation results.
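Note that the Mermaid step in the workflow is currently a placeholder: it only reports that diagrams exist. One way to go further is to extract each ```` ```mermaid ```` block into its own file so it can be fed to `mmdc`. Here's a minimal sketch of the extraction; the `extract_mermaid` helper name and the paths are hypothetical:

```shell
# Extract every ```mermaid block from a Markdown file into numbered .mmd
# files under an output directory, ready for the Mermaid CLI.
extract_mermaid() {
  # $1 = markdown file, $2 = output directory
  awk -v outdir="$2" '
    /^```mermaid[[:space:]]*$/ { inblock = 1; n++; out = outdir "/block-" n ".mmd"; next }
    /^```[[:space:]]*$/ && inblock { inblock = 0; close(out); next }
    inblock { print > out }
  ' "$1"
}
```

With the blocks extracted, each one could then be run through `mmdc --input block-1.mmd --output /dev/null`, failing the CI step on a non-zero exit; treat that invocation as the assumed Mermaid CLI interface.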
## Configuration Files

To make our validation tools work effectively, we need to configure them. Here are the configuration files we'll be using.

### 1. Markdown Link Check Config (`.markdown-link-check.json`)
```json
{
  "ignorePatterns": [
    { "pattern": "^http://localhost" },
    { "pattern": "^https://example.com" }
  ],
  "httpHeaders": [
    {
      "urls": ["https://github.com"],
      "headers": { "Accept-Encoding": "zlib, deflate, br" }
    }
  ],
  "timeout": "10s",
  "retryOn429": true,
  "retryCount": 3,
  "fallbackRetryDelay": "5s",
  "aliveStatusCodes": [200, 206, 403]
}
```
Rationale:
- Ignore localhost URLs: Used in examples and development environments.
- Retry on 429 (rate limits): Handles rate limiting from external sites.
- Accept 403: Some sites block bots but the links may still be valid.
### 2. Spectral Config (`.spectral.yaml`)
```yaml
extends: spectral:oas
rules:
  # Require operationId for all operations
  operation-operationId: error
  # Require tags for organization
  operation-tags: error
  # Require descriptions
  info-description: error
  operation-description: warn
  # Require examples for schemas
  oas3-schema-examples: warn
  # No unused components
  oas3-unused-component: error
  # Consistent naming
  path-keys-no-trailing-slash: error
```
Rationale:

- Enforce OpenAPI best practices: Ensures the OpenAPI specifications adhere to standards.
- Fail on critical issues: Highlights missing `operationId`s and unused components.
- Warn on nice-to-haves: Suggests improvements like examples and descriptions.
### 3. CSpell Config (`.cspell.json`)
```json
{
  "version": "0.2",
  "language": "en",
  "words": [
    "ARCA",
    "AFIP",
    "factura",
    "comprobante",
    "monotributo",
    "CUIT",
    "Mermaid",
    "PostgreSQL",
    "Redis",
    "OpenAPI",
    "BullMQ",
    "TypeScript",
    "Fastify",
    "Supabase",
    "Cloudflare",
    "webhook",
    "CAE",
    "IVA",
    "UUID",
    "HMAC",
    "JSONB",
    "tsvector"
  ],
  "ignoreWords": [
    "arcaapi",
    "datetime",
    "timestamp"
  ],
  "ignorePaths": [
    "node_modules/**",
    ".git/**",
    "dist/**",
    "build/**"
  ]
}
```
Rationale:
- Whitelist technical terms: Prevents false positives on domain-specific vocabulary.
- Ignore generated code directories: Excludes directories that contain generated code.
### 4. Markdownlint Config (`.markdownlint.json`)
```json
{
  "default": true,
  "MD013": {
    "line_length": 120,
    "code_blocks": false,
    "tables": false
  },
  "MD033": false,
  "MD041": false
}
```
Rationale:
- MD013: Allow 120 char lines: Accommodates longer lines in technical documentation.
- MD033: Allow inline HTML: Needed for Mermaid diagrams in some cases.
- MD041: Don't require H1 as first line: Frontmatter often comes first.
## Implementation Approach

Now that we have the technical details covered, let's outline the steps to implement this pipeline:

- Create the workflow file: `.github/workflows/docs-validation.yml`
- Create configuration files: `.spectral.yaml`, `.cspell.json`, `.markdown-link-check.json`, and `.markdownlint.json`
- Test the workflow locally: Use `act` (a local GitHub Actions runner) for testing.
- Run validation on current docs: Fix any errors found in existing documentation.
- Create a PR to test the workflow: Verify that the workflow runs and passes.
- Document the workflow: Explain in `docs/development/` how to run it locally and how to add words to the spell check.
- Add a status badge: Display CI passing status in the main README.
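For the local-testing step, a small guarded helper can wrap `act` (https://github.com/nektos/act). This is a sketch, not a required part of the pipeline: it assumes Docker is available when `act` is installed, and the job name matches the workflow above; when `act` is missing it just reports the command it would run:

```shell
# Run the docs-validation job locally with act (requires Docker).
run_local_ci() {
  if command -v act >/dev/null 2>&1; then
    # -j selects the job; the pull_request event matches the workflow trigger
    act pull_request -j validate-docs
  else
    echo "act not installed; would run: act pull_request -j validate-docs"
  fi
}
```

Calling `run_local_ci` from the repository root then mirrors what CI will do on the next pull request.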
## Local Testing Commands

To run validation locally before pushing changes, engineers can use the following commands:

```bash
# Validate OpenAPI spec
npx @stoplight/spectral-cli lint docs/specifications/api/openapi.yaml

# Check broken links
npx markdown-link-check docs/**/*.md

# Spell check
npx cspell "docs/**/*.md"

# Markdown lint
npx markdownlint-cli2 "docs/**/*.md"

# Run all checks (package.json script)
pnpm validate:docs
```
Add the following scripts to `package.json` (note that `validate:docs` uses `npm-run-all`, which must be installed as a devDependency):

```json
{
  "scripts": {
    "validate:docs": "npm-run-all validate:openapi validate:links validate:spelling validate:markdown",
    "validate:openapi": "spectral lint docs/specifications/api/openapi.yaml",
    "validate:links": "markdown-link-check docs/**/*.md",
    "validate:spelling": "cspell 'docs/**/*.md'",
    "validate:markdown": "markdownlint-cli2 'docs/**/*.md'"
  }
}
```
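To make the local scripts hard to forget, you could also wire them into a Git pre-push hook. This is an optional sketch: the `install_docs_hook` helper is hypothetical, and it assumes the `validate:docs` script above exists and that you use pnpm (swap in `npm run` otherwise):

```shell
# Install a pre-push hook that runs docs validation before every push.
install_docs_hook() {
  repo_root="$1"
  hook="$repo_root/.git/hooks/pre-push"
  mkdir -p "$(dirname "$hook")"
  cat > "$hook" <<'EOF'
#!/bin/sh
# Abort the push if documentation validation fails.
pnpm validate:docs || {
  echo "Documentation validation failed; push aborted." >&2
  exit 1
}
EOF
  chmod +x "$hook"
}
```

Hooks aren't committed with the repository, so each engineer runs the installer once per clone.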
## Files Affected

Implementing this pipeline will involve creating and modifying several files:

- Create: `.github/workflows/docs-validation.yml` (main workflow)
- Create: `.markdown-link-check.json` (link checker config)
- Create: `.spectral.yaml` (OpenAPI linter config)
- Create: `.cspell.json` (spell checker config)
- Create: `.markdownlint.json` (markdown linter config)
- Update: `package.json` (add validation scripts)
- Update: `docs/development/README.md` (document how to run validation locally)
- Update: project root `README.md` (add CI status badge)
## Dependencies

This task has a few dependencies:

- Depends on:
  - Task 001 (needs docs directory structure)
  - Task 003 (needs OpenAPI spec to validate)
- Blocks:
  - None (but improves all documentation tasks)
- Benefits all tasks:
  - Validates output from Tasks 002-007
  - Prevents documentation quality regression
## Effort Estimate

Here's a rough estimate of the effort required:

- Size: S-M
- Hours: 6-8 hours over 1 day
- Parallel: false (should be one of the last tasks)
- Breakdown:
  - Workflow file creation: 2 hours
  - Configuration files: 1.5 hours
  - Local testing with `act`: 1 hour
  - Running validation on existing docs, fixing errors: 2 hours
  - Documentation and README updates: 1 hour
  - PR testing and refinement: 0.5 hours
## Definition of Done

We'll consider this task done when the following criteria are met:

- [ ] `.github/workflows/docs-validation.yml` created and committed
- [ ] All configuration files created (`.spectral.yaml`, `.cspell.json`, `.markdown-link-check.json`, `.markdownlint.json`)
- [ ] Workflow runs on pull requests (tested with an actual PR)
- [ ] All validation checks pass on current documentation (zero errors)
- [ ] Local validation scripts added to `package.json`
- [ ] Development guide updated with "How to validate docs locally"
- [ ] CI status badge added to main README
- [ ] Workflow fails appropriately (tested by intentionally breaking a check)
- [ ] Workflow provides clear error messages (not cryptic failures)
- [ ] Validation runs in under 3 minutes (fast feedback loop)
## Notes

### Why This Is Critical
- Prevents bit rot: Keeps documentation current automatically.
- Enforces quality: Prevents merging PRs with broken links or invalid specs.
- Shift-left: Catches documentation issues before they reach the main branch.
- Reduces PR review burden: Automates checks, freeing reviewers to focus on content.
### Testing Strategy
- Test each check independently (easier to debug).
- Intentionally break something to verify CI catches it.
- Run locally before pushing (fast feedback).
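As an example of testing a check in isolation, the docs-required rule (Check 5) can be factored into a small function that reads a `git diff --name-only` file list on stdin, so you can exercise it without a real pull request. The function name here is illustrative, not part of the workflow:

```shell
# Exit non-zero when src/ changed but docs/ did not, mirroring Check 5.
docs_required_check() {
  changed="$(cat)"  # changed file paths, one per line, from stdin
  src=$(printf '%s\n' "$changed" | grep '^src/' || true)
  docs=$(printf '%s\n' "$changed" | grep '^docs/' || true)
  if [ -n "$src" ] && [ -z "$docs" ]; then
    echo "Code changed without documentation updates:" >&2
    printf '%s\n' "$src" >&2
    return 1
  fi
  return 0
}
```

In CI you'd pipe `git diff --name-only origin/main HEAD` into it; locally you can feed it synthetic file lists to verify both the passing and failing paths.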
### Common Issues

- False-positive spell check: Add the word to `.cspell.json`.
- External link returns 403: Add 403 to `aliveStatusCodes` in `.markdown-link-check.json`.
- Mermaid diagram not rendering: Test it at https://mermaid.live/ first.
### Success Indicators
- CI catches broken link before merge (validated with test).
- CI fails when code changes without documentation updates (validated with test).
- CI runs in <3 minutes (good developer experience).
### Quick Wins
- Add GitHub Actions cache to speed up npm installs.
- Run checks in parallel (fail-fast for speed).
- Provide actionable error messages (tell engineer how to fix).
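For the caching quick win, one option is to stop installing the tools globally and declare them as devDependencies instead, so the built-in npm cache of `actions/setup-node` (keyed on the lockfile) applies. A sketch of the replacement step; treat the exact dependency layout as an assumption:

```yaml
# Replaces the four `npm install -g` commands. The tools come from
# devDependencies in package.json, so `cache: 'npm'` on setup-node
# (already in the workflow) can reuse ~/.npm between runs.
- name: Install validation tools
  run: npm ci
```

The local `npx` and `pnpm` commands keep working, since the binaries land in `node_modules/.bin`.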
### Future Enhancements (Out of Scope for This Task)
- Auto-generate OpenAPI changelog on spec changes.
- Auto-generate Mermaid diagrams from code (e.g., database schema → ER diagram).
- Validate code examples in docs are syntactically correct.
## References
- GitHub Actions Docs: https://docs.github.com/en/actions
- Spectral OpenAPI Linter: https://stoplight.io/open-source/spectral
- Markdown Link Check: https://github.com/tcort/markdown-link-check
- CSpell: https://cspell.org/
By setting up a CI/CD pipeline for documentation validation, you ensure that your documentation remains a valuable asset for your project. This automation not only improves documentation quality but also streamlines the development process, making it easier for everyone to contribute and stay informed. Now let's build a robust documentation validation system!