ADR 0036: Cyclomatic-complexity ratchet to rank C + promote-to-required gate¶
- Status: Accepted
Context¶
ADR 0035 landed the non-blocking
quality gates — radon / xenon for complexity, jscpd for duplication,
vulture for dead code — with thresholds calibrated against the
v1.0.2 codebase. The xenon threshold was set to
--max-absolute E --max-modules C --max-average A: deliberately
permissive so the baseline passed without forcing any refactors.
ADR 0035 also documented the promotion-to-required pattern as a
follow-up PR once the baselines settled.
The project owner asked to tighten the absolute ceiling from rank E
(complexity ≤ 40) to rank C (complexity ≤ 20) as the standing
standard. The reasoning: rank D-E functions in the codebase are
mostly "type-dispatch grew accumulated cruft" — the kind of
complexity that a behavior-preserving dispatch-table refactor
collapses, not domain-shaped complexity that the dispatch pattern
hides. ADR 0035 had already noted that ruff's intentionally-ignored
C901 / PLR0911 / PLR0912 (the inline complexity rules) were
suppressed because per-function noqa was noisy, NOT because the
underlying complexity was justified — so a refactor pass that
brings everything under rank C and then enforces it absolutely is
the lower-friction path forward.
Baseline audit (against main post-PR-#44)¶
radon cc src/bqemulator -nD -s returned 10 functions across 8
modules. 3 at rank E, 7 at rank D:
| # | File | Line | Function | Rank | CC |
|---|---|---|---|---|---|
| 1 | storage/arrow_bridge.py |
159 | _format_bq_value |
E | 39 |
| 2 | storage/arrow_bridge.py |
362 | _coerce_to_arrow_value |
E | 33 |
| 3 | jobs/executor.py |
1361 | classify_statement_type |
E | 31 |
| 4 | catalog/memory_repository.py |
89 | MemoryCatalogRepository.delete_dataset |
D | 26 |
| 5 | streaming/avro_serializer.py |
193 | _arrow_dtype_to_avro |
D | 26 |
| 6 | types/interval.py |
153 | _consume_blocks |
D | 24 |
| 7 | scripting/parser.py |
153 | Parser._parse_statement |
D | 23 |
| 8 | api/routes/jobs.py |
901 | _apply_write_append |
D | 23 |
| 9 | api/routes/tables.py |
149 | _rest_to_table_meta |
D | 22 |
| 10 | scripting/parser.py |
647 | Parser._read_type_name |
D | 21 |
xenon --max-absolute C --max-modules C --max-average A src/bqemulator
exit 1 against this baseline.
Decision¶
- Refactor every rank-D / rank-E function above to rank ≤ C — behavior-preserving, no semantic changes, no new tests beyond what the existing suite already exercises. Each refactor follows one of three patterns:
| Bucket | Pattern | Used by |
|---|---|---|
| A Clean refactor | Extract sub-blocks as named helpers; main function reads as a sequence of named steps. | delete_dataset, _consume_blocks, _apply_write_append, _rest_to_table_meta, Parser._read_type_name |
| B Dispatch table | Long if/elif chain over typed predicates → tuple[(predicate, handler)] + a loop in the caller. |
_format_bq_value, _coerce_to_arrow_value, classify_statement_type, _arrow_dtype_to_avro, Parser._parse_statement |
| C Irreducibly branchy | Documented xenon --exclude carve-out + ADR-recorded "this is domain-shaped" reasoning. |
none — every function in the audit list fell into bucket A or B |
No bucket-C exclusions are needed. The original ADR-0035 framing anticipated up to ~20% of the list might be irreducible; the actual rate is 0%.
-
Tighten the Makefile gate to
xenon --max-absolute C --max-modules C --max-average A. Today's refactored codebase passes; any new function above rank C either (a) gets a dispatch-table or helper-extraction refactor, or (b) ships a separate PR adding a--excludepath pattern with a new ADR section justifying the irreducibility verdict. -
Promote the gate to required:
-
Add
make quality-complexityto themake verifychain. - Drop
continue-on-error: truefrom the cyclomatic-complexity step in.github/workflows/code-quality.yml. The duplication and dead-code steps keep their non-blocking status — those gates get their own promote-to-required ratchet PRs when their baselines settle. -
Add the
Quality gatescheck job to the branch-protection ruleset'srequired_status_checks.contextslist (ruleset id16726422). -
Stable job name — rename the workflow's job from
Quality gates (non-blocking)to plainQuality gates. Branch protection matches by job name; keeping the name short and blocking-status-agnostic means duplication / dead-code can promote later without invalidating the required-check entry.
Refactor results¶
Per-function before / after, captured live during the work:
| Function | Before | After |
|---|---|---|
_format_bq_value |
E (39) | C (14) |
_coerce_to_arrow_value |
E (33) | B (7) |
classify_statement_type |
E (31) | B (9) |
delete_dataset |
D (26) | B (6) |
_arrow_dtype_to_avro |
D (26) | A (4) |
_consume_blocks |
D (24) | A (2) |
_parse_statement |
D (23) | A (5) |
_apply_write_append |
D (23) | B (10) |
_rest_to_table_meta |
D (22) | B (8) |
_read_type_name |
D (21) | C (13) |
Project-wide average complexity stays at rank A (around 3.4).
Module averages stay ≤ rank C. radon cc -nD -s is now empty.
The new helper functions (one per extracted sub-block; one per dispatch-table handler) all land at rank A or low B — the helpers' cumulative complexity is roughly equal to the original function's because cyclomatic complexity is additive across called functions when counted with radon's per-function model, but the dispatch pattern makes the BRANCH STRUCTURE local rather than nested, which is what the gate is calibrated for.
Rationale¶
Why rank C, not D¶
Rank D (complexity 21-30) covers functions that need refactoring today on most code-quality scoring rubrics (radon's documented recommendation is "consider refactoring at rank C"). Rank C marks the threshold where the function is "still grokkable" — testable without coverage holes, refactorable without changing semantics. Anchoring the absolute ceiling at C aligns the gate with the project-wide judgment that type-dispatch is fine as a pattern but not as inline branching: the dispatch should be in a table, not a procedural chain.
Why no bucket-C exclusions¶
The audit list was uniformly amenable to dispatch-table or
helper-extraction refactors. The historical framing
(ADR 0035 "if you find yourself reaching for bucket C more than
~20% of the list, you're probably wrong about the bucketing")
turned out to be conservative. The bucket-C escape hatch stays
available — if a future PR adds genuine domain-shaped complexity
(e.g. a SQL grammar parser, a constraint-solver dispatch), the
contributor adds a --exclude plus an ADR section. The bar is
"another maintainer reading the function would agree the
complexity is domain-shaped, not accumulated."
Why promote in the same PR as the refactors¶
Two reasons:
-
Test-then-refactor hygiene works at the gate level too. With the gate at rank E for non-blocking, a regressing function wouldn't even be flagged in CI — it'd silently pass. Tightening the gate without the refactors first would have failed CI on day one. Tightening after the refactors land but without promoting it to required means a new D-rank function could merge against the still-passing non-blocking gate. Atomic atomicity wins.
-
Branch protection lists by job name. The "Quality gates" job has run on every PR since PR #44; renaming it to add the "(required)" suffix or splitting it into two jobs would invalidate the existing job-name match the ruleset will eventually depend on. Keep the stable name, flip the
continue-on-errorflag, and add to required-checks in one atomic step.
Why duplication / dead-code stay non-blocking¶
Both have a different cost/benefit shape:
- jscpd duplication baseline is 0.36% (seven clones, 112 lines, all 11-22 lines and structurally template-shaped — see ADR 0035). Refactoring those would introduce premature abstraction; the gate is calibrated to catch new drift, not to eliminate the baseline. The threshold (1.0%) catches regression but doesn't force the cleanup.
- vulture dead-code finds one false positive (the
reserved-for-future
use_cachekwarg, already whitelisted). There's nothing to refactor against; promoting to required is effectively a no-op on the current state.
Each gets its own ratchet PR if a more aggressive posture becomes useful. For now: the noise is too low to justify the merge-blocking cost.
Consequences¶
Positive¶
- The codebase's worst function complexity drops from 39 to 14. Every new function or modified function has a per-PR signal at rank C — drift past that fails CI loudly.
- The bucket-A / bucket-B refactor patterns are now documented in
the helpers' docstrings (see e.g.
arrow_bridge._fmt_bq_*/arrow_bridge._coerce_arrow_*/executor._classify_*/tables._rest_to_*) so future contributors have a template. - The "type-dispatch is naturally branchy" ruff-ignore rationale in
pyproject.toml stays accurate — but it now reads as "ruff's
inline
C901is noisy AND the project enforces a stricter rank ceiling externally via xenon." Two checks; one is local per-call, the other is project-wide structural.
Negative¶
- The branch-protection ruleset now has another required check; a PR can no longer merge with a complexity regression. New contributors who introduce a rank-D function need to refactor before merge. The Makefile target's docstring + ADR pointer documents the expected pattern (dispatch table or helper extraction) so the friction is bounded.
- The
make verifychain now runs xenon as part of the pre-commit gate. Local cost: ~1 second. CI cost: same step but no longer wrapped incontinue-on-error— so a flake there fails the workflow. xenon has no known flakiness modes (it's deterministic against the radon AST output), so this is a theoretical concern.
Neutral¶
- Helper-function count grew by ~30 across 6 source files. Code size diff is roughly +475 / -316 net + several added module- level dispatch tables. Test count is unchanged (3127 passed + 1 expected skip both before and after).
- The duplication and dead-code gates' continue-on-error wrappers stay. Until each promotes, the workflow summary's outcomes table makes the blocking / non-blocking split explicit.
Alternatives considered¶
- Refactor in waves, gate-promote in a separate PR. Cleaner for review but invites drift between landing and enforcement — a regressing function merged in between would silently pass.
- Per-function
# xenon: ignoreannotations. xenon doesn't support per-function ignores natively. Adding the metadata would require either a fork or a pre-/post-processor; both are higher cost than the path-level exclusion pattern this ADR adopts. - Tighten to rank B (complexity ≤ 10). Too strict — many structurally reasonable Python functions land at rank B-C (e.g. argument validators, REST request normalisers). The rank C threshold matches the "still grokkable" bar without forcing rewrites of perfectly clear code.
- Use pylint's
max-complexityinstead. pylint is not in the project's lint chain (ruff covers everything pylint does on this codebase plus more). Adding pylint just for one metric duplicates infrastructure. - Switch to a different complexity metric (Halstead, cognitive). radon supports Halstead and maintainability index but the team familiarity is with cyclomatic. The rank scale is documented and standard. Reconsider if cyclomatic ever proves to miss real complexity (it sometimes does for callback-heavy code, but this codebase is procedural).
References¶
- ADR 0035 — the parent decision introducing the gates; this ADR tightens one of them.
- radon documentation — rank-tier definitions (A: 1-5, B: 6-10, C: 11-20, D: 21-30, E: 31-40, F: ≥41).
- xenon documentation —
CLI threshold semantics (
--max-absolute,--max-modules,--max-average). - AGENTS.md "Pre-commit gate (mandatory)" — the contract this ADR
extends by adding
make quality-complexitytomake verify.
Update (2026-05-28): follow-up ratchet to rank B in ADR 0041¶
The C-ratchet ran for ~6 months and ADR 0036's "Alternatives considered" §3 rejection of an immediate rank-B ceiling has been revisited.
The original verdict — "many structurally reasonable Python functions
land at rank B-C" — held only until a sustained refactor pass tested
it empirically. The PR-1 through PR-11 sweep
(#90 through
#102) closed
every remaining rank-C function in src/ (~60 functions across 26
files) using the same bucket-A (helper extraction) and bucket-B
(dispatch table) patterns this ADR established for the D/E sweep.
The bucket-C "irreducible domain complexity" escape hatch — predicted
to absorb up to ~20% of refactor targets — stayed at 0 hits, just
like it did for the D/E audit.
ADR 0041 ratchets the
--max-absolute ceiling from rank C to rank B
(xenon --max-absolute B --max-modules C --max-average A) on top
of this ADR's foundations. The two ADRs co-exist:
- This ADR (0036) remains in effect for everything except the per-function absolute ceiling — the gate's promotion-to-required status, the bucket A/B/C taxonomy, the duplication / dead-code gates' non-blocking posture, the rationale for keeping the project-wide average at rank A and module average at rank C.
- ADR 0041 layers a tighter
--max-absoluteon top after the PR-1…PR-11 sweep demonstrated empirically that the rank-C bucket was uniformly refactorable.
ADR 0036 is not superseded — it's the ground floor that ADR 0041 builds on.