
Simplicity Audit

Philosophy: Rich Hickey’s “Simple Made Easy” (Strange Loop 2011)

“Simplicity is a prerequisite for reliability.” — Dijkstra (via Hickey)

Simple = one role, one task, no interleaving of concerns. Complecting = braiding independent concerns together so they cannot be reasoned about or changed independently.


UFME’s VISION.md describes a sophisticated biometric system with several genuinely well-separated concerns (hexagonal architecture, stateless processing) and several areas where distinct concerns are complected together. The most significant complections are:

  1. Quality policy braided into the detection stage (algorithmic mechanism + business policy)
  2. Matching threshold braided into the aggregation stage (search mechanism + confidence policy)
  3. Metadata filtering braided into the FAISS index (storage + query policy)
  4. PAD (spoof detection) braided into the ingestion pipeline (security policy + data flow)
  5. Infrastructure concerns woven through domain descriptions

Section 1: System Architecture (Hexagonal Pattern)

  • Ports and Adapters is fundamentally a simplifying pattern. The core domain speaks its own language; adapters translate. These are genuinely separate things.
  • Inbound vs Outbound adapters cleanly separate the “what comes in” from “what goes out.” No complection here.
  • The stated goal — “if the deployment replaces the upstream provider or the underlying network, the biometric core remains completely untouched” — is a correct expression of simplicity: the core is one thing, the transport is another.

Transaction state inside the Core Domain.

“Core Domain: Manages gallery partitioning, binning, filtering, and transaction state.”

Transaction state is the management of when and how a request evolves over time. Gallery partitioning and filtering are about what data to search. These are independent concerns braided into a single “Core Domain” description.

  • State + Routing Policy: transaction state (lifecycle management) is entangled with gallery partitioning (data segmentation policy). A transaction completing does not require knowledge of which gallery partition was searched — those are separate facts.
  • Mechanism + Policy: The core domain “manages” filtering. Management (mechanism) and the filter rules (policy) are different concerns. The rules about which gallery to search for an IABS vs IDENT1 request are policy — they belong in configuration or a policy layer, not inside the mechanism that executes the search.

Simple alternative: separate three distinct things:

  1. Request lifecycle (pure state machine: received → validated → dispatched → aggregated → responded)
  2. Routing/partitioning rules (pure data: a map of request-type → gallery-ids)
  3. Search execution (pure function: (query-vector, gallery-ids, k) → ranked-results)

The core domain should only know about (3). Rules in (2) are passed in as values. Lifecycle in (1) is managed by the inbound adapter or an orchestration layer.
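The three-way split can be sketched in a few lines of Python (all names and data here are hypothetical illustrations, not the actual UFME components): lifecycle transitions and routing rules are plain data, and search is a pure function that receives gallery ids as values.

```python
# (1) Request lifecycle: a pure state machine over transition data.
TRANSITIONS = {
    "received": "validated",
    "validated": "dispatched",
    "dispatched": "aggregated",
    "aggregated": "responded",
}

def advance(state):
    """Pure lifecycle step: state -> next state."""
    return TRANSITIONS[state]

# (2) Routing/partitioning rules: pure data owned by a policy layer.
ROUTING = {
    "IABS": ["gallery-immigration"],
    "IDENT1": ["gallery-law-enforcement"],
}

# (3) Search execution: a pure function; the core knows only this.
def search(query_vector, gallery_ids, k, index):
    """(query-vector, gallery-ids, k) -> ranked (id, score) pairs.
    query_vector is unused in this stub; a real index scores against it."""
    hits = [
        (vec_id, score)
        for gid in gallery_ids
        for vec_id, score in index.get(gid, [])
    ]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)[:k]
```

Changing which galleries an IABS request searches is then a one-line data edit to the routing map, with no change to the state machine or the search function.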


  • Detection, Alignment, and Extraction are described as sequential stages — a pipeline of transforms. Each stage takes an image or intermediate artefact and produces the next. This is a clean composition of pure functions.
  • The ViT architecture choice (local patches → global attention) is an internal detail of one stage. It does not leak into adjacent stages. Simple.
  • ArcFace as a training loss is a concern of the training process, not the inference process. The VISION.md correctly notes it as a training detail, keeping it out of the runtime pipeline description.

1. Quality Assessment Policy braided into the Detection Stage.

“Quality Assessment: An auxiliary lightweight network calculates a quality score based on blur, illumination, and yaw/pitch/roll angles, rejecting sub-standard images before they consume heavy extraction compute.”

Two independent concerns are fused here:

  • Measurement: Computing the quality score (a number). This is a pure function: image → quality-score. It belongs in the pipeline as a transform.
  • Policy (Rejection): Deciding what “sub-standard” means and acting on it (drop the image, return an error, flag for human review). This is a business rule that will change. Different use cases (immigration search vs law enforcement) may have different quality thresholds.

By having the quality assessment stage reject images, the mechanism (measuring quality) is fused with the policy (acting on quality). You cannot change the rejection threshold without touching the measurement component, and you cannot reuse the measurement in a monitoring context without also triggering rejections.

Simple alternative: The quality stage produces a value ({:quality-score 0.73, :blur 0.1, :yaw 12.5}). A separate, configurable policy step decides what to do with that value. The pipeline becomes:

image → detect → align → measure-quality → [policy gate] → extract → vector

The [policy gate] is a pure function of (quality-value, policy-config) → pass | reject-with-reason. Policy lives in config, not in the measurement function.
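A minimal Python sketch of that split, with the Clojure-style value translated into a dict (field names, numbers, and thresholds are illustrative): measurement returns a value, and the gate is a pure function of that value plus config.

```python
def measure_quality(image):
    # Mechanism: pure measurement producing a value. Numbers are stubbed
    # here; a real implementation would analyse the image.
    return {"quality_score": 0.73, "blur": 0.1, "yaw": 12.5}

def policy_gate(quality, policy):
    """Pure policy: (quality-value, policy-config) -> pass | reject-with-reason."""
    if quality["quality_score"] < policy["min_quality"]:
        return {"pass": False, "reason": "quality_below_threshold"}
    if abs(quality["yaw"]) > policy["max_yaw"]:
        return {"pass": False, "reason": "yaw_out_of_range"}
    return {"pass": True}
```

The same `measure_quality` output can feed a monitoring dashboard with no rejection side effect; only the gate consults policy.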

2. PAD (Presentation Attack Detection) braided into Data Ingestion.

“An auxiliary AI model sits ahead of the feature extraction pipeline.”

PAD is a security policy. Feature extraction is a biometric mechanism. The VISION.md describes PAD as positioned “ahead of” the pipeline — which is better than being inside it — but the description conflates the two concerns spatially (“sits ahead of the feature extraction pipeline”) rather than treating PAD as an independent, composable check.

The complection: PAD’s position is hardcoded to be before extraction. But PAD could also be run asynchronously for audit purposes, or could be a separate service the pipeline calls out to. Describing it as “sitting ahead of” a specific stage couples its position to the pipeline topology.

Simple alternative: PAD is a function image → {:pass true} | {:reject true, :reason :deepfake}. It is called by the orchestration layer as one step in a sequence. Its position relative to other steps is determined by the orchestrator (configuration/data), not by the PAD component itself.
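One way to express that in Python (stub functions, hypothetical names): the sequence of steps is a list the orchestrator owns, so PAD's position is configuration, not something the PAD component knows.

```python
def pad_check(image):
    # Stubbed PAD: image -> pass/reject value.
    return {"pass": True}

def extract(image):
    # Stubbed feature extraction.
    return {"vector": [0.1, 0.2]}

# The step order is data owned by the orchestrator; moving PAD, or running
# it asynchronously for audit, changes this list, not the components.
PIPELINE = [pad_check, extract]

def run_steps(image, steps):
    result = {"image": image}
    for step in steps:
        out = step(image)
        if out.get("reject"):
            return out          # the orchestrator, not the step, halts the flow
        result.update(out)
    return result
```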


Section 3: Vector Storage & Matching Engine

  • The memory mathematics (512 dims × 4 bytes × 200M = 400 GB) is a straightforward derivation from fixed facts. No complection.
  • The scatter-gather topology (fan-out to shards, aggregate results) is a clean description of mechanism. Fan-out and aggregation are treated as separate steps.
  • Inner Product vs Euclidean distance: the VISION.md correctly notes this is a mathematical equivalence enabled by L2-normalisation. One concern (the ViT’s output normalisation) enables a simplification in another concern (the distance metric). This is composition, not complection.

1. Confidence Threshold braided into the Aggregation Stage.

“The central node merges the 20 local lists, sorts them by Cosine Similarity, applies a dynamic confidence threshold, and returns the definitive global match.”

The aggregation step does three things:

  • Merge and rank: Pure function of lists → sorted list. Mechanism.
  • Apply threshold: Decide what score constitutes a “match.” Policy.
  • Return “definitive global match”: Implies a decision has been made. Business output.

The “Authority’s dynamic confidence threshold” is a policy value that will change (different operational modes, different legal standards, different risk appetites). By applying it inside the aggregation function, the aggregation mechanism is coupled to business policy. You cannot get the raw ranked list out of the aggregator without also accepting the policy interpretation.

Simple alternative: Aggregation returns a pure ranked list of (id, score) pairs. A downstream decision function takes (ranked-list, threshold-config) → match | no-match | candidates. The threshold is a value passed in, not embedded in the aggregator.
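As a sketch (decision vocabulary and the `max_candidates` parameter are illustrative assumptions): the aggregator is mechanism only, and the decision function consumes the threshold as an argument.

```python
def aggregate(shard_lists):
    """Mechanism: merge per-shard (id, score) lists into one ranked list."""
    merged = [pair for lst in shard_lists for pair in lst]
    return sorted(merged, key=lambda pair: pair[1], reverse=True)

def decide(ranked, threshold, max_candidates=10):
    """Policy: interpret the ranked facts against a threshold value."""
    hits = [(vec_id, score) for vec_id, score in ranked if score >= threshold]
    if not hits:
        return {"decision": "no-match"}
    if len(hits) == 1:
        return {"decision": "match", "id": hits[0][0]}
    return {"decision": "candidates",
            "ids": [vec_id for vec_id, _ in hits[:max_candidates]]}
```

The raw ranked list remains available to any consumer (auditing, analytics) that does not want the policy interpretation.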

2. Metadata Filtering braided into the FAISS Index.

“Metadata (e.g., gender, age bracket, or recording event levels) is stored alongside the FAISS index IDs. The system uses pre-filtering bitsets to mask out non-relevant vectors before the distance calculation occurs.”

Two concerns fused:

  • Storage: Where metadata lives (alongside FAISS IDs).
  • Query policy: Which metadata attributes to filter on, and when.

Pre-filtering bitsets are a mechanism for efficient exclusion. The policy (which filters to apply for a given request type) is embedded in the construction of those bitsets at query time. If the operator adds a new metadata attribute (e.g., document type, nationality flag), the filtering logic — currently inside the FAISS interaction layer — must change.

Simple alternative: Metadata is a separate, queryable store. Before a search, a selector function takes (request-context, metadata-store) → bitset. This bitset is then passed as a pure value to the FAISS search function. The FAISS layer is only responsible for “search within this mask”; the policy for constructing the mask lives elsewhere.
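A sketch using a plain integer bitmask to stand in for the FAISS bitset (attribute names and the filter shape are invented for illustration):

```python
def build_mask(request_context, metadata_store):
    """Policy: (request-context, metadata-store) -> bitmask over vector ids."""
    mask = 0
    for vec_id, attrs in metadata_store.items():
        if all(attrs.get(key) == val
               for key, val in request_context["filters"].items()):
            mask |= 1 << vec_id
    return mask

def masked_search(query, k, mask, index):
    """Mechanism: 'search within this mask'; knows nothing about filter policy."""
    hits = [(vec_id, score) for vec_id, score in index if mask >> vec_id & 1]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)[:k]
```

Adding a new metadata attribute changes only the selector's input data, never the search function.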

3. Gallery Partitioning conflated with Physical Sharding.

“The FAISS cluster is logically segmented. An immigration search (IABS) does not unnecessarily scan the law enforcement (IDENT1) partitions unless explicitly instructed.”

Logical partitioning (which gallery does this request belong to?) is a business/policy concern. Physical sharding (which nodes hold which vectors?) is an infrastructure/performance concern. The VISION.md blurs these: “logically segmented” in the context of “physically distributed across 20 nodes.”

Complection risk: If the logical partition map and the physical shard map are the same thing, then changing the business rule (“IABS searches also need to check IDENT1”) requires a physical re-sharding of the cluster, or vice versa: physical rebalancing for performance reasons changes which logical partitions are searched. These should be independent decisions.

Simple alternative: Maintain two separate maps:

  • logical-partition → [vector-ids] (business/policy layer)
  • shard-id → [vector-ids] (infrastructure layer)

A query is resolved by first resolving logical partitions to vector IDs, then routing vector IDs to shards. Changes to business rules do not require infrastructure changes, and vice versa.
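The two-map resolution, sketched with hypothetical ids:

```python
# Business/policy layer: which logical partitions map to which vectors.
LOGICAL = {
    "IABS": [101, 102],
    "IDENT1": [103],
}

# Infrastructure layer: which shards physically hold which vectors.
SHARDS = {
    "shard-a": [101, 103],
    "shard-b": [102],
}

def route(partitions):
    """Resolve logical partitions to vector ids, then ids to shards."""
    wanted = {vec_id for p in partitions for vec_id in LOGICAL[p]}
    return {
        shard: sorted(wanted & set(ids))
        for shard, ids in SHARDS.items()
        if wanted & set(ids)
    }
```

Extending IABS searches to also cover IDENT1 is an edit to `LOGICAL` only; rebalancing vectors across nodes is an edit to `SHARDS` only.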


  • Inbound XML Translation: XML in, gRPC out. This is a pure translation adapter — one job. Simple.
  • The description of PAD as an “auxiliary AI model” correctly identifies it as a separable module.

The PAD module’s rejection behaviour is described as absolute.

“Rejecting malicious payloads instantly.”

The PAD component makes a binary security decision and acts on it (rejection). This fuses:

  • Detection: image → spoof-score (measurement, mechanism)
  • Classification: spoof-score → is-spoof? (policy: what threshold?)
  • Action: is-spoof? → reject (policy: what to do on detection?)

All three are independent. A PAD model that is 90% confident of a spoof may warrant: (a) immediate rejection in a high-security context, (b) flagging for human review in an audit context, or (c) logging only in a monitoring context. Hardcoding “reject” inside the PAD component eliminates these options without restructuring the component.
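The three independent functions, sketched (scores, thresholds, and mode names are illustrative):

```python
def detect(image):
    """Mechanism: image -> spoof score (stubbed constant here)."""
    return 0.9

def classify(score, threshold):
    """Policy 1: spoof-score -> is-spoof?"""
    return score >= threshold

def act(is_spoof, mode):
    """Policy 2: what to do on detection, per operating context."""
    if not is_spoof:
        return "pass"
    return {"high-security": "reject",
            "audit": "flag-for-review",
            "monitoring": "log-only"}[mode]
```

The same detector serves all three contexts; only the composition differs.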


  • Stateless Processing is a genuine simplification: no mutable state persists between requests for raw imagery. The system treats images as values (ephemeral, immutable, passed through).
  • Docker + Kubernetes is a standard packaging/orchestration separation. Components remain unaware of their container context.
  • The ONNX Runtime CPU/GPU flexibility: the inference component is described as configurable at deployment time. The algorithm does not care about the hardware; the hardware is a deployment-time value.

Operational security requirements entangled with technical role descriptions.

Governance and security policy concerns were embedded in the technical architecture document, complecting “what the system does” with “who is authorised to operate it.” These are separate facts.

More subtly, support tier definitions (operational concern) were tied to security policy (access control concern). The access requirements depend on what data is accessed, not on what support activity is performed. A bug fix in a test environment has different access requirements than the same fix in production with live biometric data. The complection is support-activity + environment + data-sensitivity → access-requirement, but the original design presented it as support-activity → access-requirement.


| Area | Complected Concerns | Simple Separation |
| --- | --- | --- |
| Core Domain | State lifecycle + routing policy + search mechanism | Three separate components: state machine, policy map, search function |
| Quality Assessment | Measurement + rejection policy | Measure as a value; policy gate as a separate configurable step |
| PAD | Detection + classification threshold + rejection action | Three pure functions composed by orchestrator |
| Aggregation | Merge/rank + confidence threshold | Aggregator returns ranked list; decision function applies threshold |
| FAISS Filtering | Metadata storage + filter policy | Metadata store separate; selector function produces bitset as a value |
| Gallery Partitioning | Logical segmentation + physical sharding | Two independent maps; resolved separately |
| Operational Security | Support activity + environment context + data sensitivity | Access requirements as a function of (activity, environment, data-sensitivity) |

From Hickey: “If you want everything to be familiar, you have to make everything the same. That’s not simplicity.”

The path forward for UFME:

  1. Make policy a value, not a behaviour. Thresholds, rejection rules, partition routing — all should be data passed into pure functions, not logic embedded in components.
  2. Pipelines are composition, not objects. Each stage produces a value consumed by the next. No stage should cause side effects (like rejection) — it should produce a richer value that an orchestration layer acts on.
  3. Separate the what from the when. “What gallery to search” (policy) and “when to execute the search” (mechanism) are different concerns. Keep them apart.
  4. The aggregator should not decide. Aggregation produces ranked facts. Decision-making (match/no-match) is a separate, policy-driven function applied downstream.

This section tracks which original complections have been resolved in the current design.

| Complection | Status | Resolution | Design Doc |
| --- | --- | --- | --- |
| Transaction state inside Core Domain (state lifecycle + routing policy + search mechanism) | Resolved | Request lifecycle managed by orchestration layer; routing rules are policy data passed in; search is a pure function of (query, partitions, k). The route stage is split into a thin router + per-operation executors. | Pipeline Design |
| Quality measurement braided with rejection policy | Resolved | Quality measurement (quality.py) produces a value dict; quality gate (quality_gate.py) is a separate configurable policy step. OFIQ satisfies QualityPort as pure measurement. | Pipeline Design |
| PAD detection + classification + rejection fused | Resolved | PAD measurement (pad.py) produces a PadScore value; PAD gate (pad_gate.py) applies configurable APCER/BPCER thresholds. MAD (mad.py) is a separate stage for morphing. All three are independent composable stages. | Pipeline Design |
| Confidence threshold braided into aggregation | Resolved | Aggregator returns a pure ranked list of (id, score) pairs. Threshold application is a separate threshold_check pure function in domain ops, called by the orchestration layer. | Domain Design, FAISS Design |
| Metadata filtering braided into FAISS index | Resolved | Metadata is a separate queryable concern. A selector function computes (request_context, metadata) -> bitset before search. The bitset is passed as a value to the FAISS search function. Filter policy lives outside the FAISS adapter. | FAISS Design |
| Gallery partitioning conflated with physical sharding | Resolved | Logical partition map (partition_id -> [vector_ids]) and physical shard map (shard_id -> [vector_ids]) are separate. Partition is a data label resolved before routing to shards. | FAISS Design |
| Infrastructure concerns in domain descriptions | Resolved | Domain layer (src/core/) has zero infrastructure dependencies. Operational security, deployment topology, and infrastructure configuration are documented separately and do not appear in domain or pipeline code. | Domain Design |

A three-agent Rich Hickey review examined the implemented design documents (domain-design.md, pipeline-design.md, faiss-design.md) against the original audit’s principles. The first-pass findings above addressed the VISION.md’s high-level complections. This second pass inspects the concrete designs that resolved them — and finds nine new complections introduced or exposed during the detailed design phase.

The pattern is familiar: solutions to complections introduce their own complections at the next level of detail. As Hickey would say: “Every new thing you think you need is something else that can go wrong.”


Finding S2-1: Pipeline dicts accumulate all upstream fields (snowball pattern)


Complected: Each stage’s input + cumulative history of all prior stages.

A stage at position N in the pipeline receives a dict containing every key produced by stages 0 through N-1. The stage cannot distinguish “my inputs” from “someone else’s outputs that happen to be in this dict.” Adding a new key in an early stage silently changes the shape of every downstream stage’s input. Stages are nominally independent but structurally coupled through a shared, growing data surface.

Resolution: Context/payload split. The pipeline runner manages the cumulative context dict. Each stage declares requires (the keys it reads) and produces (the keys it writes). The runner extracts only the required keys for each stage and merges the produced keys back into the context. Stages are decoupled from upstream output shapes — they see exactly what they declared, nothing more.
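A compact sketch of a runner under that resolution (the Stage shape here is a simplification of the design's dataclass; names are illustrative):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Stage:
    name: str
    fn: callable
    requires: frozenset
    produces: frozenset

def run_pipeline(stages, context):
    """Runner owns the cumulative context; stages see only declared keys."""
    for stage in stages:
        payload = {k: context[k] for k in stage.requires}
        output = stage.fn(payload)
        # Merge back only the declared outputs; the context stays a value.
        context = {**context, **{k: output[k] for k in stage.produces}}
    return context
```

A stage that silently starts producing extra keys cannot leak them downstream: only declared `produces` keys reach the context.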


Finding S2-2: Error short-circuit logic distributed across every stage


Complected: Error handling semantics + stage transformation logic.

If error detection is embedded in each stage (check for error envelope, decide whether to skip, pass through), then every stage is complected with error flow control. The error propagation mechanism is braided into the transformation logic. You cannot reason about a stage’s behaviour without also reasoning about error states from stages it has never heard of.

Resolution: The runner checks for error envelopes before calling each stage. If the context contains an error, the runner short-circuits and skips remaining stages. Stages never see errors. They are pure transformations of valid inputs to valid outputs.
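The short-circuit belongs in one place, sketched here with an `"error"` key standing in for the design's error envelope:

```python
def run_stages(stages, ctx):
    """The runner owns error flow; stages are pure transforms of valid input."""
    for stage in stages:
        if "error" in ctx:      # short-circuit before calling the stage
            return ctx
        ctx = {**ctx, **stage(ctx)}
    return ctx
```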


Finding S2-3: Gate functions close over global config


Complected: Pure routing logic + mutable global state.

Gate functions (quality gate, PAD gate) that close over a global config object cannot be tested in isolation — they depend on whatever the global config happens to contain at call time. The gate’s logic (compare score to threshold) is simple; the complection is the invisible dependency on shared mutable state.

Resolution: Config values are injected via partial application at pipeline construction time. A gate is constructed as make_quality_gate(threshold=0.35), returning a pure closure. The gate function has no runtime dependency on global state. It is a value.
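The `make_quality_gate` pattern from the resolution, sketched (the payload key name is an assumption):

```python
def make_quality_gate(threshold):
    """Construction time: bind config once; the result is a pure closure."""
    def gate(payload):
        return {"pass": payload["quality_score"] >= threshold}
    return gate

# functools.partial achieves the same binding for a free function.
gate = make_quality_gate(threshold=0.35)
```

Testing a gate now requires no global setup: construct it with the threshold under test and call it with a plain dict.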


Finding S2-4: VectorStorePort merges read, write, and lookup


Complected: Search (latency-critical read) + mutation (eventually-consistent write) + point lookup.

A single VectorStorePort that bundles search(), add(), remove(), and get_by_id() forces every consumer to depend on capabilities it does not use. The search orchestrator, which only reads, carries a dependency on mutation methods. The enrol orchestrator, which only writes, carries a dependency on search methods. These are independent operational concerns with different consistency, latency, and scaling characteristics.

Resolution: Split into three ports:

  • VectorSearchPort: search(query, k, partition, bitset) -> list[(id, score)]
  • VectorLookupPort: get_by_id(id) -> template | None
  • VectorMutationPort: add(id, vector, partition), remove(id)

Each orchestrator depends only on the ports it actually uses. The FAISS adapter implements all three; consumers are decoupled from each other.
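In Python these ports are naturally expressed as `typing.Protocol`s; a sketch (signatures abbreviated from the design, adapter internals invented):

```python
from typing import Protocol

class VectorSearchPort(Protocol):
    def search(self, query, k, partition, bitset): ...

class VectorLookupPort(Protocol):
    def get_by_id(self, vec_id): ...

class VectorMutationPort(Protocol):
    def add(self, vec_id, vector, partition): ...
    def remove(self, vec_id): ...

class FaissAdapter:
    """One adapter may satisfy all three ports; each consumer names only one."""
    def __init__(self):
        self._store = {}
    def search(self, query, k, partition, bitset):
        # Stub: a real adapter would call FAISS and return true scores.
        return [(vec_id, 1.0) for vec_id in sorted(self._store)][:k]
    def get_by_id(self, vec_id):
        return self._store.get(vec_id)
    def add(self, vec_id, vector, partition):
        self._store[vec_id] = vector
    def remove(self, vec_id):
        self._store.pop(vec_id, None)
```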


Finding S2-5: Pipeline dict shapes are unversioned/unvalidated


Complected: Data shape contract + runtime behaviour.

If the dict flowing between stages has no declared schema, the contract between stages is implicit — encoded only in the keys each stage happens to read and write. A typo in a key name, or a removed field, produces a silent KeyError at runtime rather than a clear contract violation at build/test time.

Resolution: TypedDict schemas defined per stage boundary. Each stage’s requires and produces declarations serve as the contract. Schema validation runs at test time to catch mismatches between declared contracts and actual stage implementations. The contract is data, not convention.
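A sketch of one stage-boundary contract plus a test-time check (schema name, keys, and the check itself are illustrative):

```python
from typing import TypedDict

class DetectOut(TypedDict):
    bbox: tuple
    landmarks: list

def validate_contract(schema, produces):
    """Test-time check: declared `produces` keys must match the schema."""
    declared = set(schema.__annotations__)
    if declared != set(produces):
        raise AssertionError(f"contract drift: {declared} != {set(produces)}")
```

Run once per stage in the test suite, this turns a silent runtime `KeyError` into a named contract violation at build time.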


Finding S2-6: Event log contract not formalized as a port


Complected: Ground-truth storage semantics + implementation choice.

The FAISS design describes event sourcing with an append-only log as the ground truth for gallery mutations. But if the event log is accessed directly (file I/O, Kafka client) rather than through a port, the orchestration layer is complected with the storage mechanism. Switching from file-backed to Kafka — or testing without either — requires changing the domain code.

Resolution: EventLogPort protocol defined with append(event), read_from(offset), and current_offset(). File-backed and Kafka implementations both satisfy the same contract. The domain depends on the protocol, never on the implementation.
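A sketch of the protocol with an in-memory test double (the in-memory class is an assumption for illustration; the design names only the file-backed and Kafka implementations):

```python
from typing import Protocol

class EventLogPort(Protocol):
    def append(self, event): ...
    def read_from(self, offset): ...
    def current_offset(self): ...

class InMemoryEventLog:
    """Test double; file-backed and Kafka adapters satisfy the same contract."""
    def __init__(self):
        self._events = []
    def append(self, event):
        self._events.append(event)
    def read_from(self, offset):
        return self._events[offset:]
    def current_offset(self):
        return len(self._events)
```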


Finding S2-7: trace_id injection at Respond stage unresolved

Section titled “Finding S2-7: trace_id injection at Respond stage unresolved”

Complected: Envelope/payload separation + respond stage needs.

The context/payload split (S2-1 resolution) cleanly separates envelope metadata (trace_id, timestamps) from stage payloads. But the Respond stage needs the trace_id to construct its output. If the stage cannot access envelope fields, it cannot do its job. If it can access all envelope fields, the separation is undermined.

Resolution: The Stage dataclass supports an inject_envelope_keys field for declarative envelope-to-payload injection at specific stages. The Respond stage declares inject_envelope_keys=["trace_id"]. The runner copies only those declared keys from the envelope into the stage’s input. The injection is explicit, minimal, and visible in the stage’s declaration — not a backdoor to the entire envelope.


Finding S2-8: Orchestrators call datetime.now() directly


Complected: Workflow logic + system clock.

An orchestrator that calls datetime.now() or time.time() directly cannot be tested deterministically. The workflow logic is simple; the complection is the hidden dependency on a non-deterministic external resource.

Resolution: ClockPort protocol injected into orchestrators. The production implementation returns UTC wall-clock time. Test implementations return deterministic, controllable timestamps. The orchestrator depends on a value supplier, not on the system clock.
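A minimal sketch (the `enrol` fragment is a hypothetical orchestrator stand-in):

```python
from datetime import datetime, timezone
from typing import Protocol

class ClockPort(Protocol):
    def now(self): ...

class SystemClock:
    def now(self):
        return datetime.now(timezone.utc)

class FixedClock:
    """Deterministic test clock."""
    def __init__(self, instant):
        self._instant = instant
    def now(self):
        return self._instant

def enrol(clock):
    # Hypothetical orchestrator fragment: time arrives through the port.
    return {"enrolled_at": clock.now()}
```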


Finding S2-9: Queue abstraction claimed as port but not formalized


Complected: Pipeline infrastructure + implicit coupling to asyncio.Queue.

The pipeline design describes inter-stage communication via queues, but if the queue type is asyncio.Queue used directly, every pipeline component is coupled to asyncio’s specific API and event loop model. Replacing asyncio queues with multiprocessing queues, or ZeroMQ, or an in-memory channel, requires changing the pipeline runner and potentially the stages.

Resolution: QueuePort protocol defined with put(item) and async __aiter__(). The pipeline runner depends on QueuePort; the concrete implementation (asyncio.Queue, multiprocessing, etc.) is an infrastructure adapter. Stages and the runner are decoupled from the queue mechanism.
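A sketch of the port plus an asyncio-backed adapter (the sentinel-based `close` is an illustrative choice, not something the design specifies):

```python
import asyncio
from typing import Protocol

class QueuePort(Protocol):
    async def put(self, item): ...
    def __aiter__(self): ...

class AsyncioQueueAdapter:
    """Infrastructure adapter: wraps asyncio.Queue behind the port."""
    _DONE = object()
    def __init__(self, maxsize=0):
        self._q = asyncio.Queue(maxsize)
    async def put(self, item):
        await self._q.put(item)
    async def close(self):
        await self._q.put(self._DONE)
    async def __aiter__(self):
        # Async generator: yields items until the close sentinel arrives.
        while True:
            item = await self._q.get()
            if item is self._DONE:
                return
            yield item
```

A multiprocessing or ZeroMQ adapter would expose the same two members, leaving the runner untouched.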


| # | Complection | Status | Resolution | Design Doc |
| --- | --- | --- | --- | --- |
| S2-1 | Pipeline dict snowball (stage input + cumulative upstream history) | Resolved | Context/payload split. Runner manages context; stages declare requires/produces. | Pipeline Design |
| S2-2 | Error short-circuit in every stage (error handling + transformation logic) | Resolved | Runner checks for error envelopes before calling stages. Stages never see errors. | Pipeline Design |
| S2-3 | Gate functions close over global config (routing logic + mutable global state) | Resolved | Config injected via partial application at construction. Gates are pure closures. | Pipeline Design |
| S2-4 | VectorStorePort merges read/write/lookup (search + mutation + point lookup) | Resolved | Split into VectorSearchPort, VectorLookupPort, VectorMutationPort. | Domain Design |
| S2-5 | Pipeline dict shapes unvalidated (data shape contract + runtime behaviour) | Resolved | TypedDict schemas per stage boundary. requires/produces declarations validated at test time. | Pipeline Design |
| S2-6 | Event log contract not a port (storage semantics + implementation choice) | Resolved | EventLogPort protocol: append, read_from, current_offset. | Domain Design, FAISS Design |
| S2-7 | trace_id injection unresolved (envelope/payload separation + respond stage) | Resolved | Stage inject_envelope_keys for declarative, minimal envelope-to-payload injection. | Pipeline Design |
| S2-8 | Orchestrators call datetime.now() (workflow logic + system clock) | Resolved | ClockPort protocol injected into orchestrators. Tests use deterministic clocks. | Domain Design |
| S2-9 | Queue abstraction not formalized (pipeline infra + asyncio coupling) | Resolved | QueuePort protocol: put(), __aiter__(). Runner depends on port, not implementation. | Domain Design |

A three-agent Rich Hickey review (Simplicity Analyst, Data & State Architect, System Architecture Reviewer) examined the design documents holistically. The second-pass findings above addressed concrete design complections. This third pass inspects the documentation consistency, operational gaps, and remaining data-orientation tensions across all design docs.


Finding S3-1: Orchestrator definitions diverge between pipeline-design.md and domain-design.md


Complected: Two competing canonical definitions of the same component.

The pipeline-design.md contained a full “Orchestration Layer” section with orchestrator code examples using unsplit VectorStorePort and datetime.now(timezone.utc) — contradicting the resolved designs in domain-design.md (split ports, ClockPort). An implementer reading only pipeline-design.md would build against the stale design.

Resolution: The pipeline-design.md orchestration section now references domain-design.md Section 4 as canonical. Orchestrator code examples replaced with a port summary table and a functools.partial wiring pattern for injecting orchestrators into executor stages.


Finding S3-2: pad_gate complects PAD policy with operation routing


Complected: Spoof score evaluation (PAD concern) + ENROL-to-MAD routing (pipeline topology concern).

The original pad_gate checked spoof score and routed ENROL operations to MAD — two independent decisions in one function. Changing MAD requirements (e.g., requiring it for VERIFY) would require editing a PAD function.

Resolution: Split into two stages: pad_gate (pure PAD check: spoof score → respond or pass) and enrol_router (operation routing: ENROL → MAD, others → Detect). Each function has one responsibility. Pipeline Stage list updated with the new routing stage.


Finding S3-3: Runner context dict mutated in place

Section titled “Finding S3-3: Runner context dict mutated in place”

Complected: Value-oriented philosophy + place-oriented runner implementation.

The runner used ctx.update(output) and ctx.pop(key, None) — in-place mutation contradicting the “values over places” principle applied everywhere else. This is also a latent concurrency hazard if the pipeline ever supports fan-out.

Resolution: Runner now uses ctx = {**ctx, **output} (dict merge producing new dict) and ctx = {k: v for k, v in ctx.items() if k not in stage.drops} (filtering producing new dict). Context is a value at every stage boundary.


Finding S3-4: EnrolResult.replaced reports intent, not outcome

Section titled “Finding S3-4: EnrolResult.replaced reports intent, not outcome”

Complected: Caller’s request intent + factual result assertion.

EnrolResult(replaced=request.replace_existing) set the result from the request flag, not from the actual outcome. Whether replacement occurred is a fact only the vector store knows.

Resolution: Added replaced: bool to EnrolAck. The orchestrator now uses replaced=ack.replaced — the result reflects what actually happened.


Finding S3-5: Event types diverge between domain-design.md and faiss-design.md


Complected: Two definitions of the same event types with different field names and shapes.

Domain EnrolEvent lacked the vector field (needed for index rebuilds), used timestamp vs FAISS’s enrolled_at, and used source_image_hash vs source_ref. DeleteEvent lacked reason_code.

Resolution: Domain event types updated to be canonical: enrolled_at, source_ref, vector field added to EnrolEvent, reason_code added to DeleteEvent. FAISS design doc references domain types as authoritative, with Rust structs derived via protobuf translation.


Finding S3-6: No backpressure design for queue pipeline


Complected: Queue decoupling benefit + unbounded memory growth risk.

Hickey advocates queues for decoupling, but unbounded queues are a liability. Extract at 120ms is the bottleneck; without bounded queues, upstream stages fill memory under sustained 1,900 searches/min load.

Resolution: Added backpressure section to pipeline-design.md with bounded queue sizes per stage boundary, saturation behaviour (producer blocks), and inbound adapter returning HTTP 503 / gRPC RESOURCE_EXHAUSTED when receive queue is full.
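The admission behaviour can be sketched with a bounded asyncio.Queue (the helper name and the 202 accept code are illustrative assumptions; the 503 saturation response is from the resolution):

```python
import asyncio

def try_accept(queue, request):
    """Inbound adapter sketch: admit work only if the bounded queue has room."""
    try:
        queue.put_nowait(request)
        return 202              # accepted for processing
    except asyncio.QueueFull:
        return 503              # backpressure surfaces to the caller

# Bounded per stage boundary; producers block (or shed) at saturation.
receive_queue = asyncio.Queue(maxsize=2)
```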


Finding S3-7: Python/Rust wire format unspecified


Complected: Design completeness + implementation ambiguity at the serialisation boundary.

The protobuf schema for Python↔Rust communication (especially float32 vector encoding) was not specified. This is where precision bugs and endianness issues hide.

Resolution: Added wire format specification to faiss-design.md: bytes field type for vectors (raw little-endian f32), protobuf schema snippet for key messages, numpy .tobytes()/frombuffer() serialisation. Referenced from ARCHITECTURE.md.


Finding S3-8: No accretion policy for stage dict shapes


Complected: Evolving stage contracts + silent consumer breakage.

No policy for how stage output shapes evolve. A renamed key breaks consumers silently.

Resolution: Added accretion policy to pipeline-design.md: “stages may add new keys but must never rename or remove existing keys” — the “provide more, require less” principle.


Finding S3-9: TypedDict schemas and requires/produces are redundant specifications


Complected: Two independent specs of the same contract that can drift.

TypedDict definitions and Stage requires/produces frozensets specify the same information separately.

Resolution: Added schema derivation note to pipeline-design.md: TypedDicts should be generated from requires/produces declarations at test time, eliminating the drift risk.
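One way to keep a single source of truth, sketched under the assumption that stages carry requires/produces frozensets: validate each stage's actual output against its declaration at test time, so no hand-maintained TypedDict can drift. The Stage shape and check function here are illustrative.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass(frozen=True)
class Stage:
    name: str
    requires: frozenset
    produces: frozenset
    fn: Callable[[dict], dict]

def check_contract(stage: Stage, sample_input: dict) -> None:
    """Test-time check: the requires/produces declaration is the only spec.
    A TypedDict, if wanted for editor support, would be generated from these
    sets rather than written by hand, so the two specs can never diverge."""
    missing = stage.requires - sample_input.keys()
    assert not missing, f"{stage.name}: input missing {missing}"
    out = stage.fn(sample_input)
    undeclared = out.keys() - sample_input.keys() - stage.produces
    assert not undeclared, f"{stage.name}: produced undeclared keys {undeclared}"

quality = Stage(
    name="quality_gate",
    requires=frozenset({"image"}),
    produces=frozenset({"quality_score"}),
    fn=lambda ctx: {**ctx, "quality_score": 0.92},
)
check_contract(quality, {"image": b"..."})
```

The declaration stays the runtime artefact; the static type becomes a derived convenience, which is exactly the direction of derivation the resolution prescribes.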


Finding S3-10: Executor stage dependency injection unspecified


Complected: Stage function signature (dict -> dict) + orchestrator dependency that has no injection path.

Executor stages needed to call domain orchestrators, but the stage function signature provided no mechanism for dependency injection. Gates used functools.partial for config, but executors had no equivalent pattern.

Resolution: Executor stages now use the same functools.partial pattern: partial(search_executor, search_orch) at construction time. Documented in pipeline-design.md Translation Boundary section.
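The executor pattern can be sketched as follows; the orchestrator class and its method are stand-ins, and only the partial(search_executor, search_orch) shape comes from the resolution.

```python
from functools import partial
from typing import Any

class SearchOrchestrator:
    """Stand-in domain orchestrator for the sketch."""
    def search(self, vector: list) -> list:
        return ["candidate-1", "candidate-2"]

def search_executor(orch: SearchOrchestrator, ctx: dict) -> dict:
    """Executor stage: once the orchestrator is bound, this is a plain
    dict -> dict function, indistinguishable from any other stage."""
    return {**ctx, "candidates": orch.search(ctx["vector"])}

# Dependency injection at construction time, same pattern as gate config:
search_orch = SearchOrchestrator()
search_stage = partial(search_executor, search_orch)

result = search_stage({"vector": [0.1, 0.2]})
assert result["candidates"] == ["candidate-1", "candidate-2"]
```

Binding the dependency at construction keeps the runner oblivious to orchestrators: it still sees only dict -> dict callables, so the stage signature is never complected with the injection mechanism.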


Finding S3-11: biometric.py bundles all 12+ ports in a single file


Complected: Four independent concern families (operations, vector store, inference, infrastructure) in one module.

ClockPort and QueuePort have nothing to do with biometrics. A change to QueuePort opens the same file as VectorSearchPort. This is “easy” (one import) but not “simple” (multiple concerns in one module).

Recommendation (not yet resolved): Split into concern-aligned modules during Phase 1 implementation:

  • ports/operations.py — Search, Verify, Enrol, Delete
  • ports/vector_store.py — VectorSearch, VectorLookup, VectorMutation
  • ports/inference.py — Inference, Quality, PAD, MorphingDetection
  • ports/infrastructure.py — EventLog, Queue, Clock

| # | Complection | Status | Resolution | Design Doc |
| --- | --- | --- | --- | --- |
| S3-1 | Orchestrator definitions diverge (two competing canonicals) | Resolved | pipeline-design.md references domain-design.md as canonical | Pipeline Design, Domain Design |
| S3-2 | pad_gate complects PAD + routing (two concerns in one gate) | Resolved | Split into pad_gate + enrol_router as separate stages | Pipeline Design |
| S3-3 | Runner context mutated in place (values vs places) | Resolved | Dict spread and comprehension; context is a value at every boundary | Pipeline Design |
| S3-4 | EnrolResult.replaced reports intent (request flag vs actual outcome) | Resolved | EnrolAck.replaced added; orchestrator uses ack.replaced | Domain Design |
| S3-5 | Event types diverge (domain vs FAISS definitions) | Resolved | Domain types canonical; unified field names; vector field added | Domain Design, FAISS Design |
| S3-6 | No backpressure (unbounded queues under load) | Resolved | Bounded queue sizes, saturation policy, 503/RESOURCE_EXHAUSTED | Pipeline Design |
| S3-7 | Wire format unspecified (Python/Rust boundary) | Resolved | Protobuf schema with bytes fields, LE f32, numpy serialisation | FAISS Design, ARCHITECTURE.md |
| S3-8 | No accretion policy (stage shapes can break silently) | Resolved | "Provide more, require less" policy documented | Pipeline Design |
| S3-9 | Redundant TypedDict/requires specs (two specs, one contract) | Resolved | Schema derivation from requires/produces at test time | Pipeline Design |
| S3-10 | Executor DI unspecified (no injection path for orchestrators) | Resolved | functools.partial pattern for executor stages | Pipeline Design |
| S3-11 | biometric.py bundles all ports (four concern families in one file) | Open | Recommended split into 4 concern-aligned modules during Phase 1 | |

First-pass audit (VISION.md): 7 complections identified, 7 resolved. Second-pass audit (design docs): 9 complections identified, 9 resolved. Third-pass audit (cross-doc review): 11 complections identified, 10 resolved, 1 open recommendation. Total: 27 complections identified, 26 resolved, 1 open.

| Pass | Area | Complected Concerns | Simple Separation |
| --- | --- | --- | --- |
| 1st | Core Domain | State lifecycle + routing policy + search mechanism | Three separate components: state machine, policy map, search function |
| 1st | Quality Assessment | Measurement + rejection policy | Measure as a value; policy gate as a separate configurable step |
| 1st | PAD | Detection + classification threshold + rejection action | Three pure functions composed by orchestrator |
| 1st | Aggregation | Merge/rank + confidence threshold | Aggregator returns ranked list; decision function applies threshold |
| 1st | FAISS Filtering | Metadata storage + filter policy | Metadata store separate; selector function produces bitset as a value |
| 1st | Gallery Partitioning | Logical segmentation + physical sharding | Two independent maps; resolved separately |
| 1st | Operational Security | Support activity + environment context + data sensitivity | Access requirements as function of (activity, environment, data-sensitivity) |
| 2nd | Pipeline dicts | Stage input + cumulative upstream history | Context/payload split; stages declare requires/produces |
| 2nd | Error handling | Error flow control + stage transformation logic | Runner owns error short-circuit; stages are pure transforms |
| 2nd | Gate config | Pure routing logic + mutable global state | Partial application at construction; gates are pure closures |
| 2nd | VectorStorePort | Search + mutation + point lookup | Three separate ports: Search, Lookup, Mutation |
| 2nd | Dict schemas | Data shape contract + runtime behaviour | TypedDict schemas; requires/produces validated at test time |
| 2nd | Event log | Storage semantics + implementation choice | EventLogPort protocol with append/read_from/current_offset |
| 2nd | trace_id injection | Envelope/payload separation + respond stage needs | Declarative inject_envelope_keys on Stage dataclass |
| 2nd | Clock dependency | Workflow logic + system clock | ClockPort protocol; deterministic in tests |
| 2nd | Queue abstraction | Pipeline infra + asyncio coupling | QueuePort protocol; implementation is an adapter |
| 3rd | Orchestrator docs | Two competing canonical definitions | pipeline-design.md references domain-design.md as canonical |
| 3rd | pad_gate | PAD policy + operation routing in one function | Split into pad_gate + enrol_router stages |
| 3rd | Runner context | Value philosophy + place-oriented mutation | Dict spread; context is a value at every boundary |
| 3rd | EnrolResult.replaced | Request intent + factual outcome | EnrolAck.replaced; orchestrator uses actual outcome |
| 3rd | Event types | Domain vs FAISS definitions diverge | Domain types canonical; unified names; vector field added |
| 3rd | Backpressure | Queue decoupling + unbounded memory growth | Bounded queues, saturation policy, 503 responses |
| 3rd | Wire format | Design completeness + serialisation ambiguity | Protobuf schema, LE f32 bytes, numpy serialisation |
| 3rd | Accretion policy | Evolving contracts + silent breakage | "Provide more, require less" documented |
| 3rd | Schema redundancy | Two specs of one contract | Derive TypedDicts from requires/produces |
| 3rd | Executor DI | Stage signature + missing dependency path | functools.partial for executor stages |
| 3rd | Port file organisation | Four concern families in one module | Recommended 4-file split (open) |