Request Flow Walkthrough
This page traces a request through every layer of the system. If you are new to UFME, read the Glossary first for term definitions.
System overview
At a high level, UFME has three layers: the REST gateway (accepts HTTP requests), the pipeline (processes the face image through a series of stages), and the vector store (searches or mutates the gallery).
flowchart LR
  Client -->|multipart/form-data| Gateway
  Gateway -->|XML envelope| Pipeline
  Pipeline -->|template vector| VectorStore
  VectorStore -->|ranked candidates| Pipeline
  Pipeline -->|XML response| Gateway
  Gateway -->|HTTP response| Client
1:N search, step by step
This is the most common operation. The client submits a photo and receives a ranked list of matching identities.
Step 1: REST gateway
The gateway receives a POST /api/v1/search request with an image file and JSON metadata. It:
- Validates the request (content type, image presence, metadata format)
- Generates a unique trace_id for this request
- Translates the request into an XML envelope (the pipeline’s internal format)
- Pushes the envelope onto the receive_q (an asyncio queue)
- Creates an asyncio.Future keyed by trace_id and waits for the pipeline to resolve it
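The queue-and-Future handoff described in the list above can be sketched roughly as follows. This is a minimal illustration, not UFME's actual gateway code: `pending`, `submit`, and `fake_pipeline` are hypothetical names, and the envelope is a placeholder string rather than a real XML document.

```python
import asyncio
import uuid

# Hypothetical registry of in-flight requests, keyed by trace_id.
pending: dict[str, asyncio.Future] = {}


async def submit(receive_q: asyncio.Queue, envelope: str) -> str:
    """Enqueue an envelope, then wait until the pipeline resolves its Future."""
    trace_id = uuid.uuid4().hex
    future = asyncio.get_running_loop().create_future()
    pending[trace_id] = future  # Register before enqueueing to avoid a race.
    await receive_q.put({"trace_id": trace_id, "envelope": envelope})
    try:
        return await future
    finally:
        pending.pop(trace_id, None)  # Always clean up, even on cancellation.


async def fake_pipeline(receive_q: asyncio.Queue) -> None:
    """Stand-in for the pipeline: wraps each envelope and resolves its Future."""
    while True:
        item = await receive_q.get()
        response = f"<response>{item['envelope']}</response>"
        pending[item["trace_id"]].set_result(response)


async def demo() -> str:
    receive_q: asyncio.Queue = asyncio.Queue()
    worker = asyncio.create_task(fake_pipeline(receive_q))
    try:
        return await submit(receive_q, "<search/>")
    finally:
        worker.cancel()


result = asyncio.run(demo())
```

Keying the Future by trace_id lets many requests be in flight at once while each HTTP handler waits only for its own result.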
Step 2: Pipeline stages
The pipeline runner pulls the envelope from receive_q and passes it through stages. Each stage is a pure async function that takes a dict and returns a modified dict.
flowchart TD
  Receive["Receive\n(parse XML envelope)"]
  Detect["Detect\n(SCRFD: find faces + landmarks)"]
  Align["Align\n(affine warp to 112x112)"]
  PAD["PAD Gate\n(MiniFASNetV2: spoof check)"]
  Quality["Quality Gate\n(eDifFIQA: ISO quality score)"]
  Extract["Extract\n(ArcFace: 512-dim embedding)"]
  Search["Search\n(FAISS: top-K nearest neighbours)"]
  Respond["Respond\n(build XML response)"]
  Receive --> Detect
  Detect --> Align
  Align --> PAD
  PAD -->|pass| Quality
  PAD -->|fail: spoof detected| Respond
  Quality -->|pass| Extract
  Quality -->|fail: below threshold| Respond
  Extract --> Search
  Search --> Respond
  style PAD fill:#fff3cd
  style Quality fill:#fff3cd
Gates (yellow) can short-circuit the pipeline. If PAD detects a spoof, processing stops immediately and an error response is returned; the image never reaches extraction or search.
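As a rough sketch of how stages and gates might fit together, the runner below chains async functions over a dict and stops as soon as a stage records an error. The stage bodies, the `spoof_score` key, the `error` key, and the 0.5 threshold are all assumptions made for illustration; the real stages call the models named in the diagram.

```python
import asyncio
from typing import Awaitable, Callable

Stage = Callable[[dict], Awaitable[dict]]


async def detect(ctx: dict) -> dict:
    # Placeholder for face detection; returns a new dict rather than mutating.
    return {**ctx, "faces": 1}


async def pad_gate(ctx: dict) -> dict:
    # Hypothetical rule: flag the request when the spoof score is too high.
    if ctx.get("spoof_score", 0.0) > 0.5:
        return {**ctx, "error": "spoof detected"}
    return ctx


async def extract(ctx: dict) -> dict:
    # Placeholder for embedding extraction (512-dim template).
    return {**ctx, "template": [0.0] * 512}


async def run_pipeline(ctx: dict, stages: list[Stage]) -> dict:
    """Run stages in order; stop as soon as one records an error."""
    for stage in stages:
        ctx = await stage(ctx)
        if "error" in ctx:
            break  # Short-circuit: later stages never see this request.
    return ctx


stages = [detect, pad_gate, extract]
ok = asyncio.run(run_pipeline({"spoof_score": 0.1}, stages))
blocked = asyncio.run(run_pipeline({"spoof_score": 0.9}, stages))
```

In this sketch `ok` carries a template, while `blocked` carries only the error, which mirrors the "fail" edges that jump straight to Respond.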
Optional stages (not shown above) may be inserted depending on which model files are present: super-resolution (before detect), head pose estimation (after align), deepfake detection (after align), age estimation (after align), and morphing detection (after quality, enrol only).
Step 3: Vector search
The search stage sends the 512-dim template to the vector store. In sharded mode, this is a scatter-gather operation:
flowchart TD
  API["API process"]
  S0["Shard 0"]
  S1["Shard 1"]
  S2["Shard 2"]
  S3["Shard 3"]
  S4["Shard 4"]
  Merge["Merge + rerank"]
  API -->|"query vector (gRPC)"| S0
  API -->|"query vector (gRPC)"| S1
  API -->|"query vector (gRPC)"| S2
  API -->|"query vector (gRPC)"| S3
  API -->|"query vector (gRPC)"| S4
  S0 -->|"top-50 PQ candidates"| Merge
  S1 -->|"top-50 PQ candidates"| Merge
  S2 -->|"top-50 PQ candidates"| Merge
  S3 -->|"top-50 PQ candidates"| Merge
  S4 -->|"top-50 PQ candidates"| Merge
  Merge -->|"top-K exact reranked"| API
Each shard scans its local IVF-PQ index (probing nprobe cells) and returns up to local_k candidates. The API merges all candidates, optionally reranks with full-precision vectors, and returns the top K.
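The merge-and-rerank step could look something like the sketch below. It assumes templates are L2-normalised so that a higher inner product means a closer match, and that each shard returns `(approx_score, subject_id)` pairs; `merge_and_rerank` and `full_vectors` are hypothetical names, not UFME's API.

```python
import heapq
import math


def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def merge_and_rerank(query, shard_results, full_vectors, k):
    """Merge per-shard candidates, rescore with exact vectors, keep the top k.

    Candidates without a full-precision vector keep their approximate score.
    """
    best: dict[str, float] = {}
    for candidates in shard_results:
        for approx_score, subject_id in candidates:
            vec = full_vectors.get(subject_id)
            score = dot(query, vec) if vec is not None else approx_score
            best[subject_id] = max(score, best.get(subject_id, -math.inf))
    return heapq.nlargest(k, ((score, sid) for sid, score in best.items()))


query = [1.0, 0.0]
shard_results = [
    [(0.90, "a"), (0.70, "b")],  # Shard 0: approximate (PQ) scores.
    [(0.85, "c")],               # Shard 1.
]
full_vectors = {"a": [0.6, 0.8], "b": [1.0, 0.0], "c": [0.8, 0.6]}
top = merge_and_rerank(query, shard_results, full_vectors, k=2)
top_ids = [sid for _, sid in top]
```

Here the approximate scores rank "a" first, but the exact rerank promotes "b" and "c", which is the reason to rerank with full-precision vectors at all.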
If a shard does not respond within deadline_seconds, the partial_result_policy determines behaviour: annotate (return results with a warning), reject (fail the request), or degrade (return partial results silently).
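One way to picture the deadline and the three policies is the asyncio sketch below. The shard call is simulated with `asyncio.sleep`, and the result shape (a dict with `candidates` and an optional `warning`) is an assumption; only the names deadline_seconds and the policy values come from the text above.

```python
import asyncio


async def query_shard(shard_id: int, delay: float) -> list[tuple[float, str]]:
    """Stand-in for a shard RPC; `delay` simulates network and scan time."""
    await asyncio.sleep(delay)
    return [(0.9 - 0.1 * shard_id, f"subject-{shard_id}")]


async def scatter_gather(delays: list[float], deadline_seconds: float, policy: str) -> dict:
    tasks = [asyncio.create_task(query_shard(i, d)) for i, d in enumerate(delays)]
    done, late = await asyncio.wait(tasks, timeout=deadline_seconds)
    for task in late:
        task.cancel()  # Do not leave slow shard calls running.
    candidates = sorted((c for t in done for c in t.result()), reverse=True)
    if not late:
        return {"candidates": candidates}
    if policy == "reject":
        raise TimeoutError(f"{len(late)} shard(s) missed the deadline")
    if policy == "annotate":
        return {"candidates": candidates, "warning": f"{len(late)} shard(s) missing"}
    if policy == "degrade":
        return {"candidates": candidates}  # Partial results, no warning.
    raise ValueError(f"unknown policy: {policy}")


# Shard 2 is slower than the deadline, so only shards 0 and 1 contribute.
annotated = asyncio.run(scatter_gather([0.0, 0.0, 0.5], 0.1, "annotate"))
degraded = asyncio.run(scatter_gather([0.0, 0.0, 0.5], 0.1, "degrade"))
```

The only difference between annotate and degrade in this sketch is whether the caller can tell that the result is partial.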
Step 4: Response
The respond stage builds the XML response and resolves the gateway’s asyncio.Future for the request’s trace_id. The gateway then returns the HTTP response to the client.
Enrol flow
Enrolment runs a stricter pipeline than search because the template will persist in the gallery.
flowchart TD
  Receive --> Detect
  Detect --> Align
  Align --> PAD
  PAD -->|pass| MAD["MAD Gate\n(morphing detection)"]
  PAD -->|fail| Respond
  MAD -->|pass| Quality
  MAD -->|fail: morphed image| Respond
  Quality -->|pass| Extract
  Quality -->|fail| Respond
  Extract --> Enrol["Enrol\n(store template in gallery)"]
  Enrol --> EventLog["Event Log\n(append EnrolEvent)"]
  EventLog --> Respond
  style PAD fill:#fff3cd
  style MAD fill:#fff3cd
  style Quality fill:#fff3cd
Key differences from search:
- MAD gate is active (morphing detection) — prevents blended identity photos from entering the gallery
- The template is stored in the vector index, not used for a query
- An EnrolEvent is appended to the event log (immutable audit trail)
Verify flow (1:1)
Verification compares a probe against a single enrolled subject, rather than searching the entire gallery.
flowchart TD
  Receive --> Detect
  Detect --> Align
  Align --> PAD
  PAD -->|pass| Quality
  PAD -->|fail| Respond
  Quality -->|pass| Extract
  Quality -->|fail| Respond
  Extract --> Lookup["Lookup\n(fetch subject's template)"]
  Lookup --> Compare["Compare\n(cosine similarity)"]
  Compare --> Respond
  style PAD fill:#fff3cd
  style Quality fill:#fff3cd
The pipeline extracts a template from the probe, fetches the enrolled template for the given subject_id, computes cosine similarity, and returns match/no-match with the score.
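The compare step reduces to a cosine similarity and a threshold check, roughly as sketched here. The `verify` function, its return shape, and the 0.4 threshold are illustrative assumptions; the actual decision threshold is a deployment setting that this page does not state.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """cos(a, b) = (a . b) / (|a| * |b|); rejects zero-length vectors."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def verify(probe: list[float], enrolled: list[float], threshold: float) -> dict:
    """Return a match decision together with the raw score."""
    score = cosine_similarity(probe, enrolled)
    return {"match": score >= threshold, "score": score}


# Toy 2-dim vectors stand in for the 512-dim templates.
same = verify([1.0, 0.0], [2.0, 0.0], threshold=0.4)
different = verify([1.0, 0.0], [0.0, 3.0], threshold=0.4)
```

Because cosine similarity ignores vector length, the scaled copy in `same` still scores 1.0, while the orthogonal vector in `different` scores 0.0.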
Delete flow
Deletion does not require an image.
flowchart TD
  Receive --> Delete["Delete\n(remove from vector index)"]
  Delete --> EventLog["Event Log\n(append DeleteEvent)"]
  EventLog --> Respond
The template is removed from the FAISS index and a DeleteEvent is appended to the event log. The event log is append-only: deletions are recorded but old enrolment events are never removed, providing a full audit trail.
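The append-only behaviour can be illustrated with a small in-memory sketch. A plain dict stands in for the FAISS index, and the event fields shown are assumptions: the source names EnrolEvent and DeleteEvent but does not describe their contents or how the log is stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnrolEvent:
    subject_id: str  # Hypothetical field; real events likely carry more.


@dataclass(frozen=True)
class DeleteEvent:
    subject_id: str


class EventLog:
    """Append-only: exposes append and read, but no update or remove."""

    def __init__(self) -> None:
        self._events: list = []

    def append(self, event) -> None:
        self._events.append(event)

    def history(self, subject_id: str) -> list:
        return [e for e in self._events if e.subject_id == subject_id]


def enrol(index: dict, log: EventLog, subject_id: str, template: list) -> None:
    index[subject_id] = template
    log.append(EnrolEvent(subject_id))


def delete(index: dict, log: EventLog, subject_id: str) -> bool:
    """Remove the template if present and record the deletion."""
    if index.pop(subject_id, None) is None:
        return False
    log.append(DeleteEvent(subject_id))
    return True


index, log = {}, EventLog()
enrol(index, log, "subject-1", [0.1, 0.2])
deleted = delete(index, log, "subject-1")
kinds = [type(e).__name__ for e in log.history("subject-1")]
```

After the delete, the index no longer holds the template, yet the log still shows both the enrolment and the deletion for that subject.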
Timing breakdown
The respond stage includes per-stage timing in the XML response (as a <timing> element). This is useful for identifying bottlenecks:
<timing>
  <detect>12.3</detect>
  <align>1.2</align>
  <pad>8.7</pad>
  <quality>5.1</quality>
  <extract>15.4</extract>
  <search>2.1</search>
</timing>
Values are in milliseconds. In a typical search on CPU, detection (12 to 20 ms) and extraction (15 to 25 ms) are the most expensive stages. FAISS search is usually under 3 ms even at 200M scale.
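To find the bottleneck programmatically, a client could parse the timing element with the standard library, as in this sketch. It uses the sample values shown above; `slowest_stages` is a hypothetical helper, and the sketch assumes the element has already been pulled out of the full XML response.

```python
import xml.etree.ElementTree as ET

TIMING_XML = """<timing>
  <detect>12.3</detect>
  <align>1.2</align>
  <pad>8.7</pad>
  <quality>5.1</quality>
  <extract>15.4</extract>
  <search>2.1</search>
</timing>"""


def slowest_stages(xml_text: str, top: int = 2) -> list[tuple[str, float]]:
    """Return the `top` most expensive stages as (name, milliseconds) pairs."""
    root = ET.fromstring(xml_text)
    timings = {child.tag: float(child.text) for child in root}
    return sorted(timings.items(), key=lambda kv: kv[1], reverse=True)[:top]


bottlenecks = slowest_stages(TIMING_XML)
# → [("extract", 15.4), ("detect", 12.3)]
```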