GCP / Terraform
The Terraform configuration at deploy/terraform/ provisions the GCP infrastructure for the UFME 200M face search benchmark. It creates 5 FAISS shard VMs and 1 orchestrator VM in a security-hardened network configuration: no external IP addresses, outbound internet via Cloud NAT, SSH via Identity-Aware Proxy (IAP) tunnel.
Infrastructure overview

    GCP project (europe-west2-a)
    │
    ├── Networking
    │   ├── Cloud Router (ufme-router)
    │   ├── Cloud NAT (ufme-nat)          ← outbound internet for VMs
    │   └── Firewall rules
    │       ├── ufme-allow-grpc           ← port 50051, shard tag only
    │       └── ufme-allow-iap-ssh        ← port 22, IAP source range 35.235.240.0/20
    │
    ├── GCS bucket (ufme-benchmark-data)  ← benchmark data transfer
    │
    ├── Service account (ufme-shard)      ← storage-ro access to GCS
    │
    ├── Shard VMs × 5 (ufme-shard-0 … ufme-shard-4)
    │   ├── Machine: n2-highmem-16 (16 vCPUs, 128 GB RAM)
    │   ├── Disk: 150 GB pd-balanced
    │   ├── OS: Ubuntu 22.04 LTS
    │   └── Tags: ufme-shard
    │
    └── Orchestrator VM × 1 (ufme-orchestrator)
        ├── Machine: e2-standard-4 (4 vCPUs, 16 GB RAM)
        ├── Disk: 50 GB pd-balanced
        ├── OS: Ubuntu 22.04 LTS
        └── Tags: ufme-orchestrator

Resources
Networking
Cloud Router + Cloud NAT: VMs have no external IP addresses. Outbound traffic (package downloads, GCS transfers) routes through Cloud NAT via the Cloud Router. This removes the public attack surface on all VMs.
Firewall (ufme-allow-grpc): permits TCP 50051 from VMs tagged ufme-orchestrator to VMs tagged ufme-shard. The benchmark runner fans gRPC requests out to all shards over this path.
Firewall (ufme-allow-iap-ssh): permits TCP 22 from the IAP source range 35.235.240.0/20 to both shard and orchestrator VMs. SSH sessions are tunnelled through Google’s IAP service with IAM-based access control; no VPN or bastion host is required.
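To make the scope of the IAP SSH rule concrete, the sketch below uses Python's standard ipaddress module to test whether a source address falls inside 35.235.240.0/20. The helper name is_iap_source is illustrative and is not part of the repository.

```python
import ipaddress

# Source range allowed by the ufme-allow-iap-ssh rule.
IAP_RANGE = ipaddress.ip_network("35.235.240.0/20")


def is_iap_source(address: str) -> bool:
    """Return True if the address lies inside the IAP source range."""
    return ipaddress.ip_address(address) in IAP_RANGE


if __name__ == "__main__":
    print(IAP_RANGE.num_addresses)         # number of addresses in the /20 block
    print(is_iap_source("35.235.241.10"))  # inside the range
    print(is_iap_source("203.0.113.7"))    # documentation address, outside the range
```

Any source outside this block, including other VMs in the same network, would not match this rule for port 22.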
GCS bucket
A GCS bucket (ufme-benchmark-data) is provisioned for benchmark data transfer, primarily the tiled MS1MV3 vectors uploaded from a local machine and downloaded by the shard VMs. The bucket has a 30-day lifecycle deletion rule.
Shard VMs
| Property | Value |
|---|---|
| Count | 5 (configurable via shard_count) |
| Machine type | n2-highmem-16 |
| vCPUs | 16 |
| RAM | 128 GB |
| Disk | 150 GB pd-balanced |
| OS | Ubuntu 22.04 LTS |
128 GB RAM accommodates both the IVF-PQ compressed index (~2.56 GB for 40M vectors at 64 B/vector) and the full-precision stored vectors for reranking (~82 GB at 40M × 2,048 B). The 150 GB disk holds the FAISS binary, Python dependencies, and snapshot files.
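The sizing figures above can be reproduced with a few lines of arithmetic. This is a rough sketch under stated assumptions: vectors are split evenly across shards (total_vectors / shard_count), sizes are in decimal GB as quoted, and IVF centroids, ID maps, and process overhead are ignored. The function name shard_memory_gb is illustrative.

```python
GB = 10**9  # decimal gigabytes, matching the figures quoted in the text

PQ_BYTES_PER_VECTOR = 64      # IVF-PQ code size per vector
RAW_BYTES_PER_VECTOR = 2_048  # full-precision vector kept for reranking
RAM_GB = 128                  # n2-highmem-16


def shard_memory_gb(total_vectors: int = 200_000_000, shard_count: int = 5) -> dict:
    """Estimate per-shard memory for PQ codes plus stored vectors, in GB."""
    per_shard = total_vectors // shard_count  # assumes an even split
    pq = per_shard * PQ_BYTES_PER_VECTOR / GB
    raw = per_shard * RAW_BYTES_PER_VECTOR / GB
    total = pq + raw
    return {
        "vectors": per_shard,
        "pq_gb": pq,
        "raw_gb": raw,
        "total_gb": total,
        "headroom_gb": RAM_GB - total,
    }


if __name__ == "__main__":
    for name, value in shard_memory_gb().items():
        print(f"{name}: {value}")
```

With the defaults this gives roughly 84.5 GB per shard. Under the same even-split assumption, lowering shard_count to 3 without also lowering total_vectors would push the estimate past 128 GB, so the two variables are best adjusted together.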
Orchestrator VM
| Property | Value |
|---|---|
| Machine type | e2-standard-4 |
| vCPUs | 4 |
| RAM | 16 GB |
| Disk | 50 GB pd-balanced |
The orchestrator runs the benchmark runner (scripts/run_benchmark.py), which fans gRPC search requests across all 5 shards, aggregates results, and posts metrics to the benchmark UI.
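The fan-out and aggregation step can be pictured as follows. This is a minimal sketch of the general pattern, not the code in scripts/run_benchmark.py: each shard is modelled as a plain callable standing in for a gRPC stub, and hits are assumed to be (distance, id) pairs where a smaller distance is better. All names here are hypothetical.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple

Hit = Tuple[float, int]  # (distance, vector id); smaller distance is better
ShardSearch = Callable[[Sequence[float], int], List[Hit]]


def fan_out_search(shards: Sequence[ShardSearch],
                   query: Sequence[float], k: int) -> List[Hit]:
    """Query every shard concurrently and keep the k best hits overall."""
    if not shards:
        return []
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        per_shard = list(pool.map(lambda search: search(query, k), shards))
    # Each shard returns its local top-k; the global top-k is drawn from their union.
    return heapq.nsmallest(k, (hit for hits in per_shard for hit in hits))


if __name__ == "__main__":
    # Stand-in shards returning canned results instead of real gRPC calls.
    def make_shard(hits: List[Hit]) -> ShardSearch:
        return lambda query, k: sorted(hits)[:k]

    shards = [make_shard([(0.30, 11), (0.10, 12)]),
              make_shard([(0.20, 21), (0.50, 22)])]
    print(fan_out_search(shards, [0.0] * 4, k=3))
```

Asking each shard for its own top k is sufficient for an exact merge, because no vector outside a shard's local top k can appear in the global top k.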
Estimated cost
| Component | Count | Hourly rate (each) | Hourly total |
|---|---|---|---|
| n2-highmem-16 (on-demand) | 5 | ~$1.18 | ~$5.90 |
| e2-standard-4 (on-demand) | 1 | ~$0.13 | ~$0.13 |
| Total | | | ~$6.03/hr |
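Stop VMs when not in use. Stopped VMs incur no compute charges, although their persistent disks are still billed. terraform destroy removes all compute resources; as described under Teardown, it also deletes the GCS bucket, so GCS storage (~$0.02/GB/month) is an ongoing cost only for as long as the bucket is kept.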
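For budgeting a run, the table's arithmetic can be wrapped in two small functions. The prices are the approximate on-demand figures from the table; disks, NAT, and network egress are deliberately left out, so treat the result as a lower bound. The function names are illustrative.

```python
# Approximate on-demand hourly prices in USD, copied from the cost table.
SHARD_HOURLY_USD = 1.18         # n2-highmem-16
ORCHESTRATOR_HOURLY_USD = 0.13  # e2-standard-4


def hourly_cost(shard_count: int = 5) -> float:
    """Approximate compute cost per hour for the shards plus one orchestrator."""
    return round(shard_count * SHARD_HOURLY_USD + ORCHESTRATOR_HOURLY_USD, 2)


def run_cost(hours: float, shard_count: int = 5) -> float:
    """Approximate compute cost of keeping the cluster running for `hours`."""
    return round(hours * hourly_cost(shard_count), 2)


if __name__ == "__main__":
    print(hourly_cost())        # default five-shard cluster
    print(hourly_cost(3))       # reduced development cluster
    print(run_cost(hours=8.0))  # one working day with five shards
```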
Configuration variables

    variable "project_id"                  # GCP project ID (required, no default)
    variable "region"                      # default: "europe-west2"
    variable "zone"                        # default: "europe-west2-a"
    variable "shard_count"                 # default: 5
    variable "shard_machine_type"          # default: "n2-highmem-16"
    variable "orchestrator_machine_type"   # default: "e2-standard-4"
    variable "shard_disk_size_gb"          # default: 150
    variable "orchestrator_disk_size_gb"   # default: 50
    variable "gcs_bucket_name"             # default: "ufme-benchmark-data"
    variable "repo_branch"                 # default: "main"
    variable "total_vectors"               # default: 200000000

Override variables via terraform.tfvars:

    project_id  = "your-gcp-project-id"
    shard_count = 3  # reduce for cost during development
    region      = "europe-west2"

Outputs
After terraform apply, the following outputs are available:

    terraform output shard_internal_ips        # [10.x.x.1, 10.x.x.2, ...]
    terraform output shard_names               # [ufme-shard-0, ufme-shard-1, ...]
    terraform output orchestrator_internal_ip  # 10.x.x.10
    terraform output shard_grpc_addresses      # "10.x.x.1:50051 10.x.x.2:50051 ..."
    terraform output gcs_bucket                # gs://ufme-benchmark-data
    terraform output estimated_hourly_cost     # ~$6.03/hr (5x n2-highmem-16 ...)

Deploy workflow
Prerequisites
- GCP project with billing enabled
- gcloud CLI authenticated: gcloud auth application-default login
- terraform >= 1.5
- GCP IAP API enabled:

    gcloud services enable iap.googleapis.com
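The version prerequisite can be checked before running anything. The sketch below assumes terraform is invoked through its version -json subcommand, which reports a terraform_version field; the helper names parse_version and terraform_ok are illustrative and not part of the repository.

```python
import json
import shutil
import subprocess
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a version such as "1.5.7" or "v1.9.0-dev" into a tuple of integers."""
    core = text.lstrip("v").split("-")[0]  # drop a leading "v" and any pre-release suffix
    return tuple(int(part) for part in core.split("."))


def terraform_ok(minimum: Tuple[int, ...] = (1, 5)) -> bool:
    """Return True if terraform is on PATH and at least the minimum version."""
    if shutil.which("terraform") is None:
        return False
    out = subprocess.run(["terraform", "version", "-json"],
                         capture_output=True, text=True, check=True).stdout
    return parse_version(json.loads(out)["terraform_version"]) >= minimum


if __name__ == "__main__":
    print("gcloud found:", shutil.which("gcloud") is not None)
    print("terraform >= 1.5:", terraform_ok())
```

Comparing tuples rather than strings avoids the usual pitfall where "1.10" sorts before "1.5" lexically.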
Provision infrastructure

    cd deploy/terraform
    terraform init
    terraform plan -var="project_id=YOUR_PROJECT_ID"
    terraform apply -var="project_id=YOUR_PROJECT_ID"

SSH to VMs via IAP

    # Shard VM
    gcloud compute ssh ufme-shard-0 \
      --project=YOUR_PROJECT_ID \
      --zone=europe-west2-a \
      --tunnel-through-iap

    # Orchestrator
    gcloud compute ssh ufme-orchestrator \
      --project=YOUR_PROJECT_ID \
      --zone=europe-west2-a \
      --tunnel-through-iap

Copy files to VMs via IAP SCP

    gcloud compute scp scripts/setup_shard_vm.sh ufme-shard-0:/tmp/ \
      --project=YOUR_PROJECT_ID \
      --zone=europe-west2-a \
      --tunnel-through-iap

Start/stop VMs (save cost)

    # Stop all VMs
    gcloud compute instances stop ufme-shard-0 ufme-shard-1 ufme-shard-2 \
      ufme-shard-3 ufme-shard-4 ufme-orchestrator \
      --zone=europe-west2-a --project=YOUR_PROJECT_ID

    # Start all VMs
    gcloud compute instances start ufme-shard-0 ufme-shard-1 ufme-shard-2 \
      ufme-shard-3 ufme-shard-4 ufme-orchestrator \
      --zone=europe-west2-a --project=YOUR_PROJECT_ID

Teardown

    cd deploy/terraform
    terraform destroy -var="project_id=YOUR_PROJECT_ID"

This removes all compute instances, firewall rules, NAT router, and the GCS bucket (including benchmark data). Nothing persists after destroy.
Shard setup
After VMs are running, each shard must be initialised. The idempotent scripts/setup_shard_vm.sh script handles FAISS build, Rust build, IVF-PQ index training, and vector insertion. It is safe to re-run: guards check for existing installations before rebuilding.

    # Copy setup script to shard
    gcloud compute scp scripts/setup_shard_vm.sh ufme-shard-0:/tmp/ \
      --tunnel-through-iap --zone=europe-west2-a --project=YOUR_PROJECT_ID

    # Launch in background (stdin redirect required to survive IAP session closure)
    gcloud compute ssh ufme-shard-0 --tunnel-through-iap \
      --zone=europe-west2-a --project=YOUR_PROJECT_ID \
      --command='SHARD_ID=0 nohup bash /tmp/setup_shard_vm.sh > /tmp/setup.log 2>&1 </dev/null &'

    # Monitor progress
    gcloud compute ssh ufme-shard-0 --tunnel-through-iap \
      --zone=europe-west2-a --project=YOUR_PROJECT_ID \
      --command='tail -50 /tmp/setup.log'

The script runs through 7 steps:
- Install system packages (build-essential, cmake, etc.)
- Build FAISS with CMake 3.31 (cmake.org binary, not Ubuntu’s outdated 3.22)
- Build Rust FAISS bindings
- Train IVF-PQ index (~105 min per shard — logs appear frozen during k-means)
- Insert 40M vectors
- Verify index integrity
- Start gRPC shard server on port 50051
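Once the final step has run, a quick way to confirm that each shard is listening is a plain TCP connection test from the orchestrator, which is the only source the ufme-allow-grpc rule admits. The sketch below takes the space-separated string produced by the shard_grpc_addresses output; the function names are illustrative, and a successful connection only shows that something is listening on the port, not that the gRPC service is healthy.

```python
import socket
import sys
from typing import List, Tuple


def parse_addresses(raw: str) -> List[Tuple[str, int]]:
    """Split a space-separated "host:port" string into (host, port) pairs."""
    pairs = []
    for item in raw.split():
        host, _, port = item.rpartition(":")
        pairs.append((host, int(port)))
    return pairs


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Pass the address string as a single quoted argument.
    raw = sys.argv[1] if len(sys.argv) > 1 else ""
    for host, port in parse_addresses(raw):
        state = "open" if port_open(host, port) else "closed"
        print(f"{host}:{port} {state}")
```

Because index training can take well over an hour per shard, a closed port shortly after launch usually means the script is still running rather than that it has failed; the setup log remains the authoritative signal.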