Space: digopala/ai-inference-architecture-healthcare

digopala committed 10 days ago

Commit 4a81e6f (verified)

1 Parent(s): 7720c4d

Update README.md

Add reviewer-driven clarifications on preprocessing, model lifecycle, scalability, security, and data flow

Files changed (1)

README.md +32 -1

@@ -59,4 +59,35 @@ kubectl apply -f hpa.yaml
 ## 🧪 Sample Inference Request
 ```bash
 curl -X POST http://localhost:8000/infer   -H "Content-Type: application/json"   -d '{"input": "Patient data or image here"}'
-```

+```
+## Enhancements Based on Peer Technical Review
+### Preprocessing Execution Model
+NLP and CV preprocessing run as **independent Kubernetes microservices** for isolation and independent scaling. The FastAPI gateway performs **conditional routing**:
+- `content_type=image/*` → CV preprocessor → Triton
+- `content_type=text/*` → NLP preprocessor → Triton
+- Already-normalized inputs → direct to Triton
+A lightweight schema-validation step remains in the gateway.
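
The routing rules above can be sketched as a small lookup. This is a minimal illustration, not the gateway's actual code: the hop names (`cv-preprocessor`, `nlp-preprocessor`, `triton`), the `plan_route` function, and the `normalized` flag used to mark already-normalized inputs are all assumptions.

```python
from fnmatch import fnmatch

# Hypothetical route table: MIME pattern -> ordered list of service hops.
ROUTES = [
    ("image/*", ["cv-preprocessor", "triton"]),
    ("text/*", ["nlp-preprocessor", "triton"]),
]


def plan_route(content_type: str, normalized: bool = False) -> list:
    """Return the ordered service hops for one request."""
    if normalized:
        # Already-normalized inputs skip preprocessing entirely.
        return ["triton"]
    # Drop parameters such as "; charset=utf-8" before matching.
    mime = content_type.split(";", 1)[0].strip().lower()
    for pattern, hops in ROUTES:
        if fnmatch(mime, pattern):
            return list(hops)
    raise ValueError(f"unsupported content type: {content_type!r}")
```

Keeping the rules in a table rather than in branching code means a new modality would only need a new row, assuming its preprocessor exposes the same interface.
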
+### Model Lifecycle: Versioning, Promotion, Rollback
+- Models are versioned under `/models/<name>/<version>` (e.g., `/models/ner/1`).
+- CI/CD publishes new versions to **staging**; promotion updates a **release tag** (e.g., `current -> 2`), which Triton picks up via hot-reload.
+- **Rollback** re-points the tag to the last known-good version (`current -> 1`).
+- The deployment supports **blue‑green** (two deployments, Service selector switch) and **canary** (a small percentage of traffic routed to a second Triton deployment) rollouts.
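
The promote and rollback steps above amount to moving a pointer between versions. The sketch below models only that bookkeeping; the `ReleaseTag` class is hypothetical, and it assumes the previously served version is the "last known-good" one. How the tag is stored and how Triton is told to reload are left out.

```python
from dataclasses import dataclass, field


@dataclass
class ReleaseTag:
    """Tracks which version the `current` tag points to for one model."""
    name: str
    current: int
    history: list = field(default_factory=list)  # previously served versions

    def promote(self, version: int) -> None:
        """Point `current` at a new version, remembering the old one."""
        if version == self.current:
            return
        self.history.append(self.current)
        self.current = version

    def rollback(self) -> int:
        """Re-point `current` to the most recently served earlier version."""
        if not self.history:
            raise RuntimeError("no earlier version to roll back to")
        self.current = self.history.pop()
        return self.current

    def path(self) -> str:
        """Model directory following the /models/<name>/<version> layout."""
        return f"/models/{self.name}/{self.current}"
```

For example, `ReleaseTag("ner", 1)` followed by `promote(2)` resolves to `/models/ner/2`, and a subsequent `rollback()` returns it to `/models/ner/1`.
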
+### Scalability & Resilience
+- **HPA** scales Triton pods based on CPU utilization (and can be extended to custom latency metrics).
+- **Readiness/Liveness probes** guard rollout and enable auto‑healing.
+- The gateway applies timeouts and retries on transient 5xx responses. If a pod becomes unready, traffic shifts to healthy pods.
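
The timeout-and-retry behaviour could look roughly like the following. It is a sketch under stated assumptions: the set of status codes treated as transient, the attempt count, the timeout, and the exponential backoff are illustrative choices, and `send` stands in for whatever HTTP client call the gateway actually makes.

```python
import time
from typing import Callable

# Assumption: which 5xx codes are worth retrying.
TRANSIENT = {502, 503, 504}


def call_with_retry(
    send: Callable[[float], int],
    attempts: int = 3,
    timeout_s: float = 2.0,
    backoff_s: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send(timeout_s) until it returns a non-transient status code."""
    status = 0
    for attempt in range(attempts):
        try:
            status = send(timeout_s)
        except TimeoutError:
            # Treat a client-side timeout like a gateway timeout.
            status = 504
        if status not in TRANSIENT:
            return status
        if attempt < attempts - 1:
            # Exponential backoff between tries: 0.1s, 0.2s, ...
            sleep(backoff_s * 2 ** attempt)
    return status
```

Retrying is only safe here if inference requests are idempotent, which is a reasonable assumption for a stateless prediction call but should be checked for any endpoint with side effects.
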
+### Security, Compliance & Audit
+- **TLS in transit**; optional mTLS inside cluster.
+- **OAuth2/JWT** at the gateway with per‑route scopes.
+- **Audit logs** (structured JSON with `request_id`) across gateway, preprocessors, and Triton; logs ship to ELK/Loki.
+- Optional **PHI de‑identification** in preprocessors; strict schema validation; data minimization and retention controls aligned with HIPAA/GDPR.
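
A structured audit record carrying a shared `request_id` might be built as below. Apart from `request_id`, the field names (`ts`, `service`, `event`) and the `audit_record` helper are assumptions for illustration; shipping to ELK/Loki is out of scope here.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Optional


def audit_record(service: str, event: str,
                 request_id: Optional[str] = None, **fields) -> str:
    """Serialize one audit event as a single JSON line."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        # Reuse the caller's id so one request can be traced across services.
        "request_id": request_id or str(uuid.uuid4()),
        "service": service,
        "event": event,
        **fields,
    }
    return json.dumps(record, sort_keys=True)
```

The gateway would generate the id once and pass it downstream, so that gateway, preprocessor, and Triton entries can be joined on it. In keeping with data minimization, the extra fields should carry metadata only, never the request payload.
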
+### Data Flow & Validation
+- The gateway enforces **MIME/JSON schema** checks and rejects malformed or unauthorized requests.
+- Preprocessors normalize inputs (e.g., tokenize text, resize/normalize images).
+- Triton returns prediction JSON; the gateway maps it to a domain response schema and may **redact** fields per policy.
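
The validate-then-redact flow at the gateway can be sketched as two small functions. The `input` key matches the sample request in this README; everything else (the function names, the redaction policy, the `[REDACTED]` marker, and the response shape) is hypothetical.

```python
# Hypothetical redaction policy: prediction fields hidden from callers.
REDACTED_FIELDS = {"patient_name", "mrn"}


def validate_request(body: dict) -> None:
    """Reject bodies that lack a non-empty string under 'input'."""
    value = body.get("input") if isinstance(body, dict) else None
    if not isinstance(value, str) or not value:
        raise ValueError("body must contain a non-empty string 'input'")


def to_domain_response(prediction: dict, request_id: str) -> dict:
    """Map raw prediction JSON to the response schema, redacting per policy."""
    safe = {
        key: ("[REDACTED]" if key in REDACTED_FIELDS else value)
        for key, value in prediction.items()
    }
    return {"request_id": request_id, "prediction": safe}
```

Redacting at the gateway keeps the policy in one place, so the model's raw output format can change without exposing new fields by accident.
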