Update README.md
Add reviewer-driven clarifications on preprocessing, model lifecycle, scalability, security, and data flow
README.md
CHANGED
@@ -59,4 +59,35 @@ kubectl apply -f hpa.yaml
## 🧪 Sample Inference Request
```bash
curl -X POST http://localhost:8000/infer -H "Content-Type: application/json" -d '{"input": "Patient data or image here"}'
```

## Enhancements Based on Peer Technical Review

### Preprocessing Execution Model
The NLP and CV preprocessing stages run as **independent Kubernetes microservices** for isolation and scale. The FastAPI Gateway performs **conditional routing**:
- `content_type=image/*` → CV preprocessor → Triton
- `content_type=text/*` → NLP preprocessor → Triton
- Already-normalized inputs → direct to Triton

A lightweight schema-validation step remains in the gateway.

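
The routing rules above can be sketched as a small lookup in the gateway. This is a minimal illustration, not the project's actual code: the hop names (`cv-preprocessor`, `nlp-preprocessor`, `triton`) and the `already_normalized` flag are assumptions, since the README does not say how the gateway names its upstreams or detects normalized input.

```python
from fnmatch import fnmatch
from typing import List

# Hypothetical upstream names; the README does not name the actual Services.
ROUTES = [
    ("image/*", ["cv-preprocessor", "triton"]),
    ("text/*", ["nlp-preprocessor", "triton"]),
]


def choose_route(content_type: str, already_normalized: bool = False) -> List[str]:
    """Return the ordered hops a request should take through the pipeline."""
    if already_normalized:
        # Assumption: the caller signals normalized input explicitly.
        return ["triton"]
    # Drop parameters such as "; charset=utf-8" before matching.
    media_type = content_type.split(";", 1)[0].strip().lower()
    for pattern, hops in ROUTES:
        if fnmatch(media_type, pattern):
            return list(hops)
    raise ValueError(f"unsupported content type: {content_type!r}")
```

In this sketch, a content type that matches no rule is rejected rather than forwarded, which fits the gateway's role as the validation point.
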
### Model Lifecycle: Versioning, Promotion, Rollback
- Models are versioned under `/models/<name>/<version>` (e.g., `/models/ner/1`).
- CI/CD publishes new versions to **staging**; promotion updates a **release tag** (e.g., `current -> 2`), which Triton hot-reloads.
- **Rollback** re-points the tag to the last known-good version (e.g., `current -> 1`).
- The setup supports **blue‑green** (two deployments, switched via the Service selector) and **canary** (a small percentage of traffic routed to a second Triton deployment) rollouts.

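
The promotion and rollback flow can be modeled with an in-memory release tag, plus a deterministic canary split. This is a sketch under assumptions: `ReleaseTag`, `pick_deployment`, and the deployment names are hypothetical, "last known-good" is taken to mean the previously promoted version, and the real tag would live wherever the CI/CD pipeline stores it.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ReleaseTag:
    """In-memory stand-in for the `current` release tag of one model."""
    model: str
    current: Optional[int] = None
    previous: List[int] = field(default_factory=list)

    def promote(self, version: int) -> None:
        """Point the tag at `version`, remembering the one it replaces."""
        if version == self.current:
            return
        if self.current is not None:
            self.previous.append(self.current)
        self.current = version

    def rollback(self) -> int:
        """Re-point the tag to the most recently replaced version."""
        if not self.previous:
            raise RuntimeError("no earlier version to roll back to")
        self.current = self.previous.pop()
        return self.current

    def path(self) -> str:
        """Model directory the tag resolves to."""
        if self.current is None:
            raise RuntimeError("no version has been promoted yet")
        return f"/models/{self.model}/{self.current}"


def pick_deployment(request_id: str, canary_percent: int) -> str:
    """Send a stable share of requests to the canary, keyed on request ID."""
    bucket = int(hashlib.sha256(request_id.encode()).hexdigest(), 16) % 100
    return "triton-canary" if bucket < canary_percent else "triton-stable"
```

For example, promoting `ner` to version 1 and then 2 makes `path()` resolve to `/models/ner/2`, and a single `rollback()` returns it to `/models/ner/1`. Hashing the request ID keeps retries of the same request on the same deployment.
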
### Scalability & Resilience
- **HPA** scales Triton pods based on CPU utilization (and can be extended to custom latency metrics).
- **Readiness/Liveness probes** guard rollouts and enable auto‑healing.
- The gateway applies timeouts and retries on transient 5xx responses. If a pod becomes unready, traffic shifts to healthy pods.

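
The gateway's retry behavior might look like the following sketch. The choice of which status codes count as transient, the attempt limit, and the exponential backoff are assumptions for illustration; `send` is a hypothetical callable that performs one upstream request, enforces its own per-request timeout, and returns a `(status, body)` pair.

```python
import time
from typing import Callable, Tuple

# Assumption: only these 5xx codes are treated as transient.
TRANSIENT_STATUSES = {502, 503, 504}


def call_with_retry(
    send: Callable[[], Tuple[int, bytes]],
    max_attempts: int = 3,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, bytes]:
    """Call `send`, retrying with exponential backoff on transient 5xx."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        # Return on success, on a non-transient error, or when out of attempts.
        if status not in TRANSIENT_STATUSES or attempt == max_attempts:
            return status, body
        sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the function testable without real delays. Retrying is only safe here if the inference call is idempotent, which the README does not state.
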
### Security, Compliance & Audit
- **TLS in transit**; optional mTLS inside the cluster.
- **OAuth2/JWT** at the gateway with per‑route scopes.
- **Audit logs** (structured JSON with `request_id`) across the gateway, preprocessors, and Triton; logs ship to ELK/Loki.
- Optional **PHI de‑identification** in the preprocessors; strict schema validation; data minimization and retention controls aligned with HIPAA/GDPR.

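
A per-route scope check and a structured audit record could be sketched as below. The scope name `inference:write`, the record fields other than `request_id`, and both function names are hypothetical. The sketch also assumes the JWT signature and expiry have already been verified by a JWT library before the scopes are inspected.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Any, Iterable, Optional

# Hypothetical scope mapping; the README does not list the real scopes.
REQUIRED_SCOPES = {"/infer": {"inference:write"}}


def is_authorized(route: str, token_scopes: Iterable[str]) -> bool:
    """Allow the call only if the token carries every scope the route needs."""
    required = REQUIRED_SCOPES.get(route)
    if required is None:
        return False  # deny routes with no explicit policy
    return required <= set(token_scopes)


def audit_record(
    component: str,
    event: str,
    request_id: Optional[str] = None,
    **fields: Any,
) -> str:
    """Build one JSON audit line that can be correlated by `request_id`."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "component": component,
        "event": event,
        "request_id": request_id or str(uuid.uuid4()),
        **fields,
    }
    return json.dumps(record, sort_keys=True)
```

Reusing the same `request_id` in the gateway, the preprocessors, and Triton-facing code lets a single request be traced across all three in ELK or Loki. Audit fields should exclude PHI so the logs themselves stay within the retention policy.
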
### Data Flow & Validation
- The gateway enforces **MIME/JSON schema** validation and rejects malformed or unauthorized requests.
- Preprocessors normalize inputs (e.g., tokenize text, resize/normalize images).
- Triton returns prediction JSON; the gateway maps it to a domain response schema and may **redact** fields per policy.
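
The validation and redaction steps at either end of this flow can be illustrated with plain Python. The request shape follows the sample `curl` call (a JSON object with a single `input` string); rejecting unknown fields, the error messages, and the response field names in the usage note are assumptions, and mapping to the full domain response schema is left out.

```python
from typing import Any, Dict, Iterable, List


def validate_request(payload: Any) -> List[str]:
    """Return a list of schema errors; an empty list means the payload is valid."""
    if not isinstance(payload, dict):
        return ["body must be a JSON object"]
    errors: List[str] = []
    value = payload.get("input")
    if not isinstance(value, str) or not value.strip():
        errors.append("'input' must be a non-empty string")
    # Assumption: strict validation rejects fields outside the schema.
    unknown = sorted(set(payload) - {"input"})
    if unknown:
        errors.append(f"unexpected fields: {', '.join(unknown)}")
    return errors


def redact_fields(prediction: Dict[str, Any], redact: Iterable[str] = ()) -> Dict[str, Any]:
    """Copy the prediction, leaving out any field named in the redaction policy."""
    hidden = set(redact)
    return {key: value for key, value in prediction.items() if key not in hidden}
```

With a hypothetical prediction such as `{"label": "A", "score": 0.9, "patient_id": "p1"}` and a policy of `{"patient_id"}`, `redact_fields` returns only the `label` and `score` entries.
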