digopala committed on
Commit
4a81e6f
·
verified ·
1 Parent(s): 7720c4d

Update README.md

Add reviewer-driven clarifications on preprocessing, model lifecycle, scalability, security, and data flow

Files changed (1)
  1. README.md +32 -1
README.md CHANGED
@@ -59,4 +59,35 @@ kubectl apply -f hpa.yaml
  ## 🧪 Sample Inference Request
  ```bash
  curl -X POST http://localhost:8000/infer -H "Content-Type: application/json" -d '{"input": "Patient data or image here"}'
+ ```
+
+ ## Enhancements Based on Peer Technical Review
+
+ ### Preprocessing Execution Model
+ The NLP/CV preprocessing stage runs as an **independent Kubernetes microservice** for isolation and independent scaling. The FastAPI Gateway performs **conditional routing**:
+ - `content_type=image/*` → CV preprocessor → Triton
+ - `content_type=text/*` → NLP preprocessor → Triton
+ - Already-normalized inputs → direct to Triton
+ A lightweight schema-validation step remains in the gateway.
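
The routing rules above can be sketched as a small pure function. This is only an illustration, not the gateway's actual code: the `Route` enum, the `select_route` name, and the `already_normalized` flag are hypothetical, and the README does not say how a request signals that it is already normalized.

```python
from enum import Enum


class Route(Enum):
    """Hypothetical route names; the real service names are not given in the README."""
    CV_PREPROCESSOR = "cv-preprocessor"
    NLP_PREPROCESSOR = "nlp-preprocessor"
    TRITON_DIRECT = "triton"


def select_route(content_type: str, already_normalized: bool = False) -> Route:
    """Pick the next hop for a request based on its content type."""
    # Assumption: the caller has some way to mark inputs as already normalized.
    if already_normalized:
        return Route.TRITON_DIRECT
    # Drop parameters such as "; charset=utf-8" before matching.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type.startswith("image/"):
        return Route.CV_PREPROCESSOR
    if media_type.startswith("text/"):
        return Route.NLP_PREPROCESSOR
    raise ValueError(f"unsupported content type: {content_type!r}")
```

Keeping the decision in a pure function makes it easy to unit-test separately from the HTTP layer; a FastAPI handler would call it and forward the request to the selected service.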
+
+ ### Model Lifecycle: Versioning, Promotion, Rollback
+ - Models are versioned under `/models/<name>/<version>` (e.g., `/models/ner/1`).
+ - CI/CD publishes to **staging**; promotion updates a **release tag** (e.g., `current -> 2`), which Triton picks up via hot reload.
+ - **Rollback** re-points the tag to the last known-good version (e.g., `current -> 1`).
+ - Supports **blue‑green** (two deployments, switched via the Service selector) and **canary** (a small percentage of traffic routed to a second Triton deployment) rollouts.
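
The promote and rollback semantics of the release tag can be illustrated with an in-memory model. This is a sketch under assumptions: `ReleaseTag` and its methods are hypothetical names, "last known-good" is approximated as the previously promoted version, and how Triton is actually told to reload is not shown.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ReleaseTag:
    """Hypothetical in-memory stand-in for the `current` release tag of one model."""
    model: str
    current: Optional[int] = None
    history: List[int] = field(default_factory=list)

    def promote(self, version: int) -> None:
        """Point `current` at a new version, remembering the previous one."""
        if self.current is not None:
            self.history.append(self.current)
        self.current = version

    def rollback(self) -> int:
        """Re-point `current` at the previously promoted version."""
        if not self.history:
            raise RuntimeError("no earlier version to roll back to")
        self.current = self.history.pop()
        return self.current

    def path(self) -> str:
        """Resolve the tag to the `/models/<name>/<version>` layout."""
        if self.current is None:
            raise RuntimeError("no version has been promoted yet")
        return f"/models/{self.model}/{self.current}"
```

For example, promoting versions 1 and then 2 of `ner` resolves to `/models/ner/2`, and a single `rollback()` returns the tag to `/models/ner/1`.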
+
+ ### Scalability & Resilience
+ - **HPA** scales Triton pods based on CPU (and can be extended to custom latency metrics).
+ - **Readiness/Liveness probes** guard rollouts and enable auto‑healing.
+ - The gateway applies timeouts and retries on transient 5xx responses. If a pod is Unready, traffic shifts to healthy pods.
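
A minimal sketch of the gateway's retry behaviour follows. Several details are assumptions, because the README does not specify them: the set of status codes treated as transient, the attempt limit, and the exponential backoff. The `send` callable is a hypothetical stand-in for the real HTTP call and is expected to enforce its own timeout.

```python
import time
from typing import Callable, Tuple

# Assumption: the README does not list which 5xx codes count as transient.
TRANSIENT_STATUSES = frozenset({502, 503, 504})

Response = Tuple[int, str]  # (status code, body)


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 3,
    base_delay_s: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send`, retrying with exponential backoff on transient statuses."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        # Return on success, on a non-transient error, or when attempts run out.
        if status not in TRANSIENT_STATUSES or attempt == max_attempts:
            return status, body
        sleep(base_delay_s * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the backoff testable without real delays. Retries are only safe if the inference call is idempotent, which this sketch assumes.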
+
+ ### Security, Compliance & Audit
+ - **TLS in transit**; optional mTLS inside the cluster.
+ - **OAuth2/JWT** at the gateway with per‑route scopes.
+ - **Audit logs** (structured JSON with `request_id`) across the gateway, preprocessors, and Triton; logs ship to ELK/Loki.
+ - Optional **PHI de‑identification** in preprocessors; strict schema validation; data minimization and retention controls aligned with HIPAA/GDPR.
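
To make the audit-log bullet concrete, here is one possible shape for a structured JSON record that carries a shared `request_id`. Apart from `request_id`, every field name is an assumption for illustration, and `audit_record` is a hypothetical helper, not part of the project.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Any, Optional


def audit_record(
    component: str,
    event: str,
    request_id: Optional[str] = None,
    **fields: Any,
) -> str:
    """Build one JSON audit line; callers must not pass PHI in `fields`."""
    core = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "component": component,  # e.g. "gateway", "nlp-preprocessor", "triton"
        "event": event,
        # Reuse the caller's ID so one request can be traced across services.
        "request_id": request_id or str(uuid.uuid4()),
    }
    # Core keys are applied last so extra fields cannot overwrite them.
    return json.dumps({**fields, **core}, sort_keys=True)
```

The gateway would generate the `request_id` once and forward it, so that the preprocessors and Triton-side logging emit records with the same value.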
+
+ ### Data Flow & Validation
+ - The gateway enforces **MIME/JSON schema** validation and rejects malformed or unauthorized requests.
+ - Preprocessors normalize inputs (e.g., tokenize text, resize/normalize images).
+ - Triton returns prediction JSON; the gateway maps it to a domain response schema and may **redact** fields per policy.
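
The validation and redaction steps might look like the following sketch. It assumes the request body is the `{"input": "..."}` object from the sample `curl` call. The function names, the mask string, and the response field names used in the test are hypothetical.

```python
from typing import Any, Dict, Iterable


def validate_request(payload: Any) -> Dict[str, Any]:
    """Reject payloads that do not match the assumed {"input": str} shape."""
    if not isinstance(payload, dict):
        raise ValueError("payload must be a JSON object")
    value = payload.get("input")
    if not isinstance(value, str) or not value.strip():
        raise ValueError("'input' must be a non-empty string")
    return payload


def redact(
    response: Dict[str, Any],
    fields: Iterable[str],
    mask: str = "[REDACTED]",
) -> Dict[str, Any]:
    """Return a copy of `response` with the named top-level fields masked."""
    blocked = set(fields)
    return {key: (mask if key in blocked else value) for key, value in response.items()}
```

A real deployment would more likely express the schema declaratively (for example with the Pydantic models FastAPI already uses) and load the redaction policy from configuration rather than hard-coding field names.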