Cloud security architecture for a medical group, three years on.
Our 2023 architecture for a medical service group — VPN, segmentation, ELK, AWS+Azure dual-cloud — held up. The pieces that aged are exactly the ones that became MGF-relevant in 2026.
Three years ago we wrote up the security architecture we’d built for a Singapore medical service group: a dual-cloud (AWS + Azure) deployment with OpenVPN, network segmentation, AWS Shield / Azure DDoS Protection, ELK for centralized logs, and a custom incident-detection pipeline using Lambda + Azure Functions. The bones are still in production. The pieces that aged are the ones now in scope under IMDA’s Model AI Governance Framework for Agentic AI (MGF, January 2026) and the increasingly active PDPC enforcement around medical data.
What carried over unchanged
- Dual-cloud structure. We still run primary on AWS, secondary on Azure, with a documented DR runbook. The argument for it has actually strengthened: buyers who ask about agentic-AI risk now ask the same kind of question about cloud-vendor lock-in.
- Network segmentation by sensitivity tier. Patient-data subnet, public-portal subnet, log-aggregation subnet, admin-jumphost subnet. Same shape today.
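- OpenVPN for site-to-site connectivity. Boring, well-understood, audited. We never moved off it for that purpose, although human operator access has since moved to SSO-fronted bastions.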
- Server hardening + least-privilege SSH. Same defaults, more documentation.
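The segmentation-by-tier idea in the list above can be sketched as a lookup from an address to its sensitivity tier. This is a minimal illustration only: the tier names come from the article, but the CIDR ranges and the `tier_of` helper are hypothetical, and real enforcement lives in VPC/VNet routing and security groups rather than application code.

```python
import ipaddress
from typing import Optional

# Hypothetical CIDR ranges; the article names the tiers but not their addressing.
TIERS = {
    "patient-data": ipaddress.ip_network("10.0.10.0/24"),
    "public-portal": ipaddress.ip_network("10.0.20.0/24"),
    "log-aggregation": ipaddress.ip_network("10.0.30.0/24"),
    "admin-jumphost": ipaddress.ip_network("10.0.40.0/24"),
}


def tier_of(address: str) -> Optional[str]:
    """Return the name of the tier whose subnet contains the address, or None."""
    ip = ipaddress.ip_address(address)
    for name, network in TIERS.items():
        if ip in network:
            return name
    return None


print(tier_of("10.0.10.7"))    # patient-data
print(tier_of("192.168.1.1"))  # None
```

A lookup like this is mostly useful in review tooling, for example to flag a firewall rule that crosses tiers unexpectedly.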
What we changed in 2024
A few things shifted over the last 18 months, before MGF was even on the table:
From OpenVPN to SSO-fronted bastions for human ops
OpenVPN is still in the architecture for site-to-site. For human operator access, we moved to SSO-fronted bastion hosts (Azure AD / Entra ID, or AWS IAM Identity Center). It removed a class of credential-management mistakes — every operator now logs in with a corporate identity, and access is time-boxed.
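Time-boxed access can be pictured as a grant with an expiry that is checked whenever a session starts. The sketch below assumes a hypothetical `AccessGrant` record with made-up operator and bastion names; in a real deployment the identity provider (Entra ID or IAM Identity Center) enforces the expiry, not application code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class AccessGrant:
    """Hypothetical record of a time-boxed bastion grant tied to a corporate identity."""
    operator: str        # corporate identity, e.g. an SSO username
    bastion: str         # which bastion host the grant covers
    issued_at: datetime
    ttl: timedelta       # how long the grant stays valid

    def is_valid(self, now: datetime) -> bool:
        # Valid from issue time up to, but not including, the expiry instant.
        return self.issued_at <= now < self.issued_at + self.ttl


issued = datetime(2026, 1, 5, 9, 0, tzinfo=timezone.utc)
grant = AccessGrant("alice@example.com", "bastion-a", issued, timedelta(hours=4))

print(grant.is_valid(issued + timedelta(hours=1)))  # True
print(grant.is_valid(issued + timedelta(hours=5)))  # False
```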
From “AWS Managed Antivirus” to EDR
The 2023 article mentioned AWS Managed Antivirus and Azure Antimalware. We have since retired both in favour of EDR products (CrowdStrike, SentinelOne, Defender for Cloud) on the workloads that warrant them. The medical group is one such workload.
ELK is now OpenSearch
A practical migration: we run OpenSearch Service (AWS) with cross-account replication to a secondary region, plus a small Microsoft Sentinel (formerly Azure Sentinel) feed for cross-cloud correlation. Running ELK as a self-managed stack stopped being worth the operational tax around 2024.
What MGF added in 2026
The framework was published in January 2026 and aligned, by accident or design, almost exactly with the controls we already had in this architecture. A few items now warrant explicit treatment:
Identity for AI agents
If a medical workload runs any agent (a triage assistant, a coding-suggestions agent for the records team, an admin chat agent), MGF treats that agent as a principal needing its own scoped IAM. Practically:
- The agent has a dedicated IAM role / Entra service principal.
- Its access is scoped to a vetted procedure surface, not raw record tables.
- Every agent action carries the agent identity in the audit log.
For this client we retrofitted an agent-identity layer in 2025; if you’re starting fresh, design it in.
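The three properties above (a dedicated identity, a vetted procedure surface, and the agent identity on every audit entry) can be sketched in a few lines. Everything here is hypothetical: `AgentPrincipal`, the agent ID, and the procedure names are illustrative, and in production the scoping would be done by the IAM role or Entra service principal plus database permissions, not an in-process check.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AgentPrincipal:
    """Hypothetical agent identity with an allow-list of vetted procedures."""
    agent_id: str
    allowed_procedures: frozenset
    audit_log: list = field(default_factory=list)

    def call(self, procedure: str, **params) -> bool:
        allowed = procedure in self.allowed_procedures
        # Every attempt, allowed or denied, is recorded with the agent identity.
        self.audit_log.append(json.dumps({
            "ts": datetime.now(timezone.utc).isoformat(),
            "principal": self.agent_id,
            "procedure": procedure,
            "params": sorted(params),  # parameter names only, never values
            "allowed": allowed,
        }))
        return allowed


triage = AgentPrincipal("agent:triage-assistant", frozenset({"get_appointment_slots"}))
print(triage.call("get_appointment_slots", clinic_id="c-01"))  # True
print(triage.call("select_from_patient_records"))              # False
```

Logging denied attempts as well as allowed ones is a deliberate choice in this sketch: an agent probing outside its surface is exactly the event an auditor will want to see.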
Audit log integrity, off-box
The 2023 architecture sent SQL Server and Windows logs to ELK. In 2026 we additionally:
- Ship logs to a write-only sink the DBA cannot delete from.
- Replicate the log store cross-account into a separate AWS account whose credentials no operational team holds.
- Run a daily liveness check of the audit pipeline and alert on failure.
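The daily liveness check in the last item can be as simple as asking when the off-box sink last received an event. The sketch below is one possible shape, not the production check: the function name, the 24-hour threshold, and the idea of reading a "newest event" timestamp from the sink are all assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def audit_pipeline_alive(
    last_event_at: Optional[datetime],
    now: datetime,
    max_silence: timedelta = timedelta(hours=24),  # assumed threshold
) -> bool:
    """Return True if the sink received an event within max_silence of now."""
    if last_event_at is None:
        # An empty sink is treated as a failure, not as "nothing to report".
        return False
    return now - last_event_at <= max_silence


now = datetime(2026, 3, 2, 6, 0, tzinfo=timezone.utc)
print(audit_pipeline_alive(now - timedelta(hours=3), now))   # True
print(audit_pipeline_alive(now - timedelta(hours=30), now))  # False
print(audit_pipeline_alive(None, now))                       # False
```

A `False` result is what would trigger the alert. Running the check from the separate log account keeps it independent of the teams it is meant to watch.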
Model-version pinning and inference logging
Every LLM call from any agent in the system is logged with: model name, model version, prompt template hash, response, cost, latency, and the user identity that triggered it. This is now part of the audit story we hand to procurement when a buyer asks “how do you know your model didn’t drift.”
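A record with those fields might look like the sketch below. The field list follows the paragraph above; the class name, the choice of SHA-256 for the template hash, the USD cost unit, and the sample values are assumptions made for illustration.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


def template_hash(template: str) -> str:
    """Hash the prompt template (not the filled-in prompt) so versions can be compared."""
    return hashlib.sha256(template.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class InferenceRecord:
    model_name: str
    model_version: str         # the pinned version, never "latest"
    prompt_template_hash: str
    response: str
    cost_usd: float            # assumed unit
    latency_ms: int
    user_id: str               # the human identity that triggered the call

    def to_json(self) -> str:
        # Sorted keys keep the serialized form stable across runs.
        return json.dumps(asdict(self), sort_keys=True)


record = InferenceRecord(
    model_name="example-model",      # hypothetical values throughout
    model_version="2026-01-15",
    prompt_template_hash=template_hash("Summarise the visit note: {note}"),
    response="(model output)",
    cost_usd=0.0021,
    latency_ms=840,
    user_id="alice@example.com",
)
print(record.to_json())
```

Because the version and template hash travel with every call, a drift question becomes a query over the log: group by `model_version` and `prompt_template_hash` and look for values nobody approved.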
What we got wrong in the 2023 design
Two things we’d flag honestly:
- The “incident detection within 10 seconds” claim in the original article was based only on the Lambda + Azure Function handshake. In practice, end-to-end from event to PagerDuty page is closer to 45 seconds under normal load and a few minutes under stress. The 10-second number was misleading; we now describe the pipeline as “near-real-time,” which is what it is.
- OpenVPN as the sole ops access path. Adequate, but moving to SSO-fronted bastions earlier would have saved us a credential-rotation incident in 2024. We caught the problem; we’d rather have prevented it.
Architecture today (2026)
Internet
│
├── public-portal subnet ──────── (patient-facing, no PHI)
│
├── ops bastion (SSO/MFA) ─────── (human admin access)
│
└── agent egress (allow-listed) ─ (LLM calls to model gateway)
Private side:
app subnet ──── workload (with agent processes, scoped IAM)
│
├── data subnet ── encrypted RDS / Azure SQL (TDE + column-level)
├── log subnet ──── OpenSearch + write-only S3 sink + Sentinel feed
└── admin subnet ── jumphost-only access, full session recording
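The allow-listed agent egress path in the diagram can be illustrated with a small check on outbound URLs. This is a sketch under stated assumptions: the gateway hostname and the `egress_allowed` helper are hypothetical, and the real control would sit in the network layer (security groups, an egress proxy, or firewall rules) rather than in the agent process.

```python
from urllib.parse import urlsplit

# Hypothetical model gateway hostname; the article does not name one.
ALLOWED_EGRESS_HOSTS = frozenset({"model-gateway.internal.example"})


def egress_allowed(url: str) -> bool:
    """Allow only HTTPS requests whose exact hostname is on the allow-list."""
    parts = urlsplit(url)
    return parts.scheme == "https" and parts.hostname in ALLOWED_EGRESS_HOSTS


print(egress_allowed("https://model-gateway.internal.example/v1/chat"))         # True
print(egress_allowed("http://model-gateway.internal.example/v1/chat"))          # False
print(egress_allowed("https://evil.example/?h=model-gateway.internal.example")) # False
```

Matching on the parsed hostname rather than searching the URL string is what keeps the third call from slipping through.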
What we tell new medical clients
If you’re a Singapore medical group reading this in 2026:
- PDPC will not accept “we use a cloud provider, they handle security” as a posture.
- MGF will be a procurement question soon if it isn’t already.
- The boring controls — encryption, audit, identity, segmentation — compound. Skip them and the dramatic-sounding ones don’t help.
- If you’re going to add an agent anywhere in the workflow, design its identity and its audit log first. Always.
— wGrow studio · migrated from Team-Notes #85