Threat Modeling — Cloud (Part 1)
In this post I have tried to put together my thoughts on threat modeling in the cloud.
Note: The intent is not to teach threat modeling. This is just a thought process for identifying things to consider while doing threat modeling in the cloud.
A few years back, when I was embarking on this cloud journey, I wasn't sure:
- Would the approach for threat modeling in cloud be the same?
- What should I consider, or not consider, when I look at threat modeling in cloud?
- How would shared responsibility impact the threat model? (Threat model for IaaS vs PaaS vs SaaS)
- Should I also think Zero Trust here? What about serverless? Oh, and what about containerized workloads?
- What should be my trust boundary? Should I just say on-premises (organization datacenter) vs cloud (AWS/Azure/GCP)? Or should network boundaries be mapped to trust boundaries (which tend to blur when you talk about cloud)? This is definitely confusing.
- What threats should I look for? Do I follow one of the frameworks, do I leverage a threat matrix, or do I develop a checklist based on threats in cloud? What should I do?
Then I came across the Cloud Security Alliance's Treacherous 12 document (now the Egregious Eleven), which definitely helped iron out a few of the thoughts I had.
After all that confusion, and after reading multiple documents (including documents from the cloud providers, and write-ups of recent breaches of cloud workloads and their root causes), I documented the following:
What is in scope when you think of workloads hosted in cloud?
1. Network
- Hub-and-spoke model, VNet peering/VPC peering
- VNet/VPC configurations: (network) security groups, network access control lists
Misconfigurations have been a major reason for past breaches in cloud. Native solutions or a third-party solution can help you baseline the security configurations and set conditions that enforce the defined baselines. Examples (just a subset of what can be used to implement the control):
a. AWS Security Hub / Azure Security Center
b. Configuration management: Chef/Puppet/Ansible/SaltStack (use CIS benchmarks)
c. Prisma Cloud / Aqua Security
- Single region / multi-region / zones (referring to availability zones, not DMZ/public/private zones)
- Traffic inspection: some of the organizations I have seen try to do traffic inspection. Most next-generation firewalls (NGFWs) provide this capability in cloud, but you need to understand in what context you would want to use it. It does add overhead (decrypt, inspect, re-encrypt, if the plan is to use it inline), and it needs to be looked at in the context of the application you have at hand (where exactly do you perform SSL offloading, and why?). Not going into too much detail here. AWS VPC Traffic Mirroring provides some capability to inspect traffic (more like an IDS), and GCP recently launched Cloud IDS.
- Load Balancers
- API Gateway
- WAF
- DDoS protection: does the architecture have an additional layer of DDoS protection? (Shield…)
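To make the security group point a bit more concrete, here is a minimal sketch of the kind of baseline check a tool might run: flag ingress rules that expose sensitive ports to the whole internet. The `IngressRule` shape, the `SENSITIVE_PORTS` set and the function names are my own assumptions for illustration, not any cloud provider's API.

```python
from dataclasses import dataclass
from ipaddress import ip_network

# Hypothetical baseline: ports that should never be reachable from anywhere.
SENSITIVE_PORTS = {22, 3389, 3306, 5432}


@dataclass(frozen=True)
class IngressRule:
    """Simplified ingress rule; real security group rules carry more fields."""
    from_port: int
    to_port: int
    cidr: str


def is_world_open(rule):
    """True when the source CIDR covers every address (prefix length 0)."""
    return ip_network(rule.cidr, strict=False).prefixlen == 0


def find_violations(rules):
    """Return (rule, exposed_ports) pairs for world-open sensitive ports."""
    violations = []
    for rule in rules:
        exposed = sorted(
            p for p in SENSITIVE_PORTS if rule.from_port <= p <= rule.to_port
        )
        if exposed and is_world_open(rule):
            violations.append((rule, exposed))
    return violations
```

In practice the rules would come from the provider's inventory export or API, and the port list would come from your organization's baseline.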
2. Different identities or accounts
- User accounts (pretty straightforward, right? :))
- Non-human accounts: Service Principals (Managed Identities) (there is always confusion between SP and MI; refer to this blog, it has an interesting explanation of the two) / assumed IAM roles in AWS / service accounts in GCP (if you include containerized workloads, then also service accounts for pods)
3. API endpoints and Access
- REST/SOAP/GraphQL
- AuthN & AuthZ
- API keys / Access token — TTL
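For the TTL question, a minimal sketch of what a review check could look like: given when a token was issued and when it expires, report whether it is already expired or lives longer than policy allows. The `AccessToken` record, the `MAX_TOKEN_TTL` value and the finding names are assumptions made up for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Assumed policy value for illustration; pick one that suits the workload.
MAX_TOKEN_TTL = timedelta(hours=1)


@dataclass(frozen=True)
class AccessToken:
    """Hypothetical token record holding only the fields needed here."""
    issued_at: datetime
    expires_at: datetime


def ttl_findings(token, now):
    """Return a list of findings about the token's lifetime."""
    findings = []
    if token.expires_at <= now:
        findings.append("expired")
    if token.expires_at - token.issued_at > MAX_TOKEN_TTL:
        findings.append("ttl_exceeds_policy")
    return findings
```

Passing `now` in explicitly (for example `datetime.now(timezone.utc)`) keeps the check easy to test.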
4. Microservices
- Inter-service communication: authenticated or unauthenticated?
- User context: how is user context managed between calls to downstream services?
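One way to think about the user context question is whether a downstream service can tell that the context was not tampered with on the way. The sketch below signs the context with an HMAC and verifies it on the receiving side. The shared key and the message shape are assumptions for illustration only; real systems often use signed tokens or mutual TLS instead.

```python
import hashlib
import hmac
import json


def sign_context(context, key):
    """Serialize the user context deterministically and attach an HMAC tag."""
    payload = json.dumps(context, sort_keys=True, separators=(",", ":"))
    tag = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}


def verify_context(message, key):
    """Return the user context if the tag matches, otherwise raise ValueError."""
    expected = hmac.new(
        key, message["payload"].encode(), hashlib.sha256
    ).hexdigest()
    # compare_digest avoids leaking where the tags differ via timing.
    if not hmac.compare_digest(expected, message["tag"]):
        raise ValueError("user context failed integrity check")
    return json.loads(message["payload"])
```

The threat modeling question then becomes who holds the key, and what happens to a call that arrives with no context at all.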
5. Identity and Access Management
- IAM Roles, policies
- Permissions
- Least privilege
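As a small illustration of a least-privilege check, the sketch below walks an AWS-style IAM policy document and returns the `Allow` statements that combine a wildcard action with a wildcard resource. The function names are hypothetical, and a real review would also need to consider conditions, `NotAction` and resource-level wildcards.

```python
def _as_list(value):
    """Policy fields may be a single value or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def overly_broad_statements(policy):
    """Return Allow statements with a wildcard action on all resources."""
    findings = []
    for stmt in _as_list(policy.get("Statement")):
        if stmt.get("Effect") != "Allow":
            continue
        actions = _as_list(stmt.get("Action"))
        resources = _as_list(stmt.get("Resource"))
        wildcard_action = any(a == "*" or a.endswith(":*") for a in actions)
        wildcard_resource = "*" in resources
        if wildcard_action and wildcard_resource:
            findings.append(stmt)
    return findings
```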
6. Cloud Native Services
- Security best practices/baselines from the cloud provider, or the organization's hardening standard
7. Datastores
- Backup
- Encryption of primary and backup datastores (customer-managed keys (CMK) vs cloud-provider-managed keys: pros and cons of each, plus the process overhead and controls required for CMK, such as key rotation, key management and key distribution)
- Access to these datastores
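The datastore items above can be turned into a simple gap check over an inventory. This is only a sketch: the `Datastore` record and its field names are hypothetical and would need to be mapped to whatever your provider or CMDB actually reports.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datastore:
    """Hypothetical inventory record; field names are illustrative."""
    name: str
    encrypted: bool
    backup_enabled: bool
    backup_encrypted: bool


def datastore_gaps(store):
    """Return the list of baseline gaps for one datastore."""
    gaps = []
    if not store.encrypted:
        gaps.append("primary_not_encrypted")
    if not store.backup_enabled:
        gaps.append("no_backup")
    elif not store.backup_encrypted:
        # Only meaningful when a backup exists at all.
        gaps.append("backup_not_encrypted")
    return gaps
```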
Since we mention encryption and identity so many times, we might as well consider talking about secrets management (the "secret zero" problem).
8. Serverless
- Blast radius for Lambda / Cloud Run / Azure Functions
- IAM Roles Assigned to the functions
- API Calls
- Associated Secrets/Environment Variables
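For the secrets and environment variables item, a rough sketch of a review aid: list the variable names in a function's configuration that look like they hold secrets. The name pattern is an assumption and will produce both false positives and misses, so treat the output as a prompt for questions and not as a verdict.

```python
import re

# Assumed name patterns that often indicate a secret; tune for your environment.
SECRET_NAME_PATTERN = re.compile(
    r"(SECRET|PASSWORD|PASSWD|TOKEN|API_?KEY|PRIVATE_?KEY)", re.IGNORECASE
)


def suspicious_env_names(env):
    """Return sorted names that look like secrets and hold a non-empty value."""
    return sorted(
        name
        for name, value in env.items()
        if value and SECRET_NAME_PATTERN.search(name)
    )
```

The `env` mapping could be the function's configured environment variables as exported from the provider, or `os.environ` when run inside the function itself.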
9. Logs
Coming soon: trust boundaries, data flow diagrams, an example system design and the associated threat model.