Tenant Isolation for Large Scale Platforms in AWS

Bobby Tahir
Jun 7, 2020

Why is Tenant Isolation Important?

Building your software ecosystem with tenant isolation baked into the architecture helps protect your business from cascading compliance or security issues. It can also give your organization the flexibility to customize your solution or services for specific market segments or product tiers.

Where to Start

First and most importantly, establish your requirements for tenant isolation by clarifying the needs of your customers, customer segments and market. Understand the compliance and security profiles and personas of your customers, and how many of them need each level of isolation. Also capture accurately how your customers require their data to be represented in unique or custom ways.

Developing an Isolation Strategy

For large-scale platforms it’s common to employ a hybrid of multiple isolation models, since it’s unlikely that a one-size-fits-all approach will work for the entirety of a large ecosystem of technologies. In fact, the isolation strategy should usually be decided on a per-service basis.

Drivers for creating your isolation strategy can include: customer security concerns, industry compliance requirements, software customization expectations, legacy systems, product tiering, noisy neighbor considerations, and specific customer requests for data separation / partitioning.

Available AWS Options for Isolation

There are two overarching isolation models in AWS: Silo & Pool.

The Silo Model essentially says each tenant has their own technical stack and you should use coarse-grained controls like separate VPCs and databases to create isolation.

The Pool Model says each tenant lives in a shared environment and you should use fine-grained controls like IAM policies and authorization to segment access.

In the Silo Model:

- Isolation via separate VPCs is common
- Isolation via separate AWS accounts is also effective
- Subnet-based isolation is also an option

In the Pool Model:

- Isolation using run-time IAM policies is common
- Authentication and authorization provide a strong layer of protection
- API-enforced access is also quite common
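
To make the difference between the two models concrete, here is a minimal Python sketch of how a platform might decide where a tenant's data lives for a given service. The service names, the naming scheme for databases, and the `locate` helper are all hypothetical and only illustrate the idea that a silo gives each tenant its own resource while a pool shares one resource and relies on a tenant key.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class IsolationModel(Enum):
    SILO = "silo"
    POOL = "pool"


@dataclass(frozen=True)
class DataLocation:
    database: str               # which database to connect to
    tenant_key: Optional[str]   # row/partition filter, only needed when pooled


# Hypothetical per-service choices; a real platform would load these from config.
SERVICE_MODELS = {
    "billing": IsolationModel.SILO,
    "notifications": IsolationModel.POOL,
}


def locate(service: str, tenant_id: str) -> DataLocation:
    """Return where a tenant's data lives for one service."""
    model = SERVICE_MODELS[service]
    if model is IsolationModel.SILO:
        # Silo: a dedicated database per tenant, no shared rows to filter.
        return DataLocation(database=f"{service}-{tenant_id}", tenant_key=None)
    # Pool: one shared database, every query must be filtered by tenant ID.
    return DataLocation(database=f"{service}-shared", tenant_key=tenant_id)
```

Because the model is looked up per service, the same tenant can be siloed in one service and pooled in another, which is the hybrid approach described earlier.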

Tactical-Level Approaches

At a tactical level, much of your isolation strategy will depend on the technologies and services currently deployed in your infrastructure.

  • Database storage isolation can involve implementing separate databases per tenant (Silo Model) or implementing one database for all tenants with a data model that uses a Tenant ID as a key (Pool Model). The approach here will depend on many factors, including the database technologies you’re using. For example, Aurora PostgreSQL supports row-level security, and DynamoDB access can be restricted per tenant with IAM policy conditions on the partition key.
  • S3 storage can be isolated using IAM policies that scope access by tags (for example, object tags or principal tags) or by per-tenant key prefixes.
  • Lambdas start out fairly well isolated by design. However, Lambda code still needs to be controlled in terms of what it is and is not allowed to do. This can be achieved via IAM policy.
  • Containers are evolving and complex, and might require third-party isolation techniques. Namespaces are often helpful as a fundamental unit of isolation when containers are deployed in a Pool Model. In a Silo Model, containers can be thought of as a cluster of compute per tenant.
  • Amazon API Gateway helps to create another layer of isolation, although it shouldn’t be relied upon by itself. If designed well, it can control access to resources.
  • Authentication and authorization, for example using IAM and Cognito (other services like Auth0 can also be used), help provide the appropriate tenant context needed to segregate resources.
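
As an illustration of the IAM-based approaches in the list above, the following Python sketch builds tenant-scoped policy documents for a pooled DynamoDB table and a shared S3 bucket. It assumes the table's partition key is the tenant ID and that each tenant's objects live under a `<tenant_id>/` prefix; those layout choices, the function names, and the allowed actions are assumptions for the example, not something prescribed above. The S3 policy scopes by key prefix rather than by tags, which is a different but common way to achieve the same goal. `dynamodb:LeadingKeys` and `s3:prefix` are existing IAM condition keys.

```python
import json
import re

_TENANT_ID = re.compile(r"^[A-Za-z0-9-]+$")


def _validate(tenant_id: str) -> None:
    # Reject wildcards and slashes so a tenant ID cannot widen the policy.
    if not _TENANT_ID.match(tenant_id):
        raise ValueError(f"invalid tenant id: {tenant_id!r}")


def dynamodb_tenant_policy(table_arn: str, tenant_id: str) -> dict:
    """Allow access only to items whose partition key equals the tenant ID."""
    _validate(tenant_id)
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["dynamodb:GetItem", "dynamodb:Query", "dynamodb:PutItem"],
            "Resource": table_arn,
            "Condition": {
                "ForAllValues:StringEquals": {
                    "dynamodb:LeadingKeys": [tenant_id]
                }
            },
        }],
    }


def s3_tenant_policy(bucket: str, tenant_id: str) -> dict:
    """Allow object access and listing only under the tenant's key prefix."""
    _validate(tenant_id)
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{tenant_id}/*",
            },
            {
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{tenant_id}/*"]}},
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(s3_tenant_policy("example-bucket", "tenant-a"), indent=2))
```

One way such documents might be used is as a session policy when assuming a role on behalf of a tenant, so that the resulting credentials cannot reach another tenant's data even if application code has a bug.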

Key Challenges

Finding the right mix of isolation models and approaches becomes more complicated the larger and more diverse your technical infrastructure is.

The challenge is building a well-thought-out roadmap that, over time, achieves the levels of isolation your business demands without jeopardizing the cost structure, stability, and manageability of your systems.

The process starts by identifying your customers and business requirements, and then carefully establishing the approach on a service-by-service basis.

Other Considerations

  1. Don’t assume code alone will ensure isolation. Mistakes can be made in code.
  2. AWS might not have all the services you need to achieve isolation. 3rd party services may be required.
  3. Make sure you know that the isolation is working. This can be done by writing isolation tests.
  4. Tenant isolation can be a costly endeavor both in time and dollars, so project expenses carefully.
  5. Partitioning is related to isolation. After deciding how to partition data to meet customer needs, isolation strategies must be applied on top.
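
The isolation tests mentioned in item 3 can be as simple as asserting that one tenant's credentials cannot read another tenant's data. The sketch below shows the shape of such a test in Python against a toy in-memory store. `PooledStore` and `CrossTenantAccess` are hypothetical stand-ins; a real test would exercise your actual services using credentials scoped to each tenant.

```python
class CrossTenantAccess(Exception):
    """Raised when a caller asks for data owned by another tenant."""


class PooledStore:
    """Toy in-memory stand-in for a shared (pooled) data store."""

    def __init__(self):
        self._items = {}  # (tenant_id, key) -> value

    def put(self, caller_tenant, tenant_id, key, value):
        self._check(caller_tenant, tenant_id)
        self._items[(tenant_id, key)] = value

    def get(self, caller_tenant, tenant_id, key):
        self._check(caller_tenant, tenant_id)
        return self._items[(tenant_id, key)]

    @staticmethod
    def _check(caller_tenant, tenant_id):
        # The isolation rule under test: callers may only touch their own data.
        if caller_tenant != tenant_id:
            raise CrossTenantAccess(f"{caller_tenant} may not access {tenant_id}")


def test_tenant_cannot_read_another_tenants_data():
    store = PooledStore()
    store.put("tenant-a", "tenant-a", "invoice-1", {"total": 100})
    try:
        store.get("tenant-b", "tenant-a", "invoice-1")
    except CrossTenantAccess:
        return  # expected: the cross-tenant read was blocked
    raise AssertionError("tenant-b was able to read tenant-a's data")


if __name__ == "__main__":
    test_tenant_cannot_read_another_tenants_data()
    print("isolation test passed")
```

Running tests like this continuously, rather than once, helps catch regressions introduced by later code or policy changes.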

Bobby Tahir

Technology leader scaling software, teams & business. Amateur mechanic. Book junkie. Enthusiastic garage gym owner. Connect with me on Twitter @bscalable