Multi-tenancy for databases

Maarten Merken
6 min read · Sep 25, 2019

A while ago, I wrote about supporting multiple authentication schemes in ASP.NET Core, which allows your API consumers to use various authentication methods. I’d like to segue from that into multi-tenancy at the database level.

Let’s visualize a tenanted microservice from the perspective of the data.

The most straightforward way of architecting a microservice is on a per-tenant basis. Each tenant gets its own version of the microservice(s) and is maintained separately. Its data resides in its own datastore, though the persistence-store technology can differ per microservice.

Here we see tenant1 and tenant2, both with the same architecture, each having their own instances of their microservices pointing to their own persistence-store. The persistence-store could be a relational database, file-based storage or Azure Table Storage.

The advantage of this setup is that the two tenants are completely independent of one another. We could upgrade tenant1 to a newer version of the microservice(s) at any time, without affecting other tenants.

The downside of this is that the other tenants won’t reap the benefits of new development. Another downside is that this setup implies a separate front-end (https://x.mycorp.com) for each tenant, which shares the same pros and cons. To eliminate this, we look to the next setup;

In this architecture, the front-end is a shared entity; it needs to adapt to whatever tenant context it resides in. You’ll see a different UI depending on the URL you enter. Although the front-end is the same deployed instance for all tenants, each tenant still has a dedicated microservice array in the backend. A gateway is introduced to divert the traffic to the microservices of the respective tenant context. This solves the issues on the front-end. Ideally, however, we’d have this separation mitigated completely;

This diagram shows a more pristine setup: the front-end is a shared system, one deployable unit. The gateway routes traffic to the microservices based on the tenant context.
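
For illustration, here’s a minimal sketch of how a shared ASP.NET Core front-end or gateway could resolve the tenant from the incoming host name. The subdomain convention (tenant1.mycorp.com) and the TenantResolutionMiddleware name are my own assumptions for the example, not a prescribed implementation:

```csharp
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;

// Sketch: resolve the tenant from the request host and make it
// available to everything downstream (routing, authorization, data access).
public class TenantResolutionMiddleware
{
    private readonly RequestDelegate _next;

    public TenantResolutionMiddleware(RequestDelegate next) => _next = next;

    public async Task InvokeAsync(HttpContext context)
    {
        // e.g. "tenant1.mycorp.com" -> "tenant1" (assumed subdomain convention)
        var tenant = context.Request.Host.Host.Split('.')[0];

        context.Items["Tenant"] = tenant;

        await _next(context);
    }
}
```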

We’ve now pushed our problem down from the front-end through to the back-end and into the persistence layer.

What’s left for us is to handle the multitenancy on the level of the database.

Although this article will specifically target multitenancy in relational databases, most of the principles can be applied to Azure Table Storage, document-stores and other persistence technologies.

Discriminator Column

Also known as table-based multitenancy.

In the picture above, we see our database that stores data for two tenants (Dell and HP).

All data resides in the same table(s), but it’s separated by a discriminator column called Tenant.

Each query you write will need to include a WHERE clause that selects data based on the tenant context.

You’ll likely define a composite primary key over the Id and Tenant columns to enforce uniqueness within each tenant.
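
For illustration, here’s a minimal EF Core sketch of the discriminator-column approach; the Order entity and the way the tenant value reaches the context are assumptions for the example:

```csharp
using Microsoft.EntityFrameworkCore;

public class Order
{
    public int Id { get; set; }
    public string Tenant { get; set; }   // discriminator column
    public string Product { get; set; }
}

public class AppDbContext : DbContext
{
    private readonly string _tenant;     // resolved per request, e.g. "Dell" (assumed)

    public AppDbContext(DbContextOptions<AppDbContext> options, string tenant)
        : base(options) => _tenant = tenant;

    public DbSet<Order> Orders { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Composite primary key: Id only needs to be unique within a tenant.
        modelBuilder.Entity<Order>().HasKey(o => new { o.Id, o.Tenant });

        // Global query filter: EF appends WHERE Tenant = @tenant to every query.
        modelBuilder.Entity<Order>().HasQueryFilter(o => o.Tenant == _tenant);
    }
}
```

With the global query filter in place, a plain context.Orders.ToList() only ever returns rows for the current tenant, which keeps the WHERE clause out of your day-to-day queries.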

This approach is ideal if:

  • You don’t want to set up new servers/databases to support multitenancy
  • Your data size per tenant is limited
  • You need to share data across tenants; this is faster when the shared data resides in the same DB (via views)

Drawbacks to this approach:

  • Schema changes affect all tenants in the database
  • Database load is shared; a large or busy tenant can cause performance issues for the smaller tenants (the noisy-neighbor problem)
  • Possible data leak (tenant1 sees data from tenant2)
  • The code needs to be tenant-aware

Schema-based multitenancy

In this configuration, the data for each tenant still resides in the same database, on the same server. However, there’s a concrete separation of data by means of a different schema per tenant.

Your queries need to be schema-aware based on the tenant.
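
For illustration, here’s a minimal EF Core sketch of a schema-aware context, reusing the hypothetical Order entity from the previous sketch; the tenant-to-schema mapping is an assumption:

```csharp
using Microsoft.EntityFrameworkCore;

public class TenantDbContext : DbContext
{
    private readonly string _schema;     // e.g. "dell" or "hp" (assumed convention)

    public TenantDbContext(DbContextOptions<TenantDbContext> options, string schema)
        : base(options) => _schema = schema;

    public DbSet<Order> Orders { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // All tables in this model map to the tenant's schema,
        // e.g. [dell].[Orders] versus [hp].[Orders].
        modelBuilder.HasDefaultSchema(_schema);
    }
}
```

One caveat: EF Core caches the model per context type, so a truly dynamic schema like this also needs a custom IModelCacheKeyFactory that includes the schema in the cache key; I’ve omitted that here for brevity.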

This introduces a few benefits in contrast to the previous approach:

  • Schema changes can happen independently of other tenants in the same DB
  • Common data can be shared in a common schema (dbo)
  • Azure SQL Database supports cross-schema queries (elastic query, external tables)

Some drawbacks to this approach:

  • Database server load is still shared across tenants
  • Introduces complexity
  • Possible data leak due to no real physical separation
  • The code still needs to be tenant-aware

Database-based multitenancy

Also known as catalog-based multitenancy.

In this composition, the databases per tenant are completely separated, though they still reside on the same database server. This eliminates the need for the code to be tenant-aware, since you can manage the separation at the connection level.
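
For illustration, a minimal sketch of selecting the tenant’s database purely via the connection string; the server name and the Tenant_* database naming convention are assumptions:

```csharp
using Microsoft.Data.SqlClient;

public static class TenantConnections
{
    public static SqlConnection Open(string tenant)
    {
        var builder = new SqlConnectionStringBuilder
        {
            DataSource = "shared-sql-server.mycorp.com",   // one shared server (assumed)
            InitialCatalog = $"Tenant_{tenant}",           // e.g. Tenant_Dell, Tenant_HP (assumed)
            IntegratedSecurity = true
        };

        var connection = new SqlConnection(builder.ConnectionString);
        connection.Open();
        return connection;
    }
}
```

Everything above the connection — the queries, the EF model, the business logic — stays identical for all tenants; only the catalog name differs.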

Benefits to this approach:

  • Complete separation of data
  • Data sharing possibilities through elastic queries and external tables
  • The code does not need to be tenant-aware
  • Still a reasonably low-cost solution, since only one database server license is required

Drawbacks:

  • Server load is still shared across tenants

Regarding server load, you could use Azure SQL Database elastic pools to mitigate this issue.

To take this one step further, another setup would be to split the tenant databases into their own database servers;

Server-based multitenancy

This approach is ideal if you need to scale your database independently per tenant. Some tenants require larger workloads than others, and this setup allows you to meet each tenant’s needs in terms of database performance.

As with database-based multitenancy, the code does not need to be tenant-aware; the separation can be handled at the connection level.
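
For illustration, a minimal sketch of a tenant-to-connection-string mapping where each tenant points at its own server; the server names are assumptions, and the mapping would normally live in configuration rather than a hard-coded dictionary:

```csharp
using System.Collections.Generic;

public static class TenantServers
{
    // Each tenant gets a complete connection string, including its own server.
    private static readonly Dictionary<string, string> ConnectionStrings =
        new Dictionary<string, string>
        {
            ["Dell"] = "Server=sql-dell.mycorp.com;Database=Orders;Integrated Security=true;",
            ["HP"]   = "Server=sql-hp.mycorp.com;Database=Orders;Integrated Security=true;"
        };

    public static string For(string tenant) => ConnectionStrings[tenant];
}
```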

Some pointers on handling multitenancy in general

You might need to ask yourself the following question: what’s the frequency of rolling out a new tenant? Daily? Monthly? Once in a blue moon?

There are two distinct cases here: either creating a new tenant needs to be an automated process that requires minimal effort, or it’s a manual process that can be handled via good documentation or a manual. The point is this: don’t over-engineer or prematurely optimize this process if it’s not actually required. Sure, a fully automated system looks nice, but it comes with additional maintenance and support for when it goes wrong. Most of the time, good coordination and checklist-style processes suffice to roll out a new tenant.

I’ve seen systems where the tenant could be fully dynamic: it was looked up from a master database, its configuration loaded, routing set up and traffic diverted to the respective endpoint based on the dynamic context. Beautiful to see.

You could change endpoints at runtime and divert traffic for a specific tenant to a whole new set of microservice instances, but when it came down to debugging a specific flow for a specific tenant, it became hell.

The additional lookup accounted for half of the response time of each call. Caching was introduced, but this added to the already vast complexity of the system. We would have been better off with static, dedicated routes for each tenant; it would’ve been messier, but debugging a flow and getting an issue resolved would’ve been far more straightforward.

Wrapping up

I hope you found this article interesting. I’m planning a follow-up article to put this into practice in EF Core, using a new feature in EF Core 3.
