# High Availability Architecture

This document describes the architecture of a high availability ESS deployment on Amazon Web Services (AWS).

## Architecture Overview

### Virtual Private Cloud

ESS runs within a Virtual Private Cloud.

### Availability Zones

Inrupt recommends running in at least 3 availability zones. Each availability zone should contain a public subnet and a private subnet. Only the load balancer should run on the public subnet.
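
As a rough illustration of this layout, the following sketch carves one public and one private subnet per availability zone out of a single VPC address range. The VPC CIDR, the zone names, and the `/20` subnet size are assumptions chosen for the example, not values required by ESS.

```python
import ipaddress


def plan_subnets(vpc_cidr, zones, new_prefix=20):
    """Assign one public and one private subnet to each availability zone."""
    vpc = ipaddress.ip_network(vpc_cidr)
    # subnets() yields equal-sized, non-overlapping blocks in address order.
    blocks = vpc.subnets(new_prefix=new_prefix)
    plan = {}
    for zone in zones:
        plan[zone] = {"public": next(blocks), "private": next(blocks)}
    return plan


# Hypothetical VPC range and zone names, for illustration only.
layout = plan_subnets("10.0.0.0/16", ["us-east-1a", "us-east-1b", "us-east-1c"])
for zone, nets in layout.items():
    print(zone, "public:", nets["public"], "private:", nets["private"])
```

In this sketch only the load balancer would be placed in the `public` blocks; everything else belongs in the `private` blocks.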

### Gateway (Optional)

Although not strictly required for high availability, an application gateway can sit in front of the VPC.

If you choose to add a gateway component, ensure that the gateway can support availability across multiple availability zones and balance traffic as needed.

### Load Balancers

Load balancers are the only ESS component in the public subnets. If relying on the SLA of a single load balancer is insufficient for your ESS deployment, you can use multiple load balancers, such as one load balancer in each availability zone. You can even use another load balancer to front the multiple load balancers.

Each load balancer has a public DNS address. The following CNAME entries are required in your DNS configuration; substitute your domain for **`<DOMAIN>`**:

* **`<DOMAIN>`**
* **`*.<DOMAIN>`**
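
The two entries can be generated for any domain with a small helper. This is a minimal sketch: the function name, the example domain, and the load balancer DNS name are hypothetical placeholders.

```python
def required_cnames(domain, load_balancer_dns):
    """Return (name, type, target) tuples for the two CNAME entries listed above."""
    return [
        (domain, "CNAME", load_balancer_dns),
        (f"*.{domain}", "CNAME", load_balancer_dns),
    ]


# Hypothetical domain and load balancer address, for illustration only.
records = required_cnames("ess.example.com", "my-ess-lb.example.net")
for name, record_type, target in records:
    print(name, record_type, target)
```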

### PostgreSQL

The ESS system stores data in multiple PostgreSQL databases.

This guide assumes the use of Amazon Relational Database Service (RDS), Amazon’s managed database service. Inrupt recommends enabling “Multi-AZ” on RDS instances.

### Object Storage (S3)

The ESS system stores resource content in Object Storage.

This guide assumes the use of Amazon Simple Storage Service (Amazon S3).

### Messaging System (Kafka)

ESS is event driven and uses a messaging system for asynchronous inter-service communication.

This guide assumes the use of Amazon Managed Streaming for Apache Kafka (MSK).

{% hint style="info" %}
**Tip**\
Kafka **MUST** be configured with topic auto-creation enabled (i.e., `auto.create.topics.enable = true`). See [Custom MSK Configurations](https://docs.aws.amazon.com/AWSCloudFormation/latest/TemplateReference/aws-resource-msk-configuration.html).
{% endhint %}
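
One way to keep this requirement from being dropped by accident is to render the broker properties from code and reject any override that disables auto-creation. The sketch below is an assumption about how you might manage the configuration text; the function name and the extra override shown are hypothetical.

```python
def msk_server_properties(overrides=None):
    """Render Kafka broker properties, always keeping topic auto-creation on."""
    props = {"auto.create.topics.enable": "true"}
    props.update(overrides or {})
    if props["auto.create.topics.enable"] != "true":
        raise ValueError("ESS requires auto.create.topics.enable=true")
    # One key=value pair per line, sorted so the output is stable.
    text = "\n".join(f"{key}={value}" for key, value in sorted(props.items()))
    return text.encode("utf-8")


# Hypothetical extra setting, for illustration only.
properties = msk_server_properties({"default.replication.factor": "3"})
print(properties.decode("utf-8"))
```

The resulting bytes can then be supplied as the server properties of a custom MSK configuration using whichever tooling you already use to provision the cluster.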

### Secrets/Certificate Management

Services for secrets and certificate management are generally required only when a container or service starts, stops, or restarts. Outside of these periods, ESS can continue to operate during an outage of one of these services.

{% hint style="danger" %}
**Warning**

**CRITICAL SECURITY REQUIREMENT**

NEVER commit files that contain secrets, such as **`.env`** or **`JWT`** files, to version control. These files must be managed securely.

As part of updating the inputs for your deployment:

1. **Review** the template secret files
2. **Set strong secrets** for the values, such as strong passwords
3. **Store the secrets securely** outside your repository using one of these methods:
   * Cloud secrets management service
   * Enterprise secrets vault solution
   * Kubernetes Secrets with encryption at rest
   * Secure file system with restricted access (development only)
4. **Configure your deployment** to retrieve credentials from your secure storage at runtime
5. **Add the secrets files to your `.gitignore` file immediately**
{% endhint %}
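
Retrieving credentials at runtime (step 4) can be as simple as reading values that your secrets store has injected into the process environment and failing fast when one is absent. This is a minimal sketch under that assumption; the secret names are hypothetical and are not ESS configuration keys.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret has not been provided at runtime."""


def load_secrets(names, environ=os.environ):
    """Read required secrets from the environment and fail fast if any is absent."""
    missing = [name for name in names if not environ.get(name)]
    if missing:
        # Report only the names of missing secrets, never any values.
        raise MissingSecretError("Missing required secrets: " + ", ".join(missing))
    return {name: environ[name] for name in names}


# Hypothetical secret names and a stand-in environment, for illustration only.
fake_environment = {"EXAMPLE_DB_PASSWORD": "not-a-real-password"}
secrets = load_secrets(["EXAMPLE_DB_PASSWORD"], environ=fake_environment)
print(sorted(secrets))
```

Passing the environment as a parameter keeps the function easy to test without placing real secrets in the test process.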

### Container Registry

Inrupt recommends <mark style="color:red;">**against**</mark> relying on a public repository, such as the one that hosts Inrupt’s own software releases. Instead, pull the Docker images from your own container registry.

The container registry is required only during container startup, and only if the Docker image needs to be pulled. At all other times, ESS can continue to operate during a container registry outage.
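
If you mirror images into your own registry, the image references in your deployment need to point at that registry instead of the upstream host. The sketch below shows one way to rewrite a reference; the registry hosts and image names are hypothetical, and it relies on the common Docker convention that a first path segment containing `.` or `:` (or equal to `localhost`) is a registry host.

```python
def to_private_image(image, private_registry):
    """Rewrite an image reference so that it is pulled from a private registry."""
    first, _, rest = image.partition("/")
    # Drop the upstream registry host if the reference starts with one.
    if rest and ("." in first or ":" in first or first == "localhost"):
        path = rest
    else:
        path = image
    return f"{private_registry}/{path}"


# Hypothetical upstream image and private registry, for illustration only.
mirrored = to_private_image(
    "registry.example.com/vendor/example-service:1.0",
    "registry.internal.example.net",
)
print(mirrored)
```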

## Additional Information

* [AWS SLAs](https://aws.amazon.com/legal/service-level-agreements/)
