High Availability (HA) is the elimination of single points of failure to ensure that applications continue to operate when a component they depend on, such as a server, fails.
HA for Daml solutions focuses on the following components running in separate processes:
- Domain Topology Manager
The availability of a participant node should not affect the availability of another participant node, except for the following workflows:
- Workflows in which both participant nodes are involved.
- Workflows in which the participant nodes have distinct visibility configurations, i.e. they manage different parties involved in the workflow.
By contrast, if two participant nodes host the same party, transactions involving that party can continue as long as either node is available.
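The hosting rule above can be read as a simple predicate. The following is a minimal sketch of a toy model, not Canton or Daml code: the `hosting` map, the participant names, and the `can_transact` function are all hypothetical, and the model deliberately ignores permissions and confirmation policies.

```python
# Toy model (not a Daml/Canton API): which participants host which party.
hosting = {
    "alice": {"participant1", "participant2"},  # hosted on two nodes
    "bob": {"participant3"},                    # hosted on one node
}


def can_transact(parties, available):
    """Return True if every party has at least one available hosting participant."""
    return all(hosting[p] & available for p in parties)


# participant1 is down, but alice is also hosted on participant2.
print(can_transact({"alice"}, {"participant2", "participant3"}))  # True
# participant3 is down, and it is the only node hosting bob.
print(can_transact({"alice", "bob"}, {"participant1", "participant2"}))  # False
```

In this model, a party hosted on a single participant is only as available as that participant, while a party hosted on several participants stays usable as long as one of them is up.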
An application operating on behalf of a party cannot transparently fail over from one participant node to another, because each participant node emits its own offsets.
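One way to picture why offsets are not portable is the toy model below. It is a sketch under stated assumptions, not the actual offset scheme: it assumes that each participant numbers only the transactions it can see, and it uses plain integers where real offsets should be treated as opaque values. `ToyParticipant`, `observe`, and `offset_of` are hypothetical names.

```python
class ToyParticipant:
    """Toy participant that numbers the transactions it can see with local offsets."""

    def __init__(self, hosted_parties):
        self.hosted_parties = set(hosted_parties)
        self.log = []  # list of (offset, transaction_id)

    def observe(self, transaction_id, stakeholders):
        # Only record transactions visible to a party hosted on this node.
        if self.hosted_parties & set(stakeholders):
            self.log.append((len(self.log) + 1, transaction_id))

    def offset_of(self, transaction_id):
        return next(off for off, tx in self.log if tx == transaction_id)


p1 = ToyParticipant({"alice", "bob"})
p2 = ToyParticipant({"alice"})

for tx, stakeholders in [("tx1", ["bob"]), ("tx2", ["alice"]), ("tx3", ["alice"])]:
    p1.observe(tx, stakeholders)
    p2.observe(tx, stakeholders)

# The same transaction sits at a different offset on each participant.
print(p1.offset_of("tx3"), p2.offset_of("tx3"))  # 3 2
```

An application that stored offset 3 while reading from `p1` could not resume from that offset on `p2`, which is why a switch between participant nodes has to be handled explicitly by the application.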
A participant node’s availability is not affected by the availability of the domain, except for workflows that use the domain. This allows participant nodes and domains to manage their HA independently.
To achieve HA, components are replicated. All replicas of the same component have the same trust assumptions, i.e. the operators of one replica must trust the operators of the other replicas.
In general, when a component is backed by a database/ledger, the component’s HA relies on the HA of the database/ledger. Therefore, the component’s operator must handle the HA of the database separately.
All database-backed components are designed to be tolerant of temporary database outages. During the database failover period, components halt processing until the database becomes available again, resuming thereafter.
Transactions that involve these components may time out if the failover takes too long. Nevertheless, they can be safely resubmitted, because command deduplication makes command submission idempotent.
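The resubmission pattern can be sketched as follows. This is an illustrative in-memory model, not the Ledger API: `ToyLedger`, `submit_with_retry`, and the `fail_next` switch are hypothetical, and the model assumes deduplication keyed on a command ID alone, whereas real deduplication applies only within a configured deduplication period.

```python
class ToyLedger:
    """In-memory stand-in for a ledger that deduplicates by command ID."""

    def __init__(self):
        self.accepted = {}  # command_id -> payload
        self.fail_next = 0  # number of upcoming submissions whose reply is lost

    def submit(self, command_id, payload):
        if command_id in self.accepted:
            return "duplicate"  # already applied; do not apply it twice
        self.accepted[command_id] = payload
        if self.fail_next > 0:
            # Simulate a failover: the command was applied, but the caller times out.
            self.fail_next -= 1
            raise TimeoutError("no response before the deadline")
        return "accepted"


def submit_with_retry(ledger, command_id, payload, attempts=3):
    """Resubmit with the same command ID until the ledger answers."""
    for _ in range(attempts):
        try:
            return ledger.submit(command_id, payload)
        except TimeoutError:
            continue  # outcome unknown; retrying with the same ID is safe
    raise TimeoutError(f"gave up after {attempts} attempts")


ledger = ToyLedger()
ledger.fail_next = 1  # the first submission is applied, but its reply is lost
print(submit_with_retry(ledger, "cmd-42", {"transfer": 10}))  # duplicate
print(len(ledger.accepted))  # 1
```

The important detail is that the retry reuses the same command ID. After a timeout the caller cannot tell whether the command took effect, and deduplication ensures that the second attempt does not apply it again.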