Decommissioning Canton Nodes and Sync Domain Entities

This guide assumes general familiarity with Canton, in particular Canton identity management concepts and operations from the Canton console.

Note that, while onboarding new nodes is always possible, a decommissioned node or entity is effectively disposed of and cannot rejoin a sync domain. Decommissioning is thus an irreversible operation.

In addition, decommissioning procedures are currently experimental. In any case, backing up a node before decommissioning it is strongly recommended.

Decommissioning a Participant Node

Prerequisites

Be mindful that making a participant unavailable (by disconnecting it from the sync domain or decommissioning it) might block other workflows and/or prevent other parties from exercising choices on their contracts.

As an example, consider the following scenario:

  • Party bank is hosted on participant P1 and party alice is hosted on participant P2.
  • An active contract exists with bank as signatory and alice as observer.
  • P1 is decommissioned.

If bank is not multi-hosted, any attempt by alice to use the contract fails because bank cannot confirm. Unless it is purged via the repair service, the contract remains active on P2 indefinitely, and only non-consuming choices and fetches can be committed on it.

Similar considerations apply if P2 were to be decommissioned, even though alice is “only” an observer: if alice is not multi-hosted, the contract would remain active on P1 until purged via the repair service and only non-consuming choices and fetches could be committed.

Additionally, when P1 is decommissioned P2 stops receiving ACS commitments from P1, which may prevent pruning. The same applies in reverse if P2 is decommissioned.
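
The scenario above can be checked mechanically before a participant is decommissioned. The following is a minimal sketch in plain Scala, not Canton console code: the `Contract` type, the `hosting` map, and both function names are hypothetical, and the input data would have to come from your own party-to-participant mappings and contract inventory.

```scala
object HostingModel {
  // Hypothetical, simplified model: these types are illustrative and not part of Canton.
  final case class Contract(id: String, signatories: Set[String], observers: Set[String]) {
    def stakeholders: Set[String] = signatories ++ observers
  }

  /** Parties hosted only on `decommissioned`, which would be left without any host. */
  def unhostedAfter(
      hosting: Map[String, Set[String]], // party -> participants hosting it
      decommissioned: String,
  ): Set[String] =
    hosting.collect {
      case (party, participants) if participants == Set(decommissioned) => party
    }.toSet

  /** Contracts with at least one stakeholder that would be left without any host. */
  def affectedContracts(
      contracts: Seq[Contract],
      hosting: Map[String, Set[String]],
      decommissioned: String,
  ): Seq[Contract] = {
    val unhosted = unhostedAfter(hosting, decommissioned)
    contracts.filter(_.stakeholders.exists(unhosted.contains))
  }
}
```

With bank hosted only on P1 and alice only on P2, decommissioning either participant flags the contract from the scenario. Hosting bank on a second participant makes the result for P1 empty.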

Thus, properly decommissioning a participant requires the following high-level steps:

  1. Ensuring that the prerequisites are met: active contracts and the workflows using them must not become “stuck” because parties required to operate on them become unavailable.

Note

More specifically, for a contract action to be committed:

  • For “create” actions all stakeholders must be hosted on active participants.
  • For consuming “exercise” actions all stakeholders, actors, choice observers, and choice authorizers must be hosted on active participants.

The definition of “informee” is covered by the ledger privacy model section.

The exact prerequisites to be met in order to decommission a participant therefore depend on the design of the Daml application, and should be accounted for and tested in the initial Daml design process.

  2. Decommissioning: remove the participant from the topology state.

After that, the participant can be disposed of.
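
The commit rules for “create” and consuming “exercise” actions given in the note above can be expressed as a small predicate, which may be useful when testing an application design against a planned decommissioning. This is a sketch in plain Scala under simplifying assumptions: the `Action` types and `canCommit` are hypothetical names, parties and participants are plain strings, and non-consuming choices and fetches are deliberately not modelled.

```scala
object CommitRules {
  // Hypothetical model of the rules in the note; not a Canton API.
  sealed trait Action { def requiredParties: Set[String] }

  final case class Create(stakeholders: Set[String]) extends Action {
    def requiredParties: Set[String] = stakeholders
  }

  final case class ConsumingExercise(
      stakeholders: Set[String],
      actors: Set[String],
      choiceObservers: Set[String],
      choiceAuthorizers: Set[String],
  ) extends Action {
    def requiredParties: Set[String] =
      stakeholders ++ actors ++ choiceObservers ++ choiceAuthorizers
  }

  /** True if every required party is hosted on at least one active participant. */
  def canCommit(
      action: Action,
      hosting: Map[String, Set[String]], // party -> participants hosting it
      activeParticipants: Set[String],
  ): Boolean =
    action.requiredParties.forall { party =>
      hosting.getOrElse(party, Set.empty[String]).exists(activeParticipants.contains)
    }
}
```

Evaluating `canCommit` for each action of a workflow, with the participant to be decommissioned removed from `activeParticipants`, gives a quick indication of which steps would no longer be committable.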

Decommissioning a participant once the prerequisites are met

  1. Stop applications from sending commands to the Ledger API of the participant to be decommissioned to avoid failed commands and errors.
  2. Disconnect the participant to be decommissioned from all sync domains as described in enabling and disabling connections.
  3. Use the sequencer.disable_member command to disable the participant being decommissioned in all sequencers and remove any sequencer data associated with it.
  4. Use the topology.participant_domain_states.authorize command to remove the participant from the domain topology via the domain manager.

The following code snippet demonstrates the last two steps:

// Disable the participant on all the sequencers to remove any sequencer data associated with it
//  and allow sequencer pruning
sequencers.all.foreach(_.sequencer.disable_member(participant2))

// Remove the participant from the topology
domain.topology.participant_domain_states.authorize(
  TopologyChangeOp.Remove,
  domainId,
  participant2.id,
  RequestSide.From,
)

Decommissioning a Sequencer

Sequencers are part of a sync domain’s messaging infrastructure and do not store application contracts, so they are disposable as long as precautions are taken to avoid disrupting the synchronization services. This means, concretely, ensuring that:

  1. No active participant or active mediator is connected to the sequencer to be decommissioned.
  2. All active participants and mediators are connected to an active sequencer.

After that, the sequencer can be decommissioned by removing it from the sync domain’s topology and finally disposed of.
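
The two conditions above lend themselves to an automated pre-check. The sketch below is plain Scala over a hypothetical connection map (node name to the sequencers it is configured to use). It does not query Canton, and the names `SequencerChecks` and `violations` are illustrative only.

```scala
object SequencerChecks {
  /** Returns human-readable violations; an empty result means both conditions hold. */
  def violations(
      connections: Map[String, Set[String]], // active node -> sequencers it connects to
      activeSequencers: Set[String],
      toDecommission: String,
  ): Seq[String] = {
    val remaining = activeSequencers - toDecommission
    connections.toSeq.sortBy(_._1).flatMap { case (node, sequencers) =>
      // Condition 1: the node must no longer use the sequencer being decommissioned.
      val stillConnected =
        if (sequencers.contains(toDecommission))
          Seq(s"$node is still connected to $toDecommission")
        else Seq.empty
      // Condition 2: the node must be connected to some other active sequencer.
      val noActive =
        if (sequencers.intersect(remaining).isEmpty)
          Seq(s"$node is not connected to any other active sequencer")
        else Seq.empty
      stillConnected ++ noActive
    }
  }
}
```

A node connected only to the sequencer being decommissioned produces two violations, one per condition. Once every node has been switched to another active sequencer, the result is empty.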

Disconnecting all nodes from the sequencer to be decommissioned

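  • Change the sequencer connection of each mediator connected to the sequencer to be decommissioned to another active sequencer, using the sequencer_connection.set command:
val conn1 = sequencer1.sequencerConnection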
mediator1.sequencer_connection.set(SequencerConnections.single(conn1))
  • Change the domain manager’s sequencer connection to another active sequencer, using the sequencer_connection.set command:
val conn1 = sequencer1.sequencerConnection
domainManager1.sequencer_connection.set(conn1)
  • Reconnect participants to the sync domain, as described in domain connectivity, using a sequencer connection to another active sequencer:
participant2.domains.disconnect(domainAlias)
participant2.domains.modify(
  domainAlias,
  _.copy(sequencerConnections = SequencerConnections.single(sequencer1.sequencerConnection)),
)
participant2.domains.reconnect(domainAlias)

Decommissioning the sequencer

Sequencers are part of the sync domain by virtue of having their node ID equal to the domain ID, which also means they all have the same node ID. Since a sequencer’s identity is the same as the sync domain’s identity, you should leave identity and namespace mappings intact.

However, a sequencer may use its own cryptographic material distinct from other sequencers. In that case, owner-to-key mappings must be removed for the keys it exclusively owns:

  1. Find the keys on the sequencer to be decommissioned using the keys.secret.list command.
  2. Among those keys, find the ones not shared by other sequencers. You can do this by issuing the keys.secret.list command on each of the other sequencers: the fingerprints that appear only on the sequencer node to be decommissioned correspond to its exclusively owned keys.
  3. Remove the mappings for its exclusively owned keys using the topology.owner_to_key_mappings.authorize command.
def keyIdAndPurpose(privateKeyMetadata: PrivateKeyMetadata) =
  (privateKeyMetadata.id, privateKeyMetadata.purpose)

val sequencer2KeyIdsAndPurposes = sequencer2.keys.secret.list().map(keyIdAndPurpose)
val sequencer2ExclusiveKeyIdsAndPurposes =
  sequencer2KeyIdsAndPurposes.toSet.diff(
    sequencers.all
      .filter(_.name != sequencer2.name)
      .flatMap(_.keys.secret.list())
      .map(
        keyIdAndPurpose
      )
      .toSet
  )
sequencer2ExclusiveKeyIdsAndPurposes.foreach { case (keyId, keyPurpose) =>
  domainManager1.topology.owner_to_key_mappings
    .authorize(
      TopologyChangeOp.Remove,
      sequencer2.id,
      keyId,
      keyPurpose,
    )
}

// Avoids error logs in the decommissioned sequencer that are to be expected but may be confusing
sequencer2.stop()

// The increased timeout is needed here: once the sequencer is stopped, the other sequencers
//  need a bit of time to mark it as offline.
participant1.health.ping(participant2, timeout = 30.seconds)

// The sequencer and the locally stored cryptographic material can now be disposed of
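
To make the notion of exclusively owned keys concrete, here is a small worked example in plain Scala. It is only a sketch: key references are modelled as hypothetical `(fingerprint, purpose)` string pairs rather than Canton's key metadata, and `exclusiveTo` is an illustrative helper, not a console command.

```scala
object ExclusiveKeys {
  // Hypothetical stand-in for the (key id, purpose) pairs listed on each sequencer.
  type KeyRef = (String, String)

  /** Keys present on `target` that do not appear on any other node. */
  def exclusiveTo(target: String, keysByNode: Map[String, Set[KeyRef]]): Set[KeyRef] = {
    val others = (keysByNode - target).values.flatten.toSet
    keysByNode.getOrElse(target, Set.empty[KeyRef]).diff(others)
  }
}
```

If sequencer1 and sequencer2 both hold a key with fingerprint `fp-shared`, and only sequencer2 holds `fp-own`, then only `fp-own` is exclusive to sequencer2 and only its owner-to-key mapping would be removed.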

Finally, the cryptographic material exclusively owned by a decommissioned sequencer must also be disposed of:

  • If it was stored only on the decommissioned sequencer, it must be disposed of together with the decommissioned sequencer node.
  • However, if a decommissioned sequencer’s cryptographic material is managed via a KMS, it must be disposed of through the KMS; refer to your KMS documentation and internal procedures to handle this.

Decommissioning a Mediator

Mediators are also part of a sync domain’s messaging infrastructure and do not store application contracts, so they are disposable as long as precautions are taken to avoid disrupting the synchronization services. This means ensuring that at least one mediator remains on the sync domain.

If other mediators exist on the sync domain, a mediator can be decommissioned using a single console command, setup.offboard_mediator:

// `setup.offboard_mediator` will log a warning with a message similar to the following:
//
//  "Mediator ... is deactivated and will miss out on topology transactions. This will break it"
//
//  This just means that the decommissioned mediator won't be able to join the domain anymore, as it won't receive
//  topology transactions and its topology state will thus fork, which is an irreversible fault.
//  Off-boarding a mediator is indeed an irreversible operation, but here it is being intentionally and explicitly
//  requested, so in this case this warning can be safely ignored.
domainManager.setup.offboard_mediator(mediatorId)
mediator.stop()
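
Because at least one mediator must remain on the sync domain, it can be worth guarding the off-boarding call with an explicit check. The following plain Scala sketch assumes you already have the set of currently active mediator names. `MediatorChecks` and `canOffboard` are hypothetical names and do not call Canton.

```scala
object MediatorChecks {
  /** True if `target` is currently active and at least one other mediator would remain. */
  def canOffboard(activeMediators: Set[String], target: String): Boolean =
    activeMediators.contains(target) && (activeMediators - target).nonEmpty
}
```

The check fails for the last remaining mediator, which is the case the guidance above is meant to prevent.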