Managing Latency and Throughput

Problem Definition

Latency is a measure of how long a business transaction takes to complete. Throughput measures the average number of business transactions that can be processed per second, taking into account any lower or upper bounds that may point to bottlenecks. Defenses against latency and throughput issues can be built into the Daml application during design.

First we need to identify the potential bottlenecks in a Daml application. We can do this by analyzing the domain-specific transactions.

Each Daml business transaction kicks off when a Ledger API client sends a create or exercise command to a participant.

Important

  • Ledger transactions are not synonymous with business transactions.
  • Often a complete business transaction spans multiple workflow steps and thus multiple ledger transactions.
  • Multiple business transactions can be processed in a single ledger transaction through batching.
  • Expected ledger transaction latencies are on the order of 0.5-1 seconds on database sequencers, and multiple seconds on blockchain sequencers.

Refer to the Daml execution model, which describes how a ledger transaction is processed by the Canton ledger. The table below highlights potentially resource-intensive activities at each step.

| Step | Participant | Resources used | Possible bottleneck drivers |
|------|-------------|----------------|-----------------------------|
| Interpretation | Submitting participant node | CPU, memory, DB read access | Calculation complexity; size and number of variables; number of contract fetches |
| Blinding | Submitting participant node | CPU, memory | Number and size of views |
| Submission | Submitting participant node; sequencer | CPU, memory | Serialization/deserialization; transaction size/number of views |
| Sequencing | Sequencer | Backend storage, network bandwidth | Transaction size; transaction size/number of views |
| Validation | Receiving participant nodes | Network bandwidth, CPU, memory, DB read throughput | Transaction size (download, deserialization, and storage costs); computation complexity; number of contract fetch reads; number and size of variables |
| Confirmation | Validating participant nodes; sequencer | Network bandwidth, sequencer network, backend write throughput | Number of confirming parties |
| Mediation | Mediator nodes | Network throughput, CPU, memory | Number of confirming parties |
| Commit | Mediator nodes; sequencer | CPU, memory, DB, network bandwidth | Number of confirming parties |

Possible Throughput Bottlenecks in Order of Likelihood

  1. Transaction size causing high serialization/deserialization and encryption/decryption costs on participant nodes.
  2. Transaction size causing sequencer backend overload, especially on blockchains.
  3. High interpretation and validation cost due to calculation complexity or memory use.
  4. A large number of involved nodes and the associated network bandwidth load on the sequencer.

Latency can also be affected by the above factors. However, baseline latency usually has more to do with system setup issues (DB or blockchain latency) than with Daml modeling problems.

Solutions

  1. Minimize transaction size.

Each of the following actions in Daml adds a node, containing the payload of the contract being acted on, to the transaction. A large number of such operations, or operations of this kind on large contracts, are the most common cause of performance bottlenecks.

  • create
  • fetch
  • fetchByKey
  • lookupByKey
  • exercise
  • exerciseByKey
  • archive

Use the above actions sparingly. If contracts have intermediary states within a transaction, you can often skip them by writing only the end state. For example:

import DA.Action (foldlA)

template Incrementor
  with
    p : Party
    n : Int
  where
    signatory p

    choice Increment : ContractId Incrementor
      controller p
      do create this with n = n + 1

    -- This adds all m-1 intermediary versions of
    -- the contract to the transaction tree.
    choice BadIncrementMany : ContractId Incrementor
      with m : Int
      controller p
      do foldlA (\self' _ -> exercise self' Increment) self [1..m]

    -- This only adds the end result to the transaction.
    choice GoodIncrementMany : ContractId Incrementor
      with m : Int
      controller p
      do create this with n = n + m

When you need to read a contract, or act on a single contract in multiple ways, you can often bundle those operations into a single action. For example:

template Asset
  with
    issuer : Party
    owner : Party
    quantity : Decimal
  where
    signatory [issuer, owner]

    -- BadMerge acts on each of the otherCids three times:
    -- once for validation,
    -- once to extract the quantity,
    -- once to archive.
    choice BadMerge : ContractId Asset
      with otherCids : [ContractId Asset]
      controller owner
      do
        -- Validate the cids.
        forA_ otherCids (\cid -> do
          other <- fetch cid
          assert (other.issuer == issuer && other.owner == owner))

        -- Extract the quantities.
        quantities <- forA otherCids (\cid -> do
          other <- fetch cid
          return other.quantity)

        -- Archive the others.
        forA_ otherCids archive

        create this with quantity = quantity + sum quantities

    -- Allows a fetch and an archive in one action.
    choice ConsumingFetch : Asset
      controller owner
      do return this

    -- GoodMerge acts on each of the other assets only once.
    choice GoodMerge : ContractId Asset
      with otherCids : [ContractId Asset]
      controller owner
      do
        -- Fetch and archive the others in a single exercise.
        others <- forA otherCids (`exercise` ConsumingFetch)

        -- Validate.
        forA_ others (\other ->
          assert (other.issuer == issuer && other.owner == owner))

        -- Extract the quantities.
        let quantities = map (.quantity) others

        create this with quantity = quantity + sum quantities

Use separate templates for large payloads that change rarely and require minimal access, and different templates for fields that change with almost every action. This optimizes resource consumption across multiple business transactions.
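
For instance, a large, rarely-changing payload can live on its own contract that a small, frequently-updated contract references by contract ID, so routine updates never copy the large payload into a transaction. A minimal sketch of this split follows; the ReferenceData and HotState templates and their fields are illustrative names, not part of the original example:

-- Hypothetical split: the large, rarely-changing payload
-- lives on its own contract.
template ReferenceData
  with
    p : Party
    largePayload : Text -- stands in for a big, rarely-changing value
  where
    signatory p

-- The small, frequently-updated contract points to the payload
-- by contract ID instead of embedding it.
template HotState
  with
    p : Party
    refCid : ContractId ReferenceData
    n : Int
  where
    signatory p

    -- Each update rewrites only this small contract; the large
    -- ReferenceData payload never enters the transaction because
    -- it is neither fetched nor recreated here.
    choice Bump : ContractId HotState
      controller p
      do create this with n = n + 1

Choices that genuinely need the payload can still fetch it on demand, paying its cost only in the transactions that read it.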

  2. Batch commands where possible. Batching processes multiple business-level updates in one transaction submission rather than requiring a separate submission for each update. Note: this option can cause a small increase in latency and may increase the possibility of command failure, but careful batch design can avoid this. For example:

import DA.Action (replicateA)
import Daml.Script

template T
  with
    p : Party
  where
    signatory p

    choice Foo : ()
      controller p
      do return ()

batching : Script ()
batching = do
  p <- allocateParty "p"

  -- Without batching, these updates take 10 ledger transactions.
  cid1 <- submit p do createCmd T with ..
  cid2 <- submit p do createCmd T with ..
  cid3 <- submit p do createCmd T with ..
  cid4 <- submit p do createCmd T with ..
  cid5 <- submit p do createCmd T with ..

  submit p do exerciseCmd cid1 Foo
  submit p do exerciseCmd cid2 Foo
  submit p do exerciseCmd cid3 Foo
  submit p do exerciseCmd cid4 Foo
  submit p do exerciseCmd cid5 Foo

  -- With batching, there are only two ledger transactions.
  cids <- submit p do
    replicateA 5 $ createCmd T with ..
  submit p do
    forA_ cids (`exerciseCmd` Foo)

  3. For CPU and memory issues, use the Daml profiler to analyze Daml code execution.
  4. Once you are confident that interpretation is not the bottleneck, scale up your machine.

Tip

Profile the JVM and monitor your databases to see where the bottlenecks occur.