Daml Driver for PostgreSQL¶
Daml Driver for PostgreSQL is a PostgreSQL-based Daml ledger implementation.
The Community Edition, distributed as a JAR file called daml-on-sql-<version>.jar, can be obtained from the Daml GitHub Releases Page.
Setup PostgreSQL and run¶
Before starting, you need to perform the following steps:
- create an initially empty PostgreSQL database that Daml Driver for PostgreSQL can access
- create a database user for Daml Driver for PostgreSQL that has authority to execute DDL operations
This is because Daml Driver for PostgreSQL manages its own database schema, applying migrations if necessary when upgrading versions.
To specify the PostgreSQL instance you wish to connect, use the
--sql-backend-jdbcurl <value> command line option, where
<value> is a
valid JDBC URL containing the username, password and database name to connect
to (for example,
If you do not want to pass the jdbc credentials on the command line, you can also
define them in an environment variable and then specify the environment variable to
use through the
--sql-backend-jdbcurl-env <environment-variable> option where
<environment-variable> is the name of the environment variable where the connection
string string has been specified. The connection string you store in the environment
variable should be of the same form as mentioned above (for example,
If you want to setup the database schema for the Daml Driver for PostgreSQL with a
more highly permissioned database service account that is able to create database schemas
and tables but only run your application with an account with lesser permissions, you
can leverage the combination of the connection string specifying the account to use
--sql-start-mode <value> parameter specifying the mode to run the driver.
The possible modes available are
migrate-only - Create database schema but do not start the driver
migrate-and-start - Create database schema and start the driver. If the database already exists, this will simply validate and start. This is the default mode
You will also need to provide a ledger ID with the
--ledgerid flag, which
must be the same upon restart. This value is expected in many API endpoints, to
ensure ledger clients are connecting to the correct ledger.
Due to possible conflicts between the
& character and various shells, we
recommend quoting the JDBC URL in the terminal, as follows:
$ java -jar daml-on-sql-<version>.jar --ledgerid=test --sql-backend-jdbcurl='jdbc:postgresql://localhost/test?user=fred&password=secret'
If you are not familiar with JDBC URLs, we recommend reading the PostgreSQL JDBC documentation for more information.
Architecture and availability¶
Processes and components¶
The core processes necessary to run a Daml Driver for PostgreSQL deployment are:
- the Daml Driver for PostgreSQL server, and
- the PostgreSQL server used to persist the ledger data.
Daml Driver for PostgreSQL communicates with the external world via the gRPC Ledger API and communicates with PostgreSQL via JDBC to persist transactions, keep track of active contracts, store compiled Daml packages, and so on.
Server hardware and software requirements¶
Daml Driver for PostgreSQL is provided as a self-contained JAR file, containing the application and all dependencies. The application is routinely tested with OpenJDK 8 on a 64-bit x86 architecture, with Ubuntu 16.04, macOS 10.15, and Windows Server 2016.
In production, we recommend running on a 64-bit x86 architecture in a Linux environment. Core requirements in such a situation include:
- a Java SE Runtime Environment such as OpenJDK JRE - must be compatible with OpenJDK version 1.8.0_202 or later
- OpenSSL 1.1 or later, made available to the above JRE
- glibc, made available to the above JRE
As a Java-based application, Daml Driver for PostgreSQL can work on other operating systems and architectures supporting a Java Runtime Environment. However, such an environment will not have been tested and may cause issues.
Core architecture considerations¶
The backing PostgreSQL server performs a lot of work which is both CPU- and IO-intensive: all (valid) Ledger API requests will eventually hit the database. At the same time, the Daml Driver for PostgreSQL server has to have available resources to validate requests, evaluate commands and prepare responses. While the PostgreSQL schema is designed to be as efficient as possible, practical experience has shown that having dedicated computation and memory resources for the two core components (the ledger server and the database server) allows the two to run without interfering with each other. Depending on the kind of deployment you wish to make, this can be achieved with containerization, virtualization or simply using physically different machines. Still, the Ledger API communicates abundantly with the database server and many Ledger API requests need to go all the way to persist information on the database. To reduce the latency necessary to serve outstanding requests, the Daml for PostgreSQL server and PostgreSQL server should be physically co-located.
Core availability considerations¶
In order to address availability concerns, it is important to understand what each of the core components do and how they interact with each other, in particular regarding state and consistency.
Having two Daml Driver for PostgreSQL servers running on top of a single PostgreSQL server can lead to undefined (and likely broken) behavior. For this reason, you must maintain a strict 1:1 relationship between a running Daml for PostgreSQL server and a running PostgreSQL server. Note that using PostgreSQL in a high-availability configuration does not allow you to run additional Daml for PostgreSQL servers.
Downtime for the Daml Driver for PostgreSQL server can be minimized using a watchdog or orchestration system taking care of evaluating its health of the core components and ensuring its availability. The Ledger API exposes the standard gRPC health checkpoint that can be used to evaluate the health status of the Ledger API component. More information on the endpoint can be found at the documentation for gRPC.
When overloaded, the ledger will attempt to refuse additional requests, instead
responding with a
RESOURCE_EXHAUSTED error. This error represents
backpressure, signaling to the client that they should back off and try again
later. Well-behaving clients will therefore allow the ledger to catch up with
outstanding tasks and resume normal operations.
Scale the ledger and associated services¶
The command-line interface provides multiple configuration parameters to help tune for availability and performance.
--max-inbound-message-size. You can use this parameter to increase (or decrease) the maximum size of a GRPC message. Often, DARs or transactions can become larger than the default of 4194304 bytes (4 MB). Increasing this will allow for larger transactions, at the expense of processing time.
--events-page-size. When streaming transactions, the API server will query the database in pages defaulting to a size of 1000. Increasing the page size can increase performance on servers with enough available memory.
--max-commands-in-flight. Increasing the maximum number of commands in flight will allow the API server to support more concurrent synchronous writes per party, at the expense of greater CPU and memory usage. The default maximum is 256, after which clients will receive a
Clients can also increase the number of concurrent requests by using the asynchronous endpoints for command submission and completion.
--max-parallel-submissions. Increasing the maximum number of parallel submissions from the default will allow for a larger queue of commands, but will also increase the CPU and memory demands of the ledger. The default maximum is 512, after which clients will receive a
--max-lf-value-translation-cache-entries. In production, it is typical for many requests to be similar, resulting in the transaction verification and translation layer repeating a lot of work. Specifying a value for the translation cache allows the results of some of this repetitive work to be cached. The value represents the number of cached entries.
This parameter can be tuned by observing its metrics, described below.
Security and privacy¶
In Daml Driver for PostgreSQL, all data is kept centrally by the operator of the deployment. Thus, it is their responsibility to ensure that the data is treated with the appropriate care so to respect confidentiality and the applicable regulations.
The ledger operator is advised to use the tools available to them to not divulge private user data, including those documented by PostgreSQL, to protect data at rest and using a secure communication channel between the Daml Driver for PostgreSQL server and the PostgreSQL server.
Ledger API over TLS¶
To protect data in transit and over untrusted networks, the Ledger API leverages TLS support built into gRPC to allow clients to verify the identity of the server and encrypt the communication channel over which the Ledger API requests and responses are sent.
To enable TLS, you need to specify the private key for your server and the
certificate chain via
java -jar daml-on-sql-<version>.jar --pem server.pem --crt server.crt.
By default, the Ledger API requires client authentication as well. You can set a
custom root CA certificate used to validate client certificates via
You can change the client authentication mode via
--client-auth none which
will disable it completely,
--client-auth optional which makes it optional
or specify the default explicitly via
To enable certificate revocation checking using the Online Certificate Status
Protocol (OCSP) use
To start the ledger, run:
java -jar daml-on-sql-<version>.jar [options].
To see all the available options, run:
java -jar daml-on-sql-<version>.jar --help.
Daml Driver for PostgreSQL uses the industry-standard Logback for logging. You can read more on how to set it up in the Daml Driver for PostgreSQL CLI reference and the Logback documentation.
The logging infrastructure leverages structured logging as implemented by the Logstash Logback Encoder.
Each logged event carries information about the request being served by the Ledger API server (e.g. the command identifier). When using a traditional logging target (e.g. standard output or rotating files) this information will be part of the log description. Using a logging target compatible with the Logstash Logback Encoder allows to have rich logs with structured information about the event being logged.
Enable and configure reporting¶
To enable metrics and configure reporting, you can use the following command-line interface options:
--metrics-reporter: passing a legal value will enable reporting; the accepted values are as follows:
console: prints captured metrics to standard output
csv://</path/to/metrics.csv>: saves the captured metrics in CSV format at the specified location
graphite://<server_host>[:<server_port>][/<metric_prefix>]: sends captured metrics to a Graphite server. If the port is omitted, the default value
2003will be used. A
metric_prefixcan be specified, causing all metrics to be reported with the specified prefix.
--metrics-reporting-interval: metrics are pre-aggregated within the Daml for PostgreSQL server and sent to the reporter, this option allows the user to set the interval. The formats accepted are based on the ISO-8601 duration format
PnDTnHnMn.nSwith days considered to be exactly 24 hours. The default interval is 10 seconds.
Types of metrics¶
This is a list of type of metrics with all data points recorded for each. Use this as a reference when reading the list of metrics.
An individual instantaneous measurement.
Number of occurrences of some event.
A meter tracks the number of times a given event occurred. The following data points are kept and reported by any meter.
<metric.qualified.name>.count: number of registered data points overall
<metric.qualified.name>.m1_rate: number of registered data points per minute
<metric.qualified.name>.m5_rate: number of registered data points every 5 minutes
<metric.qualified.name>.m15_rate: number of registered data points every 15 minutes
<metric.qualified.name>.mean_rate: mean number of registered data points
An histogram records aggregated statistics about collections of events. The exact meaning of the number depends on the metric (e.g. timers are histograms about the time necessary to complete an operation).
<metric.qualified.name>.mean: arithmetic mean
<metric.qualified.name>.stddev: standard deviation
<metric.qualified.name>.p75: 75th percentile
<metric.qualified.name>.p95: 95th percentile
<metric.qualified.name>.p98: 98th percentile
<metric.qualified.name>.p99: 99th percentile
<metric.qualified.name>.p999: 99.9th percentile
<metric.qualified.name>.min: lowest registered value overall
<metric.qualified.name>.max: highest registered value overall
Histograms only keep a small reservoir of statistically relevant data points to ensure that metrics collection can be reasonably accurate without being too taxing resource-wise.
Unless mentioned otherwise all histograms (including timers, mentioned below) use exponentially decaying reservoirs (i.e. the data is roughly relevant for the last five minutes of recording) to ensure that recent and possibly operationally relevant changes are visible through the metrics reporter.
max values are not affected by the reservoir sampling
You can read more about reservoir sampling and possible associated policies in the Dropwizard Metrics library documentation.
A timer records all metrics registered by a meter and by an histogram, where the histogram records the time necessary to execute a given operation (unless otherwise specified, the precision is nanoseconds and the unit of measurement is milliseconds).
A “database metric” is a collection of simpler metrics that keep track of relevant numbers when interacting with a persistent relational store.
These metrics are:
<metric.qualified.name>.wait(timer): time to acquire a connection to the database
<metric.qualified.name>.exec(timer): time to run the query and read the result
<metric.qualified.name>.query(timer): time to run the query
<metric.qualified.name>.commit(timer): time to perform the commit
<metric.qualified.name>.translation(timer): if relevant, time necessary to turn serialized Daml-LF values into in-memory objects
A “cache metric” is a collection of simpler metrics that keep track of relevant numbers when interacting with an in-memory cache.
These metrics are:
<metric.qualified.name>.hits(counter): the number of cache hits
<metric.qualified.name>.misses(counter): the number of cache misses
<metric.qualified.name>.load_successes(counter): the number of times a new value is successfully loaded into the cache
<metric.qualified.name>.load_failures(counter): the number of times a new value fails to be loaded into the cache
<metric.qualified.name>.load_total_time(timer): the total time spent loading new values into the cache
<metric.qualified.name>.evictions(counter): the number of cache evictions
<metric.qualified.name>.evicted_weight(counter): the total size of the values evicted from the cache
<metric.qualified.name>.size(gauge): the size of the cache
<metric.qualified.name>.weight(gauge): the total size of all values currently in the cache
List of metrics¶
The following is a non-exhaustive list of selected metrics that can be particularly important to track.
A meter. Number of deduplicated commands.
A meter. Number of delayed submissions (submission who have been evaluated to transaction with a ledger time farther in the future than the expected latency).
A meter. Number of commands that have been deemed unacceptable by the interpreter and thus rejected (e.g. double spends)
A timer. Time to fully process a submission (validation, deduplication and interpretation) before it is handed over to the ledger to be finalized (either committed or rejected).
A meter. Number of submissions that are currently being handled by the ledger API server (including validation, deduplication, interpretation, and handing the transaction to the ledger).
A meter. Number of submission that pass validation and are further sent to deduplication and interpretation.
A timer. Time to validate submitted commands before they are fed to the Daml interpreter.
A counter. The capacity of the queue accepting submissions on the CommandService for a given party.
A counter. The number of currently pending submissions on the CommandService for a given party.
A timer. Measures the queuing delay for pending submissions on the CommandService.
A counter. The capacity of the queue tracking completions on the CommandService for a given party.
A counter. The number of currently pending completions on the CommandService for a given party.
A timer. Time spent by the engine fetching the packages of compiled Daml code necessary for interpretation.
A histogram. Number of active contracts fetched for each processed transaction.
A timer. Time to fetch all active contracts necessary to process each transaction.
A timer. Time to fetch each individual active contract during interpretation.
A histogram. Number of contract keys looked up for each processed transaction.
A timer. Time to lookup all contract keys necessary to process each transaction.
A timer. Time to lookup each individual contract key during interpretation.
A meter. Overall number of interpretation retries attempted due to mismatching ledger effective time.
A timer. Time spent interpreting a valid command into a transaction ready to be submitted to the ledger for finalization (includes executing Daml and fetching data).
A meter. Number of commands that are currently being interpreted (includes executing Daml code and fetching data).
A meter. Number of commands that are currently being executed by the Daml engine (excluding fetching data).
This namespace holds a number of interesting metrics about the connection pool used to communicate with the persistent store that underlies the index.
These metrics include:
daml.index.db.connection.sandbox.pool.Wait(timer): time spent waiting to acquire a connection
daml.index.db.connection.sandbox.pool.Usage(histogram): time spent using each acquired connection
daml.index.db.connection.sandbox.pool.TotalConnections(gauge): number or total connections
daml.index.db.connection.sandbox.pool.IdleConnections(gauge): number of idle connections
daml.index.db.connection.sandbox.pool.ActiveConnections(gauge): number of active connections
daml.index.db.connection.sandbox.pool.PendingConnections(gauge): number of threads waiting for a connection
A timer. Time spent persisting deduplication information to ensure the continued working of the deduplication mechanism across restarts.
A database metric. Time spent retrieving a page of active contracts to be served from the active contract service. The page size is configurable, please look at the CLI reference.
A database metric. Time spent retrieving a page of command completions to be served from the command completion service. The page size is configurable, please look at the CLI reference.
A database metric. Time spent retrieving a page of flat transactions to be streamed from the transaction service. The page size is configurable, please look at the CLI reference.
A database metric. Time spent retrieving the current ledger end. The count for this metric is expected to be very high and always increasing as the indexed is queried for the latest updates.
A database metric. Time spent retrieving the ledger identifier.
A database metric. Time spent retrieving a page of flat transactions to be streamed from the transaction service. The page size is configurable, please look at the CLI reference.
A database metric. Load the currently allocated parties so that they are served via the party management service.
A database metric. Time spent loading a package of compiled Daml code so that it is given to the Daml interpreter when needed.
A database metric. Time to load the current entries in the log of configuration entries. Used to verify whether a configuration has been ultimately set.
A database metric. Time to load the current entries in the log of package uploads. Used to verify whether a package has been ultimately uploaded.
A database metric. Load the currently uploaded packages so that they are served via the package management service.
A database metric. Load the currently allocated parties so that they are served via the party service.
A database metric. Time to load the current entries in the log of party allocations. Used to verify whether a party has been ultimately allocated.
A database metric. Time to fetch one contract on the index to be used by the Daml interpreter to evaluate a command into a transaction.
A database metric. Time to fetch the configuration so that it is served via the configuration management service.
A database metric. Time to lookup one contract key on the index to be used by the Daml interpreter to evaluate a command into a transaction.
A database metric. Time to lookup a single flat transaction by identifier to be served by the transaction service.
A database metric. Time spent looking up the ledger effective time of a transaction as the maximum ledger time of all active contracts involved to ensure causal monotonicity.
A database metric. Time to lookup a single transaction tree by identifier to be served by the transaction service.
A database metric. Time spent removing deduplication information after the expiration of the deduplication window. Deduplication information is persisted to ensure the continued working of the deduplication mechanism across restarts.
A database metric. Time spent removing deduplication information after the failure of a command. Deduplication information is persisted to ensure the continued working of the deduplication mechanism across restarts.
A database metric. Time spent persisting a change in the ledger configuration provided through the configuration management service.
A database metric. Time spent persisting a transaction that has been successfully interpreted and is final.
A database metric. Time spent storing a Daml package uploaded through the package management service.
A database metric. Time spent storing party information as part of the party allocation endpoint provided by the party management service.
A database metric. Time spent persisting the information that a given command has been rejected.
A cache metric. Measurements around the optional Daml-LF value translation cache.
Every metrics under this namespace is a timer, one for each service exposed by the Ledger API, in the format:
As in the following example:
Single call services return the time to serve the request, streaming services measure the time to return the first response.
jvm namespace there is a collection of metrics that tracks
important measurements about the JVM that the server is running on, including
CPU usage, memory consumption and the current state of threads.
Daml Ledger Model Compliance¶
Daml Driver for PostgreSQL is tested regularly against the Daml Ledger API Test Tool to verify that the ledger implements correctly the Daml semantics and to check its performance envelope.
On top of bespoke unit and integration tests, Daml Driver for PostgreSQL is thoroughly tested with the Ledger API Test Tool to ensure that the implementation correctly implements the Daml semantics.
These tests check that all the services which are part of the Ledger API behave as expected, with a particular attention to ensure that issuing commands and reading transactions respect the confidentiality and privacy guarantees defined by the Daml Ledger Model.
Furthermore, this implementation is regularly tested to comply with the Daml Ledger Implementation Performance Envelope tests.
In particular, the tests are run to ensure that Daml Driver for PostgreSQL can:
- process transactions as large as 1 MB
- have a tail latency no greater than 1 second when issuing 20 pings
- have a throughput of 20 pings per second
You can read more on performance tests in the documentation of the Ledger API Test Tool.
Replicate performance envelope tests¶
The following setup has been used to run the performance envelope tests:
- PostgreSQL server: a GCP Cloud SQL managed instance using PostgreSQL 12, with 1 vCPU, 3.75 GB of RAM, 250 MB/s of network throughput, a 10 GB SDD HD, 1.2 MB/s of R/W disk throughput, 8 RIOPS and 15 WIOPS, no automatic failover or disk increase, default PostgreSQL 12 configuration.
- Daml Driver for PostgreSQL server: a GCP N1-Standard-1 instance, with 1 vCPU, 3.75 GB of RAM, Ubuntu 20.04 LTS (64 bit), 10 GB boot disk, OpenJDK 1.8.0_242
- Ledger API test tool client: a GCP F1-Micro instance, with 1 shared vCPU, 614 MB of RAM, Ubuntu 20.04 LTS (64 bit), 10 GB boot disk, OpenJDK 1.8.0_242
The three instances were in the same region and availability zone to minimize the latency between the three.
The tests run to evaluate the performance envelope are:
Please refer to the documentation for the Ledger API Test Tool to learn how to run these tests.