
Tamperproofing

Learn about Tamperproofing

Tamperproofing is a key feature of the Secure Audit Log Service. The goal of the tamperproofing feature is to provide proof of the following:

  1. Events sent to the Secure Audit Log service are intact and have not been modified, altered, or otherwise changed en route or at rest.
  2. The entire Audit Log is untampered - no historical events have been deleted, and no new events have been backdated or inserted into the log history.

The Tamperproofing feature uses Merkle Trees to track every event that is sent to the Pangea Secure Audit Log Service. Merkle Trees have been used for decades, more recently as the basis of many blockchain implementations. They provide an efficient method of verifying that data belongs to a specified data set.
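As a rough illustration of how a Merkle Tree reduces a list of events to one root hash, here is a minimal sketch. It is not Pangea's tree construction: the use of plain SHA-256 over concatenated child hashes, the promotion of an unpaired node to the next level, and the names `hash_pair` and `merkle_root` are all assumptions made for the example.

```python
import hashlib


def hash_pair(left: bytes, right: bytes) -> bytes:
    """Hash two child nodes into their parent (assumed: SHA-256 of left + right)."""
    return hashlib.sha256(left + right).digest()


def merkle_root(leaf_hashes: list[bytes]) -> bytes:
    """Fold a list of leaf hashes into a single root hash.

    Assumption: an unpaired node at the end of a level is carried up unchanged.
    """
    if not leaf_hashes:
        raise ValueError("at least one leaf hash is required")
    level = list(leaf_hashes)
    while len(level) > 1:
        # Hash adjacent pairs; range stops before a trailing unpaired node.
        parents = [hash_pair(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            parents.append(level[-1])
        level = parents
    return level[0]


# Hypothetical events, hashed to form the leaves.
events = [b"event-0", b"event-1", b"event-2"]
leaves = [hashlib.sha256(e).digest() for e in events]
root = merkle_root(leaves)

# Changing any single event changes the root hash.
tampered = [hashlib.sha256(e).digest() for e in (b"event-0", b"event-X", b"event-2")]
print(root.hex() != merkle_root(tampered).hex())
```

Because every leaf feeds into the root, a verifier who trusts the root only needs a handful of sibling hashes to check a single event, rather than the whole log.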

Create tamperproof log events

To create a tamperproof log event:

  1. Canonicalize the event JSON.
  2. Sign the canonicalized event. [Optional]
  3. Add the event to the audit log.
  4. Publish a new root hash.

Event canonicalization

When a log event is sent to Secure Audit Log, the event JSON must first be canonicalized. JSON does not guarantee key order or a consistent representation of the data it contains, which is a problem for cryptographic operations such as hashing: the same event serialized with different key ordering, spacing, and so on produces a different hash each time. This makes plain JSON unsuitable for consistency validation via a calculated hash. The JSON Canonicalization Scheme (JCS) was developed to resolve this problem. JCS defines a deterministic way of serializing JSON data, which is why all audit events are canonicalized before being hashed and added to the Merkle Tree.
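The effect of canonicalization on hashing can be sketched with the Python standard library. This is only an approximation of JCS (the full scheme also prescribes exact number formatting and a specific key sort order), and it is not the SDK's canonicalization method. The event fields, the function names, and the use of SHA-256 for the event hash are assumptions for illustration.

```python
import hashlib
import json


def canonicalize(event: dict) -> bytes:
    """Approximate JCS: sorted keys, no insignificant whitespace, UTF-8 output."""
    return json.dumps(
        event,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
        allow_nan=False,  # NaN and Infinity are not valid JSON
    ).encode("utf-8")


def event_hash(event: dict) -> str:
    """Hex digest of the canonicalized event (assumed: SHA-256)."""
    return hashlib.sha256(canonicalize(event)).hexdigest()


# The same hypothetical event, built with two different key orders.
a = {"message": "user login", "actor": "alice", "status": "ok"}
b = {"status": "ok", "actor": "alice", "message": "user login"}

print(json.dumps(a) == json.dumps(b))   # naive serialization depends on key order
print(event_hash(a) == event_hash(b))   # canonical form hashes identically
```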

When using the /log endpoint, the verbose mode can be used to see the canonicalized JSON that the Audit service uses to generate an event hash.

Additionally, the SDK includes a method to canonicalize the event data on the client side. This can be useful for validating that the canonicalized JSON that was hashed matches what was originally sent.

Sign the canonicalized event [Optional]

Event signing uses public key cryptography to digitally sign an event envelope. The event envelope contains all of the event's details supplied to the API, that is, everything in the event dictionary. To sign events, a public and private key pair must be generated. The private key is used to sign a hash of the canonicalized JSON event. The public key is supplied with each signed event so that it can be retrieved later and used to verify the integrity and authenticity of the event.

Add the event to the Audit Log

When events are added to the Audit Log, a new root hash is not immediately published to the immutable ledger. During this interim period, the unpublished root hashes can be supplied to the /log endpoint's prev_root parameter so that their tamperproof status can be verified (see Validate the consistency proof). After one hour or 10,000 events (whichever comes first), a new root hash is published to the immutable public ledger on Arweave.

Publish a new root hash

After one hour or 10,000 events (whichever comes first), a new root hash and consistency proof are published to Arweave. Arweave is an immutable public blockweave that Pangea uses as a public repository of all historical root hashes.

The data published to Arweave is represented as follows:

{
  "published_at": <timestamp>,
  "root_hash": <root_hash>,
  "size": 3,
  "consistency_proof": [
    "x: <node_hash>,r: <proof_hash>"
  ]
}
note

Pangea also keeps a local copy of all published root hashes and consistency proofs as backup.

Publish to Arweave

To guarantee that logs are tamperproof, root hashes must be published to an immutable third party with a verifiable history. This allows customers to have evidence that Pangea's root hash has not been recalculated to compensate for any modified or missing audit data.

Validate tamperproofing

Validating tamperproofing is a vital security measure for Pangea Audit Logs. The process confirms that the data within the log remains unchanged and authentic. By supporting tamperproofing validation, Pangea can maintain transparency and accountability in its operations.

To validate tamperproofing:

  1. Validate the membership proof.
  2. Validate the consistency proof.
  3. Validate the event signature.

Validate the membership proof

When log events are returned via the /search and /results endpoints, the event hash, membership proof, and root hash can optionally be returned with each result. The format of the membership proof is as follows:

[l:hash, r:hash, ...]

The l and r keys indicate on which side of the hashing operation the proof hashes should be used. To properly validate the membership proof, one should start with the event hash and combine the hashes on the appropriate side of the sha256 hashing operation until all values in the membership proof have been used. The resulting hash can then be compared against the current root hash, returned as root_hash with the search results. If the root hash matches the calculated hash, the record can be verified as untampered.
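The chaining procedure described above can be sketched as follows. This is an illustration, not the SDK's validation code: it assumes the proof arrives as a comma-separated string of `l:<hex>` and `r:<hex>` items, that hashes are hex-encoded, and that two hashes are combined by applying SHA-256 to their concatenated raw bytes. The function names are hypothetical.

```python
import hashlib


def hash_pair(left: bytes, right: bytes) -> bytes:
    """Combine two hashes (assumed: SHA-256 over the concatenated raw bytes)."""
    return hashlib.sha256(left + right).digest()


def parse_membership_proof(proof: str) -> list[tuple[str, bytes]]:
    """Parse "l:<hex>,r:<hex>,..." into (side, hash) pairs."""
    items = []
    for part in proof.split(","):
        side, sep, hex_hash = part.strip().partition(":")
        if not sep or side.strip() not in ("l", "r"):
            raise ValueError(f"malformed proof item: {part!r}")
        items.append((side.strip(), bytes.fromhex(hex_hash.strip())))
    return items


def verify_membership(event_hash_hex: str, proof: str, root_hash_hex: str) -> bool:
    """Chain the proof hashes onto the event hash and compare with the root."""
    node = bytes.fromhex(event_hash_hex)
    for side, proof_hash in parse_membership_proof(proof):
        # "l" places the proof hash on the left of the operation, "r" on the right.
        node = hash_pair(proof_hash, node) if side == "l" else hash_pair(node, proof_hash)
    return node == bytes.fromhex(root_hash_hex)


# Hand-built four-leaf tree used as stand-in data.
leaves = [hashlib.sha256(f"event-{i}".encode()).digest() for i in range(4)]
n01 = hash_pair(leaves[0], leaves[1])
n23 = hash_pair(leaves[2], leaves[3])
root = hash_pair(n01, n23)

# Proof for leaf 2: its sibling (leaf 3) on the right, then n01 on the left.
proof = f"r:{leaves[3].hex()},l:{n01.hex()}"
print(verify_membership(leaves[2].hex(), proof, root.hex()))
```

If the event hash or any proof hash is altered, the chained result no longer matches the root hash and verification fails.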

note

Some customers may want to calculate the event hash themselves instead of trusting the returned event hash. It is important to first canonicalize the event JSON before calculating the event hash.

Validate the consistency proof

note

The SDK provides tools for you to perform all validation of the consistency proofs for unpublished and published root hashes.

The consistency proof is used to validate that a more recent root hash is a continuation of a previous root hash. A consistency proof, root hash, tree name, and tree size are always returned by the /search endpoint for events with a published root hash.

The /log endpoint may also return consistency proofs. This occurs when a previously unpublished root is provided to the prev_root parameter of the /log endpoint. To obtain a root hash from the /log endpoint, the verbose parameter must be set to true. This will return a new, unpublished root hash and membership proof with every event logged. By providing a previous unpublished root, data lineage from root to root can be proven with consistency proofs prior to a new root being published to Pangea's immutable public ledger.

The consistency proof is represented as a list of nodes and their associated proofs. The number of nodes depends on the size of the old tree from which the previous root hash was formed. If the tree was complete, meaning it had 2^n nodes, only one node, the root hash, will be returned; otherwise, multiple nodes will be returned.
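One way to picture why the old tree's size determines the number of nodes: a tree whose size is not a power of two can be split into complete subtrees, one per set bit in the binary form of the size. The sketch below computes that split. It assumes, without confirmation from the service, that the proof contains one node per complete subtree; the function names are hypothetical.

```python
def perfect_subtree_sizes(tree_size: int) -> list[int]:
    """Split a tree size into complete (power-of-two) subtree sizes, largest first."""
    if tree_size < 1:
        raise ValueError("tree_size must be a positive integer")
    sizes = []
    bit = 1 << (tree_size.bit_length() - 1)  # highest set bit
    while bit:
        if tree_size & bit:
            sizes.append(bit)
        bit >>= 1
    return sizes


def is_complete(tree_size: int) -> bool:
    """A tree is complete when its size is a single power of two."""
    return len(perfect_subtree_sizes(tree_size)) == 1


for size in (4, 5, 7, 8):
    print(size, perfect_subtree_sizes(size), is_complete(size))
```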

The consistency proof is represented as such:

    "consistency_proof": [
      "x: <node_hash>, r: <proof_hash>",
      ...
    ]
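A small parser makes the entry format concrete. This sketch assumes each entry is a comma-separated string in which the `x` item carries the node hash and any `l` or `r` items form that node's membership proof, in order. The function name and the sample values are placeholders, not real hashes.

```python
def parse_consistency_entry(entry: str) -> tuple[str, list[tuple[str, str]]]:
    """Split "x: <node_hash>, r: <proof_hash>, ..." into (node_hash, proof items)."""
    node_hash = None
    proof = []
    for part in entry.split(","):
        key, sep, value = part.partition(":")
        key, value = key.strip(), value.strip()
        if not sep or not value:
            raise ValueError(f"malformed item: {part!r}")
        if key == "x":
            node_hash = value
        elif key in ("l", "r"):
            proof.append((key, value))
        else:
            raise ValueError(f"unknown key: {key!r}")
    if node_hash is None:
        raise ValueError("entry has no x (node hash) item")
    return node_hash, proof


# Placeholder values standing in for hex-encoded hashes.
node_hash, proof = parse_consistency_entry("x: aa11, r: bb22, l: cc33")
print(node_hash, proof)
```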

There are two key steps to verifying a consistency proof.

  1. Verify that the previously published hash can be obtained with the node hashes. If only one node hash (represented by the x key) is returned, this amounts to comparing the node hash provided to the previously published root hash. If multiple node hashes are returned, each node hash must be chained to obtain the previously published root.

You can obtain the previous root hash by querying the Arweave GraphQL API. A search by the tree_size and tree_name tags can be performed to find the published Arweave data. If looking for a previous root hash, the tree size to search for is the returned tree size minus one. For example, if a tree size of five was returned, then the previous root hash would be found by searching for a tree size of four.

An example Arweave GraphQL query is as follows:

{
  transactions(
    tags: [
      { name: "tree_size", values: ["<tree_size>"] }
      { name: "tree_name", values: ["<tree name>"] }
    ]
  ) {
    edges {
      node {
        id
        tags {
          name
          value
        }
      }
    }
  }
}
  2. Verify that each node hash returned by the consistency proof is a member of the current tree. This can be done by chaining the hashes included in the membership proofs of each node hash and validating that the result matches the returned root hash.
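The two verification steps can be sketched together as shown below. This is an illustration under stated assumptions, not the SDK's validation routine: entries are assumed to be already parsed into a node hash plus a list of (side, hash) proof items, hashes are combined with SHA-256 over concatenated raw bytes, and the node hashes are assumed to be ordered so that each later node is placed on the left when chaining toward the previous root. All names are hypothetical.

```python
import hashlib

ProofItems = list[tuple[str, bytes]]


def hash_pair(left: bytes, right: bytes) -> bytes:
    """Combine two hashes (assumed: SHA-256 over the concatenated raw bytes)."""
    return hashlib.sha256(left + right).digest()


def chain_membership(node: bytes, proof: ProofItems) -> bytes:
    """Apply a membership proof to a node hash and return the resulting root."""
    for side, proof_hash in proof:
        node = hash_pair(proof_hash, node) if side == "l" else hash_pair(node, proof_hash)
    return node


def verify_consistency(prev_root: bytes, new_root: bytes,
                       entries: list[tuple[bytes, ProofItems]]) -> bool:
    if not entries:
        return False
    # Step 1: chain the node hashes to rebuild the previously published root.
    candidate = entries[0][0]
    for node_hash, _ in entries[1:]:
        candidate = hash_pair(node_hash, candidate)  # assumed chaining order
    if candidate != prev_root:
        return False
    # Step 2: every node hash must be a member of the current tree.
    return all(chain_membership(node, proof) == new_root for node, proof in entries)


# Stand-in data: an old tree of three leaves that grew to four leaves.
leaves = [hashlib.sha256(f"event-{i}".encode()).digest() for i in range(4)]
n01 = hash_pair(leaves[0], leaves[1])
n23 = hash_pair(leaves[2], leaves[3])
old_root = hash_pair(n01, leaves[2])
new_root = hash_pair(n01, n23)

entries = [
    (leaves[2], [("r", leaves[3]), ("l", n01)]),  # single-leaf subtree of the old tree
    (n01, [("r", n23)]),                          # two-leaf subtree of the old tree
]
print(verify_consistency(old_root, new_root, entries))
```

When only one entry is returned, step 1 reduces to comparing that node hash directly with the previously published root hash.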

Validate the event signature

note

The SDK provides tools for you to perform all validation of event signatures.

When signed Audit events are returned via the /search endpoint, a public key is returned with each event in addition to a signature of the canonicalized JSON event. The public key can be used to verify the signature and, thus, the event's authenticity. For extra verification, the private key (which is not provided by the /search endpoint) can be used to rebuild the canonicalized event signature. The two signatures can then be compared to ensure that they match.
