Redaction Rules

Understand the Redaction rules

About Ruleset

A ruleset is a set of redaction rules organized by category. Currently, only one default ruleset is available. Once the upcoming custom rules feature is released, users will be able to create new rules and organize them into rulesets as required.

Redaction Rules

Redaction rules define how data will be matched and subsequently replaced by the Redact service. There are 24 out-of-the-box redaction rules, which fall into two types.

NLP-based rules

NLP-based rules use natural language processing and trained models to identify matching fields from the provided data. The details of these rules cannot be viewed when clicking view rule.

Regex-based rules

Regex-based rules use regular expressions to identify matching fields from the provided data. Clicking view rule will expose the specific regex or set of regexes that are used to identify matching data.

Rule Details

In the case of regex-based rules, the details of the rules can be viewed and inspected. The rule is broken down into several key areas.

Name

This is the name of the rule as it appears in the ruleset list.

Description

This is a description of the rule and its intended purpose.

Matches

This is where the regex or regexes that the rule uses to identify matches are defined.

Context values

In some cases, the strength of a rule match may be bolstered by language appearing near the matched data. For example, a 9-digit number with the word "SSN" near it more strongly indicates that an SSN has been correctly identified.

Default Redaction Threshold

When matches are identified, each one is assigned a match score. The match score is calculated from the regex that identified the data and from any context values that occur near the matching data. The score is on a 0-1 scale. The Redaction Threshold defines the score at which redaction occurs.

For instance, if the Redaction Threshold is set to 0.6 and a rule matches with a score of 0.5, the matching data will not be redacted.
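The interaction between context values and the threshold can be illustrated with a minimal sketch. The scoring numbers, the context window, the 9-digit pattern, and the rule that a score equal to the threshold triggers redaction are all assumptions made for illustration; the service's actual scoring is not documented here.

```python
import re

# Hypothetical values for illustration; not the service's actual rule or scoring.
SSN_PATTERN = re.compile(r"\b\d{9}\b")
CONTEXT_WORDS = ("ssn", "social security")
BASE_SCORE = 0.5      # assumed score for a bare regex match
CONTEXT_BOOST = 0.3   # assumed boost when a context value appears nearby
WINDOW = 30           # assumed number of characters searched on each side


def score_matches(text):
    """Return (matched_text, score) pairs for each 9-digit match."""
    results = []
    for m in SSN_PATTERN.finditer(text):
        nearby = text[max(0, m.start() - WINDOW): m.end() + WINDOW].lower()
        score = BASE_SCORE
        if any(word in nearby for word in CONTEXT_WORDS):
            score += CONTEXT_BOOST
        results.append((m.group(), round(score, 2)))
    return results


def should_redact(score, threshold):
    """Assumes redaction occurs when the score meets or exceeds the threshold."""
    return score >= threshold
```

With a threshold of 0.6, the bare match in this sketch (0.5) would be left alone, while the match with "SSN" nearby (0.8) would be redacted.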

Replacement Value

If a Redaction Method of Replacement is used, the text in the replacement value is the text that will replace the matched data.

Redaction Method

The Redaction Method defines how data will be replaced as matches are identified. These are the available redaction methods:

Replacement

This redaction method will replace the matching data with the "replacement value" defined in the rule.

Mask

This redaction method will replace the matching data with asterisks or a custom character.

Partial Masking

This redaction method will partially replace your text with asterisks or a custom character. Masking options let you control the following:

  1. The number of characters at the left or right of the text that are left unmasked.
  2. Which characters are ignored by masking and therefore left unmasked.

The following fields are presented when you choose the Partial Masking redaction method:

  • Masking Character: Enter a character to mask the text. For example, enter #.

  • Masking Options:

    • Unmasked from left: Enter the number of starting characters to leave unmasked on matched values, or use the increase/decrease buttons.
    • Unmasked from right: Enter the number of ending characters to leave unmasked on matched values, or use the increase/decrease buttons.
  • Characters to ignore: Enter the characters to exclude from masking, for example -, and then click the Add button.

When you are done, click the Save button.

On the right pane, click the Test Rules button. You can see the results displayed in the Redacted Text tab or Details tab.
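The Partial Masking options described above can be sketched in a few lines. This is an illustration only: the function name `partial_mask` is hypothetical, and the sketch assumes that the left and right counts are positions in the full matched string (ignored characters included), which may differ from how the service counts them.

```python
def partial_mask(value, mask_char="*", unmasked_left=0, unmasked_right=0, ignore=""):
    """Mask `value`, keeping the first/last N characters and any ignored characters."""
    n = len(value)
    out = []
    for i, ch in enumerate(value):
        keep = (
            i < unmasked_left            # within the unmasked prefix
            or i >= n - unmasked_right   # within the unmasked suffix
            or ch in ignore              # explicitly excluded from masking
        )
        out.append(ch if keep else mask_char)
    return "".join(out)
```

For example, `partial_mask("123-45-6789", "#", 0, 4, "-")` leaves the last four digits and the hyphens visible and masks everything else with `#`.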

Detect Only

This redaction method identifies matching text without redacting it. Each match increments the redaction count and is recorded in the redaction report.
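As a rough illustration of Detect Only, the sketch below counts matches and returns the text unchanged. The email pattern is a simplified stand-in and the function name `detect_only` is hypothetical; neither reflects the service's internals.

```python
import re

# Simplified email pattern for illustration; not the regex used by the service.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def detect_only(text):
    """Return the original text untouched, plus the number of matches found."""
    count = len(EMAIL.findall(text))
    return text, count
```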

Hash

The Hash redaction method replaces matching text with hashed values. Values are salted to strengthen protection against attacks on stored data, and the salt is managed securely as a Vault secret. To enable the Hash redaction method, click Configure Redaction Hash, and then click Generate Salt Secret.

Important

If the Vault feature is already enabled, you will see the Generate Salt Secret button. If the Vault feature is not enabled, you will instead see the Enable Vault & Generate Salt button.

When using this method, the matching text will be replaced with hashed values, computed with the SHA-256 algorithm. The configuration option generates a secret value that is used to salt the hashes. Salting makes reverse lookups of hashed values harder, which improves security.
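A minimal sketch of salted SHA-256 hashing, using Python's standard `hashlib`, shows why the salt matters. How the service combines the salt with the value is not documented here; prepending the salt is an assumption, and `hash_redact` is a hypothetical name.

```python
import hashlib


def hash_redact(value, salt):
    """Return the hex SHA-256 digest of salt + value.

    Prepending the salt is an assumption made for this sketch.
    """
    data = salt.encode("utf-8") + value.encode("utf-8")
    return hashlib.sha256(data).hexdigest()
```

The same value and salt always produce the same digest, so redacted values stay correlatable, while a precomputed table of unsalted hashes would not match the output.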

Choose the redaction method

In some cases, the context for what was redacted may be helpful. In these cases, Replacement is a good choice, as the replacement value will indicate the type of data that was redacted.

However, Replacement by nature leaks some information about what existed before redaction. If zero-knowledge redaction is required, use Mask, as it gives no indication of the type of data that was redacted.
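The trade-off between the two methods can be seen side by side in a small sketch. The email pattern is a simplified stand-in, the function names are hypothetical, and the length-preserving mask is an assumption; the service may mask differently.

```python
import re

# Simplified email pattern for illustration; not the regex used by the service.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def redact_replacement(text, replacement="<EMAIL_ADDRESS>"):
    """Replace each match with a label that reveals the type of data."""
    return EMAIL.sub(lambda m: replacement, text)


def redact_mask(text, mask_char="*"):
    """Replace each match with mask characters (length-preserving by assumption)."""
    return EMAIL.sub(lambda m: mask_char * len(m.group()), text)
```

In this sketch the Replacement output tells a reader that an email address was present, whereas the Mask output shows only that something was removed.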

Test the redact rules in Pangea Console

The Pangea Console supports the testing of any enabled or disabled rules. To test a rule:

  1. In the Pangea Console, go to the Rulesets page under Redact.
  2. Click Test Rules.

A dialog will appear with various options for testing rules.

  1. Select the rule or rules you'd like to test from the dropdown menu.
  2. Enter the text you want to test against in the input field below the dropdown. The text should contain data that matches the rule you selected. For example, if you chose "Credit Card", enter a credit card number along with other text to confirm that the rule works as expected. Once you have added the data, click Test Rules.
  3. The test yields a response:
    • Redacted Data - The data you entered in the input field with the sensitive portion (for example, the credit card number) redacted.
    • Details - The JSON response sent by the Redact API.

Test the redact rules in API Reference

You can also test redact rules using the interactive Redact API Reference. You'll need to copy the rule names from the Pangea Console into the input fields in the API Reference. Follow the steps below:

  1. In the Pangea Console, go to the Rulesets page under Redact.
  2. Click on a set of rules. A dialog with configuration details will appear.
  3. Decide which rule(s) you want to test and use the copy icon to copy the rule name as a string onto your clipboard. You can test more than one rule at a time.
  4. Go to the Redact API Reference.
  5. Enter data into the parameter fields (or choose to Load Sample Data).
  6. In the rules parameter, paste the string you copied onto your clipboard. You can enter an array of strings to test several rules at once.

In this example, the EMAIL_ADDRESS rule is used as the redaction rule. The API request includes a string of text that contains an email address. You can see in the JSON response that the email address has been redacted.

curl -sSLX POST 'https://redact.'"$PANGEA_DOMAIN"'/v1/redact' \
-H 'Authorization: Bearer '"$PANGEA_REDACT_TOKEN" \
-H 'Content-Type: application/json' \
-d '{"text":"My name is Dennis Nedry and my email is you.didnt.say.the.magic.word@gmail.com","debug":false,"rules":["EMAIL_ADDRESS"]}'

{
  "request_id": "prq_63ooica2rmk5nmv3cg6yp3iohb4ukugp",
  "request_time": "2023-02-03T18:56:37.236505Z",
  "response_time": "2023-02-03T18:56:37.393045Z",
  "status": "Success",
  "summary": "Success. Redacted 1 item(s) from text",
  "result": {
    "redacted_text": "My name is Dennis Nedry and my email is <EMAIL_ADDRESS>",
    "count": 1
  }
}
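The same request can be sketched in Python using only the standard library. The URL, headers, and body fields mirror the curl command above; the function names `build_redact_request` and `redact` are hypothetical helpers, not part of any Pangea SDK, and error handling is omitted.

```python
import json
import os
import urllib.request


def build_redact_request(domain, token, text, rules, debug=False):
    """Build the POST request for /v1/redact without sending it."""
    url = f"https://redact.{domain}/v1/redact"
    body = json.dumps({"text": text, "debug": debug, "rules": rules}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def redact(text, rules):
    """Send the request using the same environment variables as the curl example."""
    req = build_redact_request(
        os.environ["PANGEA_DOMAIN"],
        os.environ["PANGEA_REDACT_TOKEN"],
        text,
        rules,
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

To test several rules at once, pass each copied rule name as a separate string in the `rules` list.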
