Skip to main content

Recipes

Overview

AI Guard recipes are reusable security checks and transformations designed for specific scenarios in your AI application's data flow. You can specify a recipe name in an API request to apply it to the provided content.

A recipe is a collection of configurable detectors, where each detector identifies a particular type of threat, such as personally identifiable information (PII), malicious entities, prompt injection, or toxic content, and applies a specified action.

A detector may consist of a single component that applies a single action. Some detectors may have multiple rules, where each rule detects and acts on a specific data type within the broader threat category. For example, the Confidential and PII detector can identify and apply actions to credit card numbers, email addresses, locations, and other sensitive data types.

Recipe settings

You can manage AI Guard recipes on the Recipes page in your Pangea User Console .

The AI Guard Recipes configuration page with the AI Guard Sandbox used in the Pangea User Console

  • Click the + Recipe button to add a new recipe.

  • Use the triple-dot button next to an existing recipe to:

    • Update the recipe's display name and description.

    • Clone the recipe.

    • Delete the recipe.

    • Manage the recipe's redaction settings.

      Currently, you can enable deterministic (reproducible) Format Preserving Encryption (FPE) as a redaction method. For details, see the Format Preserving Encryption (FPE) section under Redact actions below.

AI Guard Sandbox

The AI Guard Sandbox is an LLM chat-based UI for testing AI Guard recipes as you edit them.

  • User/System (dropdown) - Select either the "user" or "system" role to populate corresponding messages.
  • Enable LLM processing prompt (star button) - Enable or disable LLM processing. When disabled, only AI Guard processing is applied, and the message is not sent to the LLM.
  • Reset chat history (time machine button) - Clear chat history to test new input scenarios.
  • View request preview (< > button) - Preview the request that will be sent to AI Guard APIs.
  • View full response (< > button in the response window) - See the complete JSON response, including details about detections made and actions taken.
note

If your test prompt is blocked, the message history displayed in the response will not be carried over to the next prompt.

Pre-configured recipes for common use cases

The default AI Guard configuration includes four recipes tailored for common use cases. The recipe name to be used in an API call is displayed next to its display name.

  • User Input Prompt (pangea_prompt_guard) - Processes initial user input. By default, this recipe blocks prompt injection.
  • Ingestion (e.g. RAG) (pangea_ingestion_guard) - Analyzes data ingested in a Retrieval-Augmented Generation (RAG) system. By default, this recipe blocks prompt injection and certain malicious entities, confidential and PII data, and known secrets.
  • Pre LLM (pangea_llm_prompt_guard) - Validates final input submitted to an LLM, after adding context (for example, in a RAG system), but before the LLM receives the prompt. By default, this recipe blocks prompt injection and redacts certain confidential and PII data and known secrets.
  • LLM Response (pangea_llm_response_guard) - Filters and sanitizes AI-generated responses. By default, this recipe redacts certain confidential and PII data.
  • Agent Pre Plan (pangea_agent_pre_plan_guard) - Ensures that no prompt injections can influence or alter the agent's plan for solving a task. This recipe helps prevent any manipulation that could modify the agent’s approach or introduce unintended risks before task execution begins. By default, this recipe blocks prompt injection.
  • Agent Pre Tool (pangea_agent_pre_tool_guard) - Prevents malicious entities or sensitive information from being passed to the tool. This recipe mitigates the risk of exposing sensitive data and ensures harmful input is not sent to external tools or APIs. By default, this recipe blocks certain malicious entities and confidential and PII data.
  • Agent Post Tool (pangea_agent_post_tool_guard) - Prevents malicious entities or sensitive information from being present in the tool’s or agent's output before it is returned to the caller, passed to the next tool, or forwarded to another agent. By default, this recipe blocks certain malicious entities and confidential and PII data.

The out-of-the-box recipes serve as examples and starting points for your custom configuration. You can view each recipe’s functionality, the detectors it includes, and how they are configured in your Pangea User Console. From there, you can modify existing configurations or create new recipes to meet your security requirements.

Detectors

Within each recipe, you can enable, configure, and disable individual detectors.

AI Guard provides the following detectors:

Prompt Injection

Detects attempts to manipulate AI prompts with adversarial inputs. Supported actions:

  • Report Only
  • Block

Prompt Hardening (coming soon)

Strengthens prompts to resist manipulation and unauthorized modifications.

Malicious Entity

Detects harmful references such as malicious IPs, URLs, and domains. You can define individual rules for each of the three supported malicious entity types (IP Address, URL, Domain) and apply specific actions for each rule:

  • Report Only
  • Defang
  • Block
  • Disabled

Confidential and PII

Detects personally identifiable information (PII) and other confidential data, such as email addresses, credit cards, government-issued IDs, etc. You can add individual rules for each detection type, such as Email Address, US Social Security Number, Credit Card, etc., and apply specific actions to each rule:

  • Block
  • Replacement
  • Mask (<****>)
  • Partial Mask (****xxxx)
  • Report Only
  • Hash
  • Format Preserving Encryption

Secret and Key Entity

Detects sensitive credentials like API keys, encryption keys, etc. You can add individual rules for each of the supported secret types and apply specific actions to each rule:

  • Block
  • Replacement
  • Mask (<****>)
  • Partial Mask (****xxxx)
  • Report Only
  • Hash
  • Format Preserving Encryption

Profanity and Toxicity

Detects offensive, inappropriate, or toxic language. Supported actions:

  • Report Only
  • Block

Additionally, you can adjust the confidence threshold for detections, making the detector more or less sensitive.

Language

Detects spoken language and applies language-based security policies. You can create a list of supported languages and select an action for language detection:

  • Allow List
  • Block List
  • Report Only

Gibberish

Detects nonsensical or meaningless text to filter out low-quality or misleading inputs. Supported actions:

  • Report Only
  • Block

Additionally, you can adjust the confidence threshold for detections, making the detector more or less sensitive.

Code

Detects attempts to insert executable code into AI interactions. Supported actions:

  • Report Only
  • Block

Negative Sentiment

Detects text expressing negative emotions, such as anger, frustration, or dissatisfaction, to assess potential risks or harmful intent. Supported actions:

  • Report Only
  • Block

Additionally, you can adjust the confidence threshold for detections, making the detector more or less sensitive.

Self-Harm and Violence

Detects mentions of self-harm, violence, or dangerous behaviors. Supported actions:

  • Report Only
  • Block

Additionally, you can adjust the confidence threshold for detections, making the detector more or less sensitive.

Topic (coming soon)

Detects content related to restricted or disallowed topics.

Competitors (coming soon)

Detects mentions of competing brands or entities.

Custom Entity

Define multiple rules to detect specific text patterns or sensitive terms and apply specific actions for each rule:

  • Block
  • Replacement
  • Mask (<****>)
  • Partial Mask (****xxxx)
  • Report Only
  • Hash
  • Format Preserving Encryption

Actions

Actions associated with detectors may transform the submitted text by redacting or encrypting the detected rule matches. Blocking actions may prevent subsequent detectors from running. Detection may also result in no changes to the input or processing and only be reported.

All results of processing by AI Guard are included in the API response. Learn more about the response attributes on the AI Guard APIs documentation page.

Requests to AI Guard APIs are logged by the service. You can inspect the logs on the service's Activity Log page in your Pangea User Console .

The following actions are currently supported across different detectors:

Block

Blocking a detection results in reporting the blocked action, and if any action is blocked, the top-level blocked key in the API response is set to true. This indicates that the content returned from AI Guard should not be processed further by your application.

A blocking action can also stop execution and prevent some detectors from running.

Block all except

The Block all except option in the Language detector explicitly allows inputs only in the specified language(s).

Defang

Malicious IP addresses, URLs, or domains are modified to prevent accidental clicks or execution while preserving their readability for analysis. This helps reduce the risk of inadvertently accessing harmful content. For example, a defanged IP address may look like: 47[.]84[.]32[.]175.

Disabled

Prevents processing of a particular rule or entity type.

Report Only

The detection is reported in the API response, but no action is taken on the detected content.

Redact actions

Redact actions transform the detected text via configurable rules. For each rule, you can select a specific action and/or edit it by clicking on the rule name or the triple-dot button.

Use the Save button to apply your changes.

In the Test Rules pane on the right, you can validate your rules using different data types.

AI Guard Recipe rule with editing options for Partial Masking in the Pangea User Console

AI Guard Recipe rule edit form for Replacing in the Pangea User Console

Replacement

Replaces the rule-matching data with the Replacement Value selected in the rule's action.

Mask (<****>)

Replaces the rule-matching text with asterisks.

Partial Mask (****xxxx)

Partially replaces the rule-matching text with asterisks or a custom character. In the Edit Rule dialog, you can configure partial masking using the following options:

  • Masking Character - Specify the character for masking (for example, #).
  • Masking Options
    • Unmasked from left - Define the number of starting characters to leave unmasked. Use the input field or the increase/decrease UI buttons.
    • Unmasked from right - Define the number of ending characters to leave unmasked. Use the input field or the increase/decrease UI buttons.
  • Characters to Ignore - Specify characters that should remain unmasked (for example, -).
Hash

Replaces the detected text with hashed values. To enable hash redaction, click Enable Hash Redaction and create a salt value, which will be saved as a Vault secret.

Format Preserving Encryption (FPE)

Format Preserving Encryption (FPE) preserves the format of redacted data while making it recoverable. For details on the FPE redaction method, visit the Redact documentation pages.

In AI Guard recipe settings, you can enable Deterministic Format Preserving Encryption (FPE) in the Manage Redact Settings dialog, accessed via the triple-dot menu next to the recipe name. From there, you can create or select a custom tweak value for the FPE redaction method.

note

A tweak is an additional input used alongside the plaintext and encryption key to enhance security. It makes it harder for attackers to use statistical methods to break the encryption. Different tweak values produce different outputs for the same encryption key and data. The original tweak value used for encryption is required to decrypt the data.

Using a custom tweak ensures that the same original value produces the same encrypted value on every request, making it deterministic. If no tweak value is provided, a random string is generated, and the encrypted value will differ on each request.

Whether you use a custom or randomly generated tweak, it is returned in the API response as part of the fpe_context attribute, which you can use to decrypt and recover the original value.

Learn how to decrypt FPE-redacted values on the APIs documentation page.

Was this article helpful?

Contact us