AI Security
Securing your AI app
Skilled attackers can manipulate generative AI applications in unexpected ways, leading to the disclosure of sensitive or harmful information. This can happen through direct prompt injection, model poisoning, or the indirect, subtle planting of malicious content within the application’s data. To secure your generative AI applications, it’s essential to focus on all key areas: user prompts, underlying models and data, and response outputs.
Our tutorials explore risks and countermeasures to reduce the attack surface of your generative AI applications.
- We start by applying filtering and transformation to user inputs and context data to mitigate the risks of prompt manipulation.
- With response guardrails, we sanitize outputs to protect your applications and users in changing conditions without the need for model retraining.
- Implementing robust access controls makes it safe for your application to interact with enterprise data.
- Logging and monitoring of prompts, responses, and system events, including LLM details, enables attribution and accountability for outcomes that may require further analysis.
- Moving into the model, we can disassemble, sanitize, and reassemble the training data to reduce the chance of sensitive or malicious information being incorporated into the system.
Securing prompts and responses
Protecting your AI app begins with understanding the boundaries of your prompt and response systems.
User interactions with your generative AI application can pose significant risks to your organization. Skilled attackers could manipulate conversation contexts, while well-meaning users might unintentionally input sensitive data. LLM responses can contain sensitive or harmful content due to unsanitized training data, model overfitting, data poisoning, or insufficient controls in Retrieval-Augmented Generation (RAG) systems. Safeguards within the model and system instructions may not always keep pace with evolving prompt manipulation tactics and other attacks.
Given the non-deterministic nature of LLMs, eliminating unexpected behavior is challenging. However, you can improve security and compliance by controlling the data that enters and exits your system at inference time.
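As a rough sketch of that pattern, the example below wraps a model call with checks on both the inbound prompt and the outbound response. The `call_llm` stand-in, the regex rule, and the injection heuristics are illustrative placeholders only, not Pangea's API; in the tutorial below, AI Guard and Prompt Guard perform the actual detection and redaction.

```python
import re

# Stand-in for your model call (for example, a LangChain chat model's invoke()).
def call_llm(prompt: str) -> str:
    return f"model response to: {prompt}"

# Illustrative rules only; real detection is far more involved.
CREDIT_CARD = re.compile(r"\b(?:\d[ -]?){13,16}\b")
INJECTION_HINTS = ("ignore previous instructions", "reveal your system prompt")

def guard_input(prompt: str) -> str:
    """Reject or transform risky user input before it reaches the model."""
    lowered = prompt.lower()
    if any(hint in lowered for hint in INJECTION_HINTS):
        raise ValueError("Prompt rejected: possible injection attempt")
    return CREDIT_CARD.sub("<redacted>", prompt)  # strip obvious PII

def guard_output(text: str) -> str:
    """Sanitize the model's response before it reaches the user."""
    return CREDIT_CARD.sub("<redacted>", text)

def answer(prompt: str) -> str:
    return guard_output(call_llm(guard_input(prompt)))
```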
The LLM Prompt & Response Guardrails tutorial explains how to leverage Pangea's composable security services to effectively monitor and control the exposure of sensitive or harmful content in a Python LangChain application. We’ll start with a simple application and progressively integrate Pangea’s services to secure interactions and protect your models, users, and organization.
This tutorial addresses risks from the OWASP Top 10 for LLMs and Generative AI Apps.
Authentication and authorization
Generative AI applications can enhance their content and functionality by enabling LLMs to use external tools, plugins, or additional context provided by Retrieval-Augmented Generation (RAG).
However, this additional data may include information that should be accessible only to authorized users. Unlike public generative AI apps, enterprise AI applications require Identity and Access Management (IAM) controls to protect sensitive information. Without these controls, malicious users could gain unauthorized access to confidential data, such as financial records, forward-looking documents, and personnel files.
Access control can depend on data sensitivity levels, ownership, intended audience, user groups, and entitlements. Security boundaries can be applied at different stages of data consumption, depending on the business model and available access controls. Implementing these boundaries semantically can be complex or impractical. Instead, defining and enforcing authorization policies outside of LLM-driven functionality brings us back to a familiar cybersecurity framework centered on user roles, group memberships, relationships to data objects, and attributes existing in your application.
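In code, this usually reduces to an ordinary policy check applied before retrieved content is allowed into a prompt. The sketch below illustrates the idea with hypothetical `User` and `Document` types and a simple group-membership rule; the tutorials in this section delegate the decision to Pangea's AuthZ service instead of hard-coding it.

```python
from dataclasses import dataclass, field

# Hypothetical application types; substitute the user and document models you already have.
@dataclass
class User:
    id: str
    groups: set[str] = field(default_factory=set)

@dataclass
class Document:
    text: str
    allowed_groups: set[str] = field(default_factory=set)

def is_authorized(user: User, doc: Document) -> bool:
    """Policy decision made outside the LLM: shared group membership grants read access."""
    return bool(user.groups & doc.allowed_groups)

def build_context(user: User, retrieved: list[Document]) -> str:
    """Only documents the user is allowed to see ever reach the prompt."""
    return "\n\n".join(d.text for d in retrieved if is_authorized(user, d))
```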
If there are only a few sensitive data categories, separate databases can be used to manage access. However, users often require access to multiple categories, resulting in "many-to-many" relationships that separate databases alone cannot handle.
Alternatively, sensitive data can be tagged with access attributes at ingestion time using metadata.
Finally, user-specific resources can be accessed at inference time, allowing for permission checks and user consent before retrieving data from sources like Google Drive.
Depending on application requirements, data sources, authorization models, performance considerations, and environment, different approaches may be more suitable.
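As an illustration of the metadata-tagging approach mentioned above, the sketch below stores an access attribute with each chunk at ingestion time and filters retrieval results against the user's entitlements before they are used as context. The in-memory store and attribute names are hypothetical; most vector stores, and the LangChain tutorials below, support equivalent metadata filters natively.

```python
# Illustrative in-memory "vector store" records: text plus access metadata
# attached at ingestion time. Real stores keep embeddings as well.
chunks = [
    {"text": "Q3 revenue forecast ...", "metadata": {"category": "finance"}},
    {"text": "Employee onboarding guide ...", "metadata": {"category": "hr"}},
    {"text": "Public product FAQ ...", "metadata": {"category": "public"}},
]

def retrieve(query: str, allowed_categories: set[str]) -> list[str]:
    """Return only chunks whose access attribute matches the user's entitlements."""
    # A real implementation would rank by embedding similarity to `query`;
    # here we only demonstrate the metadata filter.
    return [
        c["text"]
        for c in chunks
        if c["metadata"]["category"] in allowed_categories
    ]

context = retrieve("What is our Q3 outlook?", allowed_categories={"finance", "public"})
```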
The Identity and Access Management in LLM apps tutorial demonstrates using Pangea's services to add authentication and authorization to a Python LangChain application. This enables you to identify users, manage their access, and securely add proprietary context to their interactions with the LLM. We’ll begin with a RAG application and implement a basic authorization model based on metadata created at ingestion time.
The Centralized Access Management in RAG Apps tutorial covers integrating enterprise documents from various sources for semantic search in a Python LangChain RAG application. It shows how you can leverage Pangea's AuthZ to manage permissions from the original data sources centrally and enable potential access expansion to other users.
Private information used as context for LLM prompts may come from multiple sources. By capturing and maintaining original user permissions as AuthZ policies, you can implement unified access control over data from different sources and share it securely with registered users. AuthZ also serves as an abstraction layer, allowing data from any source to be shared with any application user by adding necessary authorization policies. Moreover, a single AuthZ schema can provide consistent access control across multiple applications.
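One way to picture this is as a set of relationship tuples captured from the original sources at ingestion time and consulted at query time. The tuple format and helper functions below are simplified, hypothetical stand-ins for AuthZ's schema and check API, shown only to convey the idea of a single, source-independent policy layer.

```python
# Simplified policy store: (subject, relation, resource) tuples captured from
# the original sources (for example, file sharing settings in Google Drive).
policies: set[tuple[str, str, str]] = {
    ("alice@example.com", "reader", "drive:budget-2025.xlsx"),
    ("bob@example.com", "owner", "confluence:roadmap"),
}

def grant(subject: str, relation: str, resource: str) -> None:
    """Expand access later by adding a policy, regardless of the data's origin."""
    policies.add((subject, relation, resource))

def can_read(subject: str, resource: str) -> bool:
    """Query-time check; owners can read as well."""
    return any(
        (subject, relation, resource) in policies for relation in ("reader", "owner")
    )

grant("carol@example.com", "reader", "confluence:roadmap")
assert can_read("carol@example.com", "confluence:roadmap")
```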
The Secure Cloudflare RAG Chatbot App on Cloudflare Pages, Workers AI, and Vectorize tutorial guides you through setting up and deploying an example Retrieval-Augmented Generation (RAG) web chat application on Cloudflare Pages, secured with Pangea's services:
- Unified control over RAG data is achieved using AuthZ.
- User authentication is implemented via AuthN.
- LLM inputs and outputs are analyzed, sanitized, and blocked if necessary, leveraging Prompt Guard and AI Guard.
- Application logs are captured and secured in Secure Audit Log.
This tutorial primarily focuses on maintaining AuthZ policies based on the permissions applied to the original documents used in RAG, similar to the Centralized Access Management in RAG Apps tutorial. However, unlike the Python example, this app is a fully featured Next.js web application that can serve as a robust template for your future projects.
These tutorials address risks from the OWASP Top 10 for LLMs and Generative AI Apps.
Attribution and accountability
When your AI application does not support your business (or worse, damages it), you need to know who did what, when, where, and in which context. To analyze undesirable outcomes, breaches, or threats, you must capture the essential state of the system, such as:
- LLM model
- Authentication events
- Prompts and responses
- Data vectors and ingestion events
- Agent state and messages
Your audit trail should be tamper-proof, integrate seamlessly with existing monitoring systems, support automated log retrieval via APIs, and provide SDKs for easy integration.
The Attribution and Accountability in LLM Apps tutorial explains how to log application and runtime event data during inference in a Python LangChain application using Pangea's Secure Audit Log.
Capturing details about user interactions with your AI applications in a tamper-proof, centralized way enables threat analysis, helps you monitor application activity and reproduce undesirable behavior, and supports compliance with regulations.
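As a minimal sketch of such a logging call, the snippet below assumes the pangea-sdk Python package and its Audit service client; the exact parameters, environment variables, and audit schema fields depend on your SDK version and Secure Audit Log configuration.

```python
import os

from pangea.config import PangeaConfig
from pangea.services import Audit

# Assumes a Secure Audit Log token and your Pangea domain are set in the environment.
token = os.getenv("PANGEA_AUDIT_TOKEN", "")
config = PangeaConfig(domain=os.getenv("PANGEA_DOMAIN", "aws.us.pangea.cloud"))
audit = Audit(token, config=config)

# Record an inference event; field names depend on your audit schema.
response = audit.log(
    message="LLM inference completed for user alice@example.com",
    verbose=True,
)
print(response.result)
```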