AI Security

Securing your AI app

Skilled attackers can manipulate generative AI applications in unexpected ways, leading to the disclosure of sensitive or harmful information. This can happen through direct prompt injection, through model poisoning, or through malicious content planted in the application’s data in indirect, subtle ways. To secure your generative AI applications, it’s essential to focus on all key areas: user prompts, underlying models and data, and response outputs.

Our tutorials explore risks and countermeasures to reduce the attack surface of your generative AI applications.

  • We start by applying filtering and transformation to user inputs and context data to mitigate the risks of prompt manipulation.
  • With response guardrails, we sanitize outputs to protect your applications and users as conditions change, without the need for model retraining.
  • Implementing robust access controls makes it safe for your application to interact with enterprise data.
  • Logging and monitoring of prompts, responses, and system events, including LLM details, enables attribution and accountability for outcomes that may require further analysis.
  • Moving into the model, we can disassemble, sanitize, and reassemble the training data to reduce the chance of sensitive or malicious information being incorporated into the system.

Securing prompts and responses

Protecting your AI app begins with understanding the boundaries of your prompt and response systems.

User interactions with your generative AI application can pose significant risks to your organization. Skilled attackers could manipulate conversation contexts, while well-meaning users might unintentionally input sensitive data. LLM responses can contain sensitive or harmful content due to unsanitized training data, model overfitting, data poisoning, or insufficient controls in Retrieval-Augmented Generation (RAG) systems. Safeguards within the model and system instructions may not always keep pace with evolving prompt manipulation tactics and other attacks.

Given the non-deterministic nature of LLMs, eliminating unexpected behavior is challenging. However, you can improve security and compliance by controlling the data that enters and exits your system at inference time.
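For example, an inference-time filter can redact known sensitive patterns from a prompt before it reaches the model and apply the same check to the response before it reaches the user. The following is a minimal sketch using only the Python standard library; the pattern list and the call_llm stub are illustrative placeholders, not any particular product's API.

```python
import re

# Hypothetical stand-in for your model call; wire this to your LLM provider's client.
def call_llm(prompt: str) -> str:
    raise NotImplementedError

# Illustrative patterns for values that should never reach the model or the user.
REDACTION_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace matches with a typed placeholder so raw identifiers never cross the boundary."""
    for label, pattern in REDACTION_PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text

def guarded_completion(user_input: str) -> str:
    safe_prompt = redact(user_input)   # filter what enters the system
    response = call_llm(safe_prompt)
    return redact(response)            # filter what exits the system
```

The same boundary is a natural place to add deny-lists, topic classifiers, or provider-specific moderation checks as your guardrail requirements grow.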

Authentication and authorization

Generative AI applications can enhance their content and functionality by enabling LLMs to use external tools, plugins, or additional context provided by Retrieval-Augmented Generation (RAG).

However, this additional data may include information that should be accessible only to authorized users. Unlike public generative AI apps, enterprise AI applications require Identity and Access Management (IAM) controls to protect sensitive information. Without these controls, malicious users could gain unauthorized access to confidential data, such as financial records, forward-looking documents, and personnel files.

Access control can depend on data sensitivity levels, ownership, intended audience, user groups, and entitlements. Security boundaries can be applied at different stages of data consumption, depending on the business model and available access controls. Enforcing these boundaries semantically, within the LLM itself, can be complex or impractical. Instead, defining and enforcing authorization policies outside of LLM-driven functionality brings us back to a familiar cybersecurity framework centered on user roles, group memberships, relationships to data objects, and attributes that already exist in your application.
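As a rough illustration, the authorization decision can be an ordinary function over user groups and document access lists, evaluated before any retrieved text is added to the prompt context. The types and field names below are assumptions made for the sketch, not a prescribed schema.

```python
from dataclasses import dataclass, field

@dataclass
class User:
    user_id: str
    groups: set[str] = field(default_factory=set)

@dataclass
class Document:
    doc_id: str
    text: str
    allowed_groups: set[str] = field(default_factory=set)

def authorized(user: User, doc: Document) -> bool:
    """Policy decision made outside the LLM: shared group membership grants access."""
    return bool(user.groups & doc.allowed_groups)

def retrieve_context(user: User, candidates: list[Document]) -> list[Document]:
    # Only documents the caller is entitled to see are added to the prompt context.
    return [doc for doc in candidates if authorized(user, doc)]
```

Because the policy lives outside the model, it can be tested, audited, and updated like any other access-control code.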

If there are only a few sensitive data categories, separate databases can be used to manage access. However, users often require access to multiple categories, resulting in "many-to-many" relationships that separate databases alone cannot handle.

Alternatively, sensitive data can be tagged with access attributes at ingestion time using metadata.
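A minimal sketch of this approach, assuming a simple list-backed store and an illustrative three-level sensitivity scheme, might look like the following. Real deployments would typically rely on a vector store's native metadata filters instead.

```python
def ingest(chunks: list[str], sensitivity: str, owner_group: str, store: list[dict]) -> None:
    """Tag each chunk with access attributes as metadata at ingestion time."""
    for text in chunks:
        store.append({
            "text": text,
            "metadata": {"sensitivity": sensitivity, "owner_group": owner_group},
        })

# Assumed ordering of sensitivity levels, lowest to highest.
LEVELS = ["public", "internal", "confidential"]

def query(store: list[dict], user_groups: set[str], max_sensitivity: str = "internal") -> list[dict]:
    """Filter retrieved chunks by the caller's clearance level and group membership."""
    allowed = set(LEVELS[: LEVELS.index(max_sensitivity) + 1])
    return [
        item for item in store
        if item["metadata"]["sensitivity"] in allowed
        and (item["metadata"]["sensitivity"] == "public"
             or item["metadata"]["owner_group"] in user_groups)
    ]
```

Because the attributes travel with the data, entitlement changes take effect at query time without re-indexing.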

Finally, user-specific resources can be accessed at inference time, allowing for permission checks and user consent before retrieving data from sources like Google Drive.
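A sketch of that flow might check consent first and then fetch with the user's own delegated credentials, so the source system enforces its own permissions. The fetch_user_resource function below is a hypothetical stand-in for the actual provider SDK call, and the consent store is assumed to be a simple mapping.

```python
from dataclasses import dataclass

@dataclass
class UserSession:
    user_id: str
    access_token: str   # the user's own delegated credential, not a service account

def fetch_user_resource(access_token: str, resource_id: str) -> str:
    # Hypothetical stand-in for the provider SDK call (e.g., downloading a file)
    # made with the user's token so the source system applies its own permissions.
    raise NotImplementedError

def retrieve_with_consent(session: UserSession, resource_id: str,
                          consent_store: dict[str, set[str]]) -> str | None:
    # 1. Confirm the user has explicitly consented to this resource being used.
    if resource_id not in consent_store.get(session.user_id, set()):
        return None
    # 2. Fetch at inference time with the user's credentials.
    return fetch_user_resource(session.access_token, resource_id)
```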

Depending on application requirements, data sources, authorization models, performance considerations, and environment, different approaches may be more suitable.

These tutorials also address risks from the OWASP Top 10 for LLMs and Generative AI Apps.

Attribution and accountability

When your AI application does not support your business (or worse, damages it), you need to know who did what, when, where, and in which context. To analyze undesirable outcomes, breaches, or threats, you must capture the essential state of the system, such as:

  • LLM model
  • Authentication events
  • Prompts and responses
  • Data vectors and ingestion events
  • Agent state and messages

Your audit trail should be tamper-proof, integrate seamlessly with existing monitoring systems, support automated log retrieval via APIs, and provide SDKs for easy integration.
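As one way to make logs tamper-evident, each record can include a hash of the previous record, so any later modification breaks the chain during verification. This is a minimal standard-library sketch of the idea, not the API of Pangea's Secure Audit Log or any other product.

```python
import hashlib
import json
import time

class HashChainedAuditLog:
    """Append-only log where each record carries the hash of the previous record,
    so any after-the-fact modification is detectable."""

    def __init__(self):
        self._records = []
        self._last_hash = "0" * 64

    def log(self, event_type: str, **fields) -> dict:
        record = {
            "timestamp": time.time(),
            "event_type": event_type,   # e.g. "prompt", "response", "auth", "ingestion"
            "fields": fields,           # e.g. model name, user id, truncated prompt text
            "prev_hash": self._last_hash,
        }
        record["hash"] = hashlib.sha256(
            json.dumps(record, sort_keys=True).encode()
        ).hexdigest()
        self._last_hash = record["hash"]
        self._records.append(record)
        return record

    def verify(self) -> bool:
        prev = "0" * 64
        for record in self._records:
            body = {k: v for k, v in record.items() if k != "hash"}
            digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
            if record["prev_hash"] != prev or digest != record["hash"]:
                return False
            prev = record["hash"]
        return True

# Example usage (placeholder values):
# log = HashChainedAuditLog()
# log.log("prompt", user_id="u-123", model="<model name>", text="<prompt text>")
# log.log("response", user_id="u-123", text="<response text>")
# assert log.verify()
```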

  • The Attribution and Accountability in LLM Apps tutorial explains how to log application and runtime event data during inference in a Python LangChain application using Pangea's Secure Audit Log.

    Capturing details about user interactions with your AI applications in a tamper-proof, centralized way enables threat analysis, helps you monitor application activity and reproduce undesirable behavior, and supports compliance with regulations.
