
Querying LLMs with Pangea and GitHub Copilot Chat

Michael Weinberger

In the ever-evolving landscape of AI code generation and developer assistance tools, providing users with accurate and relevant information quickly is crucial. To help ensure that developers can swiftly and easily add security features to their products, Pangea has been developing and improving our GitHub Copilot Extension. The extension lets developers use the GitHub Copilot Chat natural language interface to ask how to implement specific security features, and it responds with documentation and code samples they can use as a reference to quickly implement Pangea services in their applications.

When a developer asks a question about how to add a Pangea service to their project, such as “How do I add authentication to my Python project?”, the Pangea Copilot Extension applies a few techniques to ensure that the response is accurate and contains a valid code sample. Because LLMs have limited context windows, and because we can’t constantly retrain new models, we’ve embraced a powerful approach known as Retrieval-Augmented Generation (RAG) to enhance our LLM queries with accurate context. This method not only improves the efficiency of our AI assistant but also ensures users get precise and contextually relevant code examples. Let’s delve into how this process works and why it’s a game-changer for interacting with our SDKs.

Understanding the Basics: Embeddings

At the heart of our approach are embeddings. An embedding is essentially a numerical representation of text that captures its semantic meaning. In simpler terms, it translates pieces of text—such as code snippets or documentation—into vectors of numbers. These vectors are designed to reflect the underlying concepts and relationships within the text.

For instance, consider two different examples from our SDKs: one related to IP Intelligence and another related to Audit. Each of these examples will have its own unique embedding because the semantic content and context differ. This differentiation is crucial for accurately retrieving relevant information.
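To make this concrete, here is a minimal sketch of turning two SDK snippets into embedding vectors. This post doesn’t describe which embedding model or client the extension actually uses, so the OpenAI embeddings API, the model name, and the snippet text below are illustrative assumptions rather than Pangea’s real pipeline.

```python
# Minimal sketch: turn two SDK snippets into embedding vectors.
# The OpenAI client, model name, and snippet text are illustrative
# assumptions, not Pangea's actual embedding pipeline.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

ip_intel_doc = "IP Intelligence example: check the reputation of an IP address."
audit_doc = "Audit example: write a tamper-evident entry to the Secure Audit Log."

response = client.embeddings.create(
    model="text-embedding-3-small",  # illustrative model choice
    input=[ip_intel_doc, audit_doc],
)

ip_intel_vec = response.data[0].embedding  # a list of floats
audit_vec = response.data[1].embedding

# The two vectors differ because the semantic content of the texts differs.
print(len(ip_intel_vec), ip_intel_vec[:3])
print(len(audit_vec), audit_vec[:3])
```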

How Embeddings Enhance Retrieval-Augmented Generation

The magic of RAG lies in how we leverage these embeddings to make our extension’s responses more accurate. When a user poses a question, our system generates an embedding for that query. This embedding serves as a numeric representation of the question’s meaning.

Next, we calculate the mathematical distance between this query embedding and the embeddings of all available SDK examples. In simple terms, we’re measuring how closely related the question is to each piece of content we have.

The embeddings with the shortest distance to the query are considered the most relevant: the closer the distance, the more pertinent the SDK example is to the user’s question. By identifying these closest matches, we can use just that context in the LLM query and generate the most relevant examples for the user. This contextually driven retrieval significantly enhances the quality and relevance of the information provided.
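A common way to measure this distance is cosine distance. The sketch below ranks stored example embeddings by their cosine distance to the query embedding; the helper names are ours, and a production system would typically delegate this step to a vector database rather than compute it by hand.

```python
# Sketch of the nearest-neighbor step: rank example embeddings by cosine
# distance to the query embedding (smaller distance = more relevant).
import numpy as np

def cosine_distance(a, b) -> float:
    """1 - cosine similarity; smaller means more semantically similar."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def top_k_examples(query_vec, example_vecs, examples, k=3):
    """Return the k SDK examples whose embeddings sit closest to the query."""
    distances = [cosine_distance(query_vec, v) for v in example_vecs]
    ranked = sorted(zip(distances, examples), key=lambda pair: pair[0])
    return [example for _, example in ranked[:k]]
```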

Practical Application in Pangea SDKs

Here’s a practical look at how this process unfolds:

  1. Generating Embeddings: We start by creating embeddings for all the examples and documentation in our SDK. Each example, whether it’s a piece of code or a tutorial, is transformed into a vector that represents its semantic content.

  2. Handling User Queries: When a user asks a question, the extension generates an embedding for that query. This embedding acts as a query vector, capturing the essence of the user’s request.

  3. Calculating Relevance: We then compute the distance between the query embedding and each example embedding. This involves finding the closest matches based on vector proximity.

  4. Providing Contextual Answers: Finally, the most relevant SDK examples, as determined by their proximity to the query vector, are used as context in a prompt that is sent to the LLM for dynamic generation. This ensures that the information is not only accurate but also highly relevant to the user’s specific question. A sketch of this end-to-end flow follows the list.
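Putting the four steps together, here is one way the flow could look end to end. The embedding and chat models, the prompt wording, and the helper functions are illustrative assumptions, not Pangea’s actual implementation.

```python
# End-to-end sketch of the four steps above: embed the SDK examples, embed the
# user's question, rank by cosine distance, and prompt the LLM with the closest
# matches. Models, prompt wording, and helper names are illustrative assumptions.
import numpy as np
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def embed(texts: list[str]) -> list[np.ndarray]:
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return [np.asarray(item.embedding) for item in resp.data]

def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def answer(question: str, sdk_examples: list[str], k: int = 3) -> str:
    # 1. Embeddings for every SDK example (in practice, precomputed and stored).
    example_vecs = embed(sdk_examples)
    # 2. An embedding for the user's question.
    query_vec = embed([question])[0]
    # 3. Rank examples by proximity to the question and keep the closest k.
    ranked = sorted(zip(example_vecs, sdk_examples),
                    key=lambda pair: cosine_distance(query_vec, pair[0]))
    context = [example for _, example in ranked[:k]]
    # 4. Use the retrieved examples as context in the prompt sent to the LLM.
    prompt = (
        "Answer the developer's question using the Pangea SDK examples below.\n\n"
        + "\n\n".join(context)
        + f"\n\nQuestion: {question}"
    )
    chat = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": prompt}],
    )
    return chat.choices[0].message.content
```

In a real deployment, the example embeddings would be generated once ahead of time and stored in a vector index, so only the query embedding and the LLM call happen at request time.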

Why RAG Matters

Retrieval-Augmented Generation offers several key benefits:

  • Enhanced Accuracy: By focusing on the semantic similarity between queries and examples, we supply the LLM with contextually relevant content, which produces better responses.

  • Efficiency: Responses come back faster because the prompt contains only the context needed to answer the question.

  • Scale: Far more content can be indexed and made searchable than could ever fit in a single prompt, so each query draws on the specific material relevant to it rather than the same general content every time.

Looking Ahead

As we continue to refine our RAG approach, we aim to further enhance the accuracy and efficiency of our extension for better code examples and content. By continuously improving our embedding techniques and retrieval algorithms, we’re committed to providing users with the best possible support when interacting with the Pangea Copilot Extension.

To access the Pangea GitHub Copilot extension please visit: https://github.com/marketplace/pangea-cyber

To watch a demonstration video of the Pangea extension, please visit: https://l.pangea.cloud/TM0i8GO

To try Pangea or to learn more, sign up for a free developer account at https://pangea.cloud/

