Mining the Index: Uncovering Sensitive Data in Public ChatGPT Histories via Google Search

Joey Melo
Aug 1, 2025

Recent revelations have shed light on a new reality for ChatGPT users: Google has been actively indexing shared ChatGPT conversation histories. This means that shared discussions, potentially containing sensitive personal or professional information, are being made publicly discoverable through search engines.

The mechanism behind this unexpected indexing is surprisingly straightforward. When a user shares a ChatGPT conversation, a unique URL is generated. According to OpenAI, "shared links are not enabled to show up in public search results on the internet by default"; indexing requires ticking an opt-in "discoverable" checkbox in the share dialog. However, this crucial detail is often overlooked by users who are distracted or unaware of its implications, especially when the URL is intended for a limited audience.

The consequences of this indexing are significant, and the attack surface for privacy breaches only seems to grow.

  • Exposure of Personally Identifiable Information (PII): Users might unknowingly share details like names, addresses, phone numbers, or other PII during a ChatGPT interaction, which then become searchable.

  • Academic and Research Data: Students and researchers discussing sensitive topics, experiments, or unpublished findings with ChatGPT could find their work prematurely exposed.

  • Source Code Exposure: Developers or engineers discussing proprietary algorithms, software vulnerabilities, or internal code structures with ChatGPT could inadvertently make this information public.

  • Internal Infrastructure Exposure: Details about a company's network architecture, server configurations, or security protocols shared with ChatGPT could lead to significant security risks if exposed.

What has been exposed?

I got curious about what information had already been indexed on this topic, so I decided to dig in.
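Finding these conversations takes no specialized tooling: shared chats live under chatgpt.com/share/, so a Google "dork" scoped to that path surfaces whatever has been indexed. Below is a minimal sketch of the approach; the keyword list is a set of hypothetical examples, not the exact queries used in this research.

```python
from urllib.parse import quote_plus

# A minimal sketch of the discovery technique: Google "dorks" scoped to
# ChatGPT's shared-conversation URL space (https://chatgpt.com/share/<id>).
# The site: operator and the share path are real; the keywords below are
# hypothetical examples, not the queries used in this research.
KEYWORDS = [
    '"my phone number is"',                  # PII
    '"Traceback (most recent call last)"',   # Python stack traces
    '"internal use only"',                   # corporate material
]

for keyword in KEYWORDS:
    dork = f"site:chatgpt.com/share {keyword}"
    print(f"https://www.google.com/search?q={quote_plus(dork)}")
```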

Private conversations

Some conversations included mildly private information, such as home renovation ideas, recipes, or informal brainstorming.

One humorous exchange involved a user inquiring about microwaving a metal fork, eliciting a highly sarcastic response from ChatGPT.

Personally Identifiable Information

Many conversations contained sensitive personal details such as names, email addresses, phone numbers, and physical addresses. PII was especially common among users asking for resume-writing tips.
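To give a sense of how quickly such exposure can be triaged, here is a minimal sketch of scanning conversation text for common PII patterns. The regexes are deliberately simplistic illustrations and the sample string is invented; real detectors handle far more formats.

```python
import re

# A minimal sketch of triaging an exposed conversation for common PII.
# These patterns are deliberately simple illustrations, not production
# detectors (real email and phone formats are far more varied).
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "us_phone": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
}

def scan_for_pii(text: str) -> dict[str, list[str]]:
    """Return every match for each PII pattern found in the text."""
    return {name: pat.findall(text) for name, pat in PII_PATTERNS.items()}

sample = "Please tailor my resume. Reach me at jane.doe@example.com or 555-867-5309."
print(scan_for_pii(sample))
```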

Source code

Some developers shared their backend directory trees or other backend configuration.

Others revealed complete source code for scripts and automation.

Stack traces (internal systems information)

Using ChatGPT for debugging and error handling is common. However, users often paste errors without realizing that stack traces can reveal sensitive details about the underlying code and technology, such as internal file paths, hostnames, and library versions.
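One practical habit is scrubbing a trace before pasting it anywhere public. The sketch below assumes a few illustrative redaction rules (absolute paths, IP addresses, and an assumed ".internal" hostname convention); real traces can leak much more, so treat this as a starting point rather than a complete scrubber.

```python
import re

# A minimal sketch of scrubbing a stack trace before pasting it into a
# chatbot. The substitutions are illustrative; real traces may also leak
# usernames, internal package names, and service versions.
REDACTIONS = [
    (re.compile(r'File "/[^"]+/'), 'File ".../'),          # absolute file paths
    (re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"), "<ip>"),  # internal IP addresses
    (re.compile(r"\b[\w-]+\.internal\b"), "<host>"),       # assumed hostname convention
]

def sanitize(trace: str) -> str:
    for pattern, replacement in REDACTIONS:
        trace = pattern.sub(replacement, trace)
    return trace

raw = '''Traceback (most recent call last):
  File "/home/alice/acme-billing/app/payments.py", line 42, in charge
    gateway.connect("10.0.3.17")
ConnectionError: db01.internal refused the connection'''
print(sanitize(raw))
```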

Conclusion

This situation underscores a fundamental challenge in the rapidly evolving landscape of AI and online data: the often-blurred lines between private interaction and public discoverability. Users assume a certain level of privacy when interacting with AI models, and the indexing of shared histories by search engines creates a potential for unintended and unwelcome exposure.

Companies, in particular, could benefit from configurable guardrails, which can prevent the inadvertent exposure of sensitive company information, such as proprietary code, internal infrastructure details, and confidential project data, through AI conversations. This proactive approach is crucial for safeguarding intellectual property and maintaining a strong security posture in the age of widespread AI adoption.
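As a sketch of what such a guardrail might look like, the snippet below checks outbound prompts against company-defined deny-list patterns before they leave for an external model. The patterns and the blocking behavior are illustrative assumptions; production deployments typically live in a proxy or DLP layer with far richer detection.

```python
import re

# A minimal sketch of a configurable outbound guardrail: block prompts
# that match company-defined sensitive patterns before they are sent to
# an external AI service. The patterns and internal domain below are
# hypothetical examples.
BLOCKED_PATTERNS = [
    re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),  # key material
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),                      # AWS access key IDs
    re.compile(r"\binternal\.example\.com\b"),                # assumed internal domain
]

def guard_prompt(prompt: str) -> str:
    """Raise if the prompt matches a deny-list pattern; otherwise pass it through."""
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(prompt):
            raise ValueError(f"Prompt blocked by guardrail: matched {pattern.pattern!r}")
    return prompt  # safe to forward to the external model

try:
    guard_prompt("Why does ssh to internal.example.com time out?")
except ValueError as err:
    print(err)
```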

Note: At the time of writing, OpenAI has disabled the discoverability feature, and Google has stopped indexing shared conversation histories.
