Research

The problem is documented.
The incidents are real.

AI data leakage is not theoretical. It is happening at organizations you recognize, with consequences that are measurable and growing. Every incident and study below links to its original source.

Documented incidents
What happened. Who it happened to.

These are confirmed, publicly reported cases at recognized organizations: accidental exposures of sensitive data through AI tools, and the bans, restrictions, and ethics guidance they prompted. Each entry links to the original reporting.

Samsung April 2023

Engineers pasted semiconductor source code and meeting transcripts into ChatGPT

Three separate incidents in 20 days. Samsung banned all generative AI company-wide and launched disciplinary investigations. The source code pasted into ChatGPT became part of OpenAI's training data and cannot be retrieved.

Read original report: Gizmodo, April 6, 2023 →
JPMorgan Chase February 2023

Bank restricted ChatGPT globally over compliance concerns around sensitive client data

JPMorgan restricted ChatGPT for all global staff, citing compliance concerns about how third-party software handles sensitive client information. The move reflected the banking sector's recognition that employees were routinely feeding client data into public AI tools.

Read original report: CNN, February 22, 2023 →
US Legal Profession July 2024

ABA issued first formal ethics guidance requiring informed client consent before using AI with client data

The American Bar Association's Formal Opinion 512 found that entering client information into self-learning AI tools may waive attorney-client privilege, and that doing so requires explicit informed consent. Standard boilerplate in engagement letters is not sufficient.

Read original guidance: ABA Formal Opinion 512, July 29, 2024 →
CISA August 2025

Acting director of the US cybersecurity agency uploaded government documents marked For Official Use Only (FOUO) to the public version of ChatGPT

CISA's own automated monitoring systems flagged the uploads. A DHS internal review was launched. The incident demonstrates that even security-trained senior officials upload sensitive documents to public AI tools. Technical controls, not awareness training, are the only reliable safeguard.

Read original report: CSO Online, February 2026 →
The research
What the data says.

Independent research from credible organizations measuring the scale of AI data leakage across enterprise environments.

39.7%
of AI interactions
Involve sensitive data, up from 11% in 2023
Based on analysis of billions of real data movements across 222 companies. Frontier organizations interact with over 300 GenAI tools. Employees input sensitive data into AI tools on average once every three days.
Cyberhaven Labs AI Adoption and Risk Report 2026 →
68%
of organizations
Have experienced data leaks linked to AI tool use
Based on a survey of 404 CISOs and security leaders in the US and UK. Only 23% have formal security policies addressing AI-related data leakage, despite 90% expressing confidence in their security measures.
Metomic State of Data Security Report 2025 (via Security Magazine) →
97%
of AI-breached orgs
Lacked proper AI access controls at the time of their breach
IBM studied AI-specific security incidents for the first time in its annual report. 13% of organizations reported breaches of AI models or applications, 60% of those breaches led to compromised data, and one in five breached organizations suffered an incident caused by shadow AI.
IBM Cost of a Data Breach Report 2025 →
77%
of employees
Have pasted company information into AI tools
82% of that activity came through personal accounts that bypass enterprise controls entirely, placing confidential data outside security team visibility. ChatGPT accounts for 77% of all online LLM access.
LayerX Enterprise AI and SaaS Data Security Report 2025 (via eSecurity Planet) →
Sources and further reading

All claims on this site and in the accordion on the home page are drawn from the primary sources listed below. Every source is publicly available. Where a secondary source is cited, the underlying primary research is identified.

AI platform data practices
OpenAI. "How your data is used to improve model performance." Updated 2025. Confirms free and Plus tier conversations may be used for model training unless users opt out.
openai.com/policies/how-your-data-is-used-to-improve-model-performance/ →
OpenAI. Privacy Policy. Updated February 2026. Covers data retention, human review practices, and third-party data sharing.
openai.com/policies/row-privacy-policy/ →
Enterprise AI data exposure
Cyberhaven Labs. 2026 AI Adoption and Risk Report. February 5, 2026. Based on tracking 7 million workers across enterprise environments. Finds 39.7% of all AI interactions involve sensitive data, 32.3% of ChatGPT usage occurs on personal accounts, and 82% of widely used GenAI tools carry elevated risk.
cyberhaven.com/press-releases/cyberhaven-2026-ai-adoption-risk-report →
Cyberhaven Labs. Q2 2024 AI Adoption and Risk Report. May 2024. Based on 3 million workers. Finds 27.4% of corporate data put into AI is sensitive (up from 10.7% a year prior), and corporate data flowing into AI tools grew 485% between March 2023 and March 2024.
cyberhaven.com/blog/shadow-ai →
LayerX Security. Enterprise AI and SaaS Data Security Report 2025. Finds 18% of enterprise employees paste data into AI tools, with over 50% of those paste events including corporate information. Reported by eSecurity Planet, October 2025.
esecurityplanet.com/news/shadow-ai-chatgpt-dlp/ →
Breach costs and organizational risk
IBM. Cost of a Data Breach Report 2025. Finds breaches involving shadow AI average $4.63 million, $670,000 more than standard incidents. One in five breached organizations suffered an incident caused by shadow AI, and 97% of organizations that reported an AI-related breach lacked proper AI access controls.
ibm.com/reports/data-breach →
Metomic. State of Data Security Report 2025. Surveys 404 CISOs and security leaders in the US and UK. Finds 68% of organizations have experienced data leaks linked to AI tool use, and only 23% have formal security policies addressing AI-related leakage. Reported by Security Magazine.
securitymagazine.com →
Legal and professional obligations
American Bar Association. Formal Opinion 512 on Generative AI. July 29, 2024. Requires informed client consent before using confidential client information in self-learning AI tools. Finds standard boilerplate engagement letters are insufficient. ABA Model Rule 1.1 requires lawyers to understand the data handling of any AI tool they use.
americanbar.org, Formal Opinion 512 →
Documented platform incidents
wald.ai. "ChatGPT Data Leaks and Security Incidents (2023-2026)." February 2026. Comprehensive timeline of documented incidents including the March 2023 Redis bug (chat histories exposed), 2025 share-link indexing exposure, and third-party credential theft affecting over 225,000 accounts.
wald.ai/blog/chatgpt-data-leaks-and-security-incidents →