Illuminating the Dark Corners of AI: Extracting Private Data from AI Models and Vector Embeddings


Presented at DEF CON 33

This talk explores the hidden risks in applications built on modern AI systems, especially those using large language models (LLMs) and retrieval-augmented generation (RAG) workflows. We demonstrate how sensitive data, such as personally identifiable information (PII) and Social Security numbers, can be extracted through real-world attacks, including model inversion attacks against fine-tuned models and embedding inversion attacks on vector databases. The goal is to show how PII scanning tools fail to recognize the rich data that lives in these systems, and how much of a privacy disaster these AI ecosystems really are.
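To see why PII scanners miss data stored in vector form, consider a toy illustration (not taken from the talk): `toy_embed` is a hypothetical stand-in for a real embedding model, used only to show that once text is encoded as a float vector, a regex-based scanner has nothing to match against.

```python
import re

# Hypothetical toy "embedding": a real system would call an embedding model
# (e.g. a sentence-transformer); here we just map bytes to floats to show
# that the stored representation is opaque to pattern matching.
def toy_embed(text: str) -> list[float]:
    return [b / 255.0 for b in text.encode("utf-8")]

SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

record = "Patient John Doe, SSN 078-05-1120, seen 2024-01-15."
vector = toy_embed(record)

# A regex-based PII scanner flags the raw text...
print(bool(SSN_PATTERN.search(record)))       # True
# ...but sees nothing in the serialized vector stored in the database,
# even though the original text is fully recoverable from it.
print(bool(SSN_PATTERN.search(str(vector))))  # False
```

Real embeddings are lossy rather than trivially reversible, but embedding inversion research shows that much of the original text, including PII, can still be reconstructed from them.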