RAG Architecture and Data Exposure Risks
This demonstration walks through building a customer-facing chatbot using AWS Bedrock and Anthropic's Claude model with Retrieval-Augmented Generation (RAG). The fictional SafeMarch Home Appliances company uses RAG to augment LLM responses with product documentation stored in S3 buckets, enabling the chatbot to answer domain-specific questions about home appliances. However, the demo reveals a critical security gap: when a knowledge base inadvertently includes sensitive documents alongside the intended content, users can craft prompts that extract confidential information. In this case, the chatbot surfaces board resolutions detailing a $100 million acquisition plan, which the model incorrectly attributes to the user manual.
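The RAG flow described above maps onto Bedrock's knowledge-base query API. The following is a minimal sketch of how such a chatbot might shape its request; the knowledge base ID, model ARN, and question are placeholders for illustration, not values from the demo:

```python
def build_rag_request(question: str, knowledge_base_id: str, model_arn: str) -> dict:
    """Shape a request payload for the Bedrock retrieve-and-generate API.

    Bedrock retrieves relevant chunks from the knowledge base (backed by
    documents in S3) and passes them to the model as grounding context.
    """
    return {
        "input": {"text": question},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,
                "modelArn": model_arn,
            },
        },
    }

# Placeholder identifiers; a real call would send this payload via
#   boto3.client("bedrock-agent-runtime").retrieve_and_generate(**payload)
payload = build_rag_request(
    "How do I descale the SafeMarch K200 kettle?",
    knowledge_base_id="KB123EXAMPLE",
    model_arn="arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0",
)
```

Note that the retrieval step has no notion of intent: any document indexed into the knowledge base is fair game for grounding a response, which is exactly why a misplaced sensitive file becomes extractable through ordinary prompts.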
DSPM Detection and Remediation Workflow
Zscaler DSPM's AI-SPM capabilities provide visibility into which AI models can access sensitive data and through what paths. The dashboard surfaces exposure by model and data classification type, allowing security teams to trace the access path from AWS Bedrock knowledge bases through to specific S3 objects containing regulated data like GLBA-classified financial statements. Beyond reactive investigation, DSPM generates proactive alerts when knowledge bases contain sensitive data, describing potential threats and enabling automated remediation through Jira tickets, ServiceNow integration, or custom workflows. The presenter previews upcoming coverage for shadow AI scenarios where organizations deploy AI workloads outside managed services.
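The triage logic described above can be sketched as a simple classification check over the objects a knowledge base ingests. This is an illustrative approximation, not Zscaler DSPM's actual schema or API; the record shapes and the `REGULATED` set are assumptions:

```python
# Hypothetical regulated-data classifications a DSPM scan might assign.
REGULATED = {"GLBA", "PCI", "PII", "PHI"}

def flag_exposed_objects(kb_objects: list[dict]) -> list[dict]:
    """Return alert records for knowledge-base S3 objects carrying regulated
    classifications, i.e. data an AI model should not be able to retrieve."""
    alerts = []
    for obj in kb_objects:
        hits = REGULATED & set(obj.get("classifications", []))
        if hits:
            alerts.append({
                "s3_key": obj["s3_key"],
                "classifications": sorted(hits),
                "recommended_action": "remove from knowledge base and open remediation ticket",
            })
    return alerts

# Illustrative inventory mirroring the demo scenario: a benign manual plus a
# GLBA-classified board resolution sitting in the same knowledge base.
alerts = flag_exposed_objects([
    {"s3_key": "docs/k200-user-manual.pdf", "classifications": []},
    {"s3_key": "docs/board-resolution-acquisition.pdf", "classifications": ["GLBA"]},
])
# alerts contains a single entry, for the board-resolution document
```

In practice each alert would feed the remediation integrations mentioned above (a Jira ticket, a ServiceNow record, or a custom workflow) rather than a plain list.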