Data security has shifted from a secondary concern to the primary objective of cybersecurity programs. Eighty-two percent of breaches involve data stored in the cloud, and the explosion of SaaS applications, cloud storage, and AI training datasets means sensitive data is scattered across more locations than ever. Meanwhile, over 80% of the global population is now covered by data privacy law, with 8 new US state privacy laws taking effect in 2025 alone and the Texas Attorney General securing a $1.375 billion settlement with Google, one of the largest state privacy enforcement outcomes to date.
Data Security Posture Management (DSPM) has emerged as the technology category addressing this challenge, with the market projected to grow at a 34% CAGR through 2034. This guide covers how to discover, classify, protect, and monitor sensitive data across cloud, SaaS, and on-premises environments.
Where Data Security Gets Hard
Organizations face three interconnected challenges when it comes to protecting their data.
You Don’t Know Where Your Data Is
Shadow data, meaning copies, exports, and derivatives of sensitive data that exist outside managed repositories, grows constantly. Developers copy production data to staging environments for testing. Analysts download sensitive datasets to local machines or personal cloud storage. SaaS applications create data stores that IT doesn’t manage. AI training pipelines aggregate data from multiple sources without classification, and backup systems create additional copies of sensitive data across storage tiers.
Traditional DLP Falls Short
Legacy DLP focused on known data flows (email, web upload, USB) and relied on predefined patterns like regex for credit card numbers. This approach fails when data sits in cloud storage you don’t monitor, when data formats are unstructured (documents, images, chat messages), when data is processed by AI systems that create derivatives, or when employees use approved SaaS tools in unapproved ways.
Privacy Regulations Keep Multiplying
Every new regulation adds requirements for data discovery, classification, access control, retention, and deletion:
| Regulation | Scope | Key Requirements |
|---|---|---|
| GDPR | EU/EEA personal data | Lawful basis, data minimization, right to erasure, 72-hour breach notification |
| CCPA/CPRA | California consumers | Right to know, delete, opt-out of sale, data minimization |
| State privacy laws | 20+ US states by 2026 | Varying requirements for consent, access, deletion, and data protection assessments |
| HIPAA | US health information | Security Rule, Privacy Rule, breach notification, BAAs |
| PCI DSS 4.0 | Payment card data | Encryption, access controls, logging, vulnerability management |
| EU NIS2 | Essential and important entities | Risk management, incident reporting, supply chain security |
| EU CRA | Products with digital elements | Security by design, vulnerability handling, SBOM requirements |
Data Security Posture Management (DSPM)
DSPM provides automated discovery, classification, and risk assessment of sensitive data across cloud environments. Since May 2023, seven DSPM startups have been acquired by major security vendors (IBM, Rubrik, Palo Alto Networks, CrowdStrike, Tenable, Netskope, and Proofpoint), signaling that DSPM is becoming an expected platform capability rather than a standalone category.
How DSPM Works
DSPM platforms start with discovery, using agentless scanning of cloud storage (S3, Azure Blob, GCS), databases, SaaS applications, and data warehouses to find all data stores. Next comes classification, where automated systems categorize data by sensitivity, type (PII, PHI, PCI, credentials, IP), and regulatory applicability. Risk assessment identifies misconfigurations, excessive access, unencrypted storage, and compliance violations. Finally, monitoring continuously tracks data access patterns, movement, and exposure changes.
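The risk assessment step can be illustrated with a small sketch. The `DataStore` fields, the `SENSITIVE_TYPES` set, and the `max_principals` threshold of 50 are hypothetical choices made for illustration, not the data model of any DSPM product. A real platform would populate an inventory like this from its own discovery and classification scans.

```python
from dataclasses import dataclass

# Hypothetical inventory record; discovery and classification would fill this in.
@dataclass
class DataStore:
    name: str
    data_types: frozenset          # classification output, e.g. {"PII", "PCI"}
    encrypted_at_rest: bool
    publicly_accessible: bool
    principals_with_access: int    # users, roles, and service accounts with read access

SENSITIVE_TYPES = frozenset({"PII", "PHI", "PCI", "credentials", "IP"})

def assess(store: DataStore, max_principals: int = 50) -> list:
    """Return risk findings for one data store (empty if it holds nothing sensitive)."""
    if not store.data_types & SENSITIVE_TYPES:
        return []
    findings = []
    if store.publicly_accessible:
        findings.append("sensitive data is publicly accessible")
    if not store.encrypted_at_rest:
        findings.append("sensitive data is unencrypted at rest")
    if store.principals_with_access > max_principals:
        findings.append("excessive access to sensitive data")
    return findings

# Example: a staging export that was copied out of production.
staging_copy = DataStore(
    name="staging-db-export",
    data_types=frozenset({"PII"}),
    encrypted_at_rest=False,
    publicly_accessible=True,
    principals_with_access=12,
)
for finding in assess(staging_copy):
    print(f"{staging_copy.name}: {finding}")
```

Keeping the checks as plain rules over an inventory makes it easy to rerun them whenever discovery finds a new store, which is what the monitoring step relies on.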
DSPM Vendor Landscape
| Vendor | Strength | Context |
|---|---|---|
| Cyera | Largest standalone DSPM, seeking $400M at $9B valuation | Acquired Trail Security ($162M) for DLP expansion |
| Varonis | Deepest SaaS and on-premises coverage, access governance | Top Forrester data security platform scores |
| Wiz (Google) | CNAPP context correlates data exposure with attack paths | Part of the $32B Google acquisition |
| Securiti | AI-driven data intelligence with privacy automation | Multi-cloud and SaaS data discovery |
If you need CNAPP integration with data exposure analysis, look at Wiz. For fast discovery across hybrid environments, Cyera fits well. Deep access governance and SaaS coverage point to Varonis. And if you need privacy automation alongside data security, Securiti covers that ground.
Data Classification
Effective data security requires knowing what data you have and how sensitive it is. Manual classification cannot keep up with enterprise data volumes, so automated classification is the only viable approach.
Classification Framework
| Level | Label | Examples | Required Controls |
|---|---|---|---|
| 1 | Public | Marketing materials, public docs | Basic access controls |
| 2 | Internal | Business documents, internal communications | Authentication, no public sharing |
| 3 | Confidential | Customer PII, financial records, HR data | Encryption, access logging, DLP, retention controls |
| 4 | Restricted | Credentials, encryption keys, trade secrets, PHI | HSM/vault storage, strict access control, audit trails, data loss prevention |
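One way to make a framework like this enforceable is to encode it as data and compare it against the controls actually applied to a store. The sketch below is an assumption-laden illustration: the control identifiers are hypothetical names derived from the table, and each level is treated as requiring only the controls listed in its own row.

```python
from enum import IntEnum

class Level(IntEnum):
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    RESTRICTED = 4

# Hypothetical control identifiers, one set per row of the table above.
REQUIRED_CONTROLS = {
    Level.PUBLIC: {"basic_access_controls"},
    Level.INTERNAL: {"authentication", "no_public_sharing"},
    Level.CONFIDENTIAL: {"encryption", "access_logging", "dlp", "retention_controls"},
    Level.RESTRICTED: {"vault_storage", "strict_access_control", "audit_trails", "dlp"},
}

def missing_controls(level: Level, applied: set) -> set:
    """Return the required controls that are not yet applied at this level."""
    return REQUIRED_CONTROLS[level] - set(applied)

# Example: a Confidential store that has encryption and DLP but nothing else.
gaps = missing_controls(Level.CONFIDENTIAL, {"encryption", "dlp"})
print(sorted(gaps))
```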
Automated Classification Approaches
Pattern-based classification uses regex and keyword matching for structured data like credit card numbers, SSNs, and email addresses. NLP-based classification applies natural language processing for unstructured data such as contracts, medical records, and legal documents. ML-based classification trains machine learning classifiers on labeled datasets for complex classification decisions. Context-based classification looks at location, access patterns, and metadata, so a file in the HR folder is likely HR-sensitive.
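As a rough sketch of the pattern-based approach, the example below combines regular expressions with a Luhn checksum so that arbitrary 16-digit strings are not flagged as card numbers. The regexes are deliberately simplified and the label names (`PCI`, `PII`) are assumptions for illustration; production classifiers use far more patterns and validation.

```python
import re

# Simplified patterns for illustration only.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")   # 13 to 19 digits, optional separators
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

def luhn_valid(candidate: str) -> bool:
    """Luhn checksum: double every second digit from the right and sum the results."""
    digits = [int(ch) for ch in candidate if ch.isdigit()]
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0

def classify(text: str) -> set:
    """Return the set of sensitivity labels whose patterns appear in the text."""
    labels = set()
    if any(luhn_valid(match.group()) for match in CARD_RE.finditer(text)):
        labels.add("PCI")
    if SSN_RE.search(text) or EMAIL_RE.search(text):
        labels.add("PII")
    return labels

print(classify("Card 4111 1111 1111 1111, contact jane@example.com"))
```

The checksum step is what keeps order IDs and other long numeric strings from generating false positives, a point that matters again once DLP policies start acting on these labels.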
Cloud providers offer native classification tools worth considering. Amazon Macie handles automated S3 data discovery and classification. Microsoft Purview (formerly Azure Purview) provides cross-cloud data governance and classification. Google Cloud’s Sensitive Data Protection (which includes the Cloud DLP API) offers content inspection and classification for structured and unstructured data.
Making Classification Work
Classify data at creation or ingestion, because classification becomes harder as data proliferates. Apply classification labels that persist with the data using tools like Microsoft Purview Information Protection (formerly Microsoft Information Protection) or Google DLP. Reassess classification periodically, since data sensitivity can change over time. And integrate classification into CI/CD pipelines for developer-created data stores.
Data Loss Prevention (DLP)
DLP enforces data handling policies by monitoring, detecting, and blocking unauthorized data movement.
DLP Deployment Points
| Layer | What It Protects | Examples |
|---|---|---|
| Endpoint DLP | Local copies, USB transfers, screenshots, clipboard | Microsoft Purview DLP, Symantec DLP, Forcepoint |
| Network DLP | Outbound transfers, email attachments, web uploads | Zscaler DLP, Netskope DLP, Palo Alto DLP |
| Cloud DLP | Cloud storage, SaaS applications, databases | Amazon Macie, Microsoft Purview, Google Cloud Sensitive Data Protection, CASB-based DLP |
| Email DLP | Outbound email with sensitive data | Proofpoint DLP, Mimecast, Microsoft DLP |
DLP Implementation Strategy
Start in monitor-only mode to observe data flows and refine policies before blocking. Focus on high-sensitivity data first, particularly credentials, payment card data, and health records. Tune aggressively because DLP false positives cause users to find workarounds, undermining the program. Integrate with classification so DLP policies reference classification labels, not just content patterns. And measure effectiveness by tracking incidents prevented, false positive rate, and policy exception requests.
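The monitor-first rollout and the effectiveness metrics can be sketched as follows. This is a minimal illustration, not the policy model of any DLP product: the `Policy` fields, the channel names, and the decision strings are hypothetical, and "false positive rate" is taken here to mean the share of reviewed alerts that analysts marked as false alarms.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Policy:
    label: str                   # classification label the policy applies to
    blocked_channels: frozenset  # e.g. {"usb", "personal_cloud"}
    mode: str = "monitor"        # "monitor" only logs; "block" stops the transfer

def evaluate(policy: Policy, label: str, channel: str) -> str:
    """Return 'allow', 'log', or 'block' for one attempted data movement."""
    violation = label == policy.label and channel in policy.blocked_channels
    if not violation:
        return "allow"
    return "block" if policy.mode == "block" else "log"

def false_positive_rate(reviewed_alerts: list) -> float:
    """reviewed_alerts holds True for confirmed incidents and False for false alarms."""
    if not reviewed_alerts:
        return 0.0
    return reviewed_alerts.count(False) / len(reviewed_alerts)

# Start in monitor mode, then switch the same policy to block once it is tuned.
policy = Policy(label="Restricted", blocked_channels=frozenset({"usb", "personal_cloud"}))
print(evaluate(policy, "Restricted", "usb"))
print(false_positive_rate([True, False, False, True]))
```

Because the policy keys on the classification label rather than on content patterns, tightening the classifier improves every DLP policy that references that label.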
Privacy Engineering
Data Minimization
Collect and retain only the data you need. Audit data collection across all applications to understand what is collected versus what is actually used. Implement purpose limitation, so that data collected for one purpose is not repurposed without consent. Set retention policies and enforce automatic deletion, because data that doesn’t exist can’t be breached. Use anonymization or pseudonymization for analytics and testing instead of real data.
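Pseudonymization for test and analytics datasets can be sketched with a keyed hash from the Python standard library. The key value and record fields below are placeholders, and the choice of HMAC-SHA256 is one reasonable option rather than a prescribed method. Note that pseudonymized data generally still counts as personal data under GDPR, because whoever holds the key can re-link it.

```python
import hashlib
import hmac

def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed, deterministic token (HMAC-SHA256)."""
    normalized = value.strip().lower().encode("utf-8")
    return hmac.new(secret_key, normalized, hashlib.sha256).hexdigest()

# Placeholder key for illustration; load real keys from a secrets manager.
key = b"example-key-do-not-use"

record = {"email": "Jane.Doe@example.com", "plan": "pro"}
test_record = {**record, "email": pseudonymize(record["email"], key)}
print(test_record)
```

Because the token is deterministic for a given key, joins across tables still work in the test dataset, while the raw identifier never leaves production.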
Privacy by Design
Integrate privacy into system architecture from the beginning. Implement granular consent collection and enforcement for personal data. Build automated processes for access requests, deletion requests, and data portability. Conduct privacy impact assessments (PIAs) for new systems processing personal data. For international data transfers, rely on an adequacy decision where one covers the destination country; otherwise, implement appropriate safeguards such as Standard Contractual Clauses or Binding Corporate Rules.
US State Privacy Law Compliance
With 20+ state privacy laws in effect or taking effect by 2026, organizations need a scalable approach. Map to the strictest requirements because if you comply with CPRA (California), you likely meet most other states’ requirements. Implement universal opt-out mechanisms by supporting the Global Privacy Control (GPC) signal. Maintain a data processing inventory documenting what data you collect, why, where it’s stored, and who has access. Automate data subject requests since manual processing doesn’t scale as regulations multiply. And monitor for new laws because state privacy legislation is introduced every legislative session.
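Honoring the Global Privacy Control signal on the server side can be sketched as below. The GPC specification conveys the signal through the `Sec-GPC: 1` request header; everything else here, including the purpose names and the decision to suppress sale and targeted advertising, is a hypothetical illustration that should be checked against the laws that apply to you.

```python
def gpc_opt_out(headers: dict) -> bool:
    """True if the request carries the Global Privacy Control signal (Sec-GPC: 1)."""
    normalized = {name.lower(): str(value).strip() for name, value in headers.items()}
    return normalized.get("sec-gpc") == "1"

# Hypothetical purpose names that an opt-out should suppress.
OPT_OUT_PURPOSES = {"sale", "targeted_ads"}

def allowed_purposes(headers: dict, requested: set) -> set:
    """Drop opt-out purposes from the requested set when GPC is present."""
    if gpc_opt_out(headers):
        return set(requested) - OPT_OUT_PURPOSES
    return set(requested)

request_headers = {"User-Agent": "example", "Sec-GPC": "1"}
print(sorted(allowed_purposes(request_headers, {"analytics", "sale", "targeted_ads"})))
```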
Encryption and Key Management
Encryption Strategy
| Data State | Encryption Method | Key Consideration |
|---|---|---|
| At rest | AES-256, cloud-native encryption | Who controls the encryption keys? (Service-managed vs. customer-managed) |
| In transit | TLS 1.3 | Disable TLS 1.0/1.1; enforce certificate validation |
| In use | Confidential computing, homomorphic encryption | Emerging technology, use for highest-sensitivity workloads |
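For the in-transit row, a client-side sketch using Python’s standard `ssl` module shows how a minimum protocol version and certificate validation can be enforced. The function name is hypothetical, and defaulting to TLS 1.3 follows the table above; passing `ssl.TLSVersion.TLSv1_2` may be necessary for peers that do not yet support 1.3.

```python
import ssl

def strict_client_context(minimum=ssl.TLSVersion.TLSv1_3) -> ssl.SSLContext:
    """Build a client TLS context that refuses protocol versions below `minimum`."""
    # create_default_context() enables certificate validation and hostname checks.
    context = ssl.create_default_context()
    context.minimum_version = minimum
    return context

context = strict_client_context()
print(context.minimum_version, context.verify_mode, context.check_hostname)
```

A context built this way can be passed to `http.client.HTTPSConnection` or `urllib.request.urlopen` through their `context` parameter.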
Key Management
Use Hardware Security Modules (HSMs) for the most sensitive keys, such as root CA and payment processing keys. Use cloud KMS (AWS KMS, Azure Key Vault, GCP Cloud KMS) with customer-managed keys for cloud data. Implement key rotation on a defined schedule, annually at minimum for encryption keys. Separate key management from data storage, so that the team managing keys does not also have access to the encrypted data. Maintain key escrow procedures for business continuity, because lost keys mean lost data.
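A rotation schedule is only useful if something checks it. The sketch below flags keys older than a maximum age, assuming a hypothetical inventory that maps key names to creation timestamps; in practice that inventory would come from your KMS, and managed services can often rotate keys automatically.

```python
from datetime import datetime, timedelta, timezone

def keys_due_for_rotation(key_created_at: dict, max_age_days: int = 365, now=None) -> list:
    """Return the names of keys whose age exceeds max_age_days, sorted by name."""
    now = now or datetime.now(timezone.utc)
    limit = timedelta(days=max_age_days)
    return sorted(name for name, created in key_created_at.items() if now - created > limit)

# Hypothetical key inventory; timestamps are timezone-aware UTC.
inventory = {
    "db-encryption-key": datetime(2024, 1, 15, tzinfo=timezone.utc),
    "app-signing-key": datetime(2025, 3, 1, tzinfo=timezone.utc),
}
as_of = datetime(2025, 6, 1, tzinfo=timezone.utc)
print(keys_due_for_rotation(inventory, now=as_of))
```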
Getting Started
Start by discovering your data. Deploy DSPM or cloud-native classification tools to find sensitive data across cloud, SaaS, and on-premises environments. Then classify by sensitivity using automated classification with a 4-level framework. Assess risk by identifying unencrypted sensitive data, excessive access, and misconfigured storage. Implement DLP starting in monitor mode on the highest-sensitivity data categories. Enforce encryption using customer-managed keys for confidential and restricted data. Automate privacy by implementing consent management, data subject request automation, and retention enforcement. Finally, run DSPM and DLP continuously rather than as periodic scans, since data exposure changes whenever data moves or access is modified.