Data Security and Privacy: DSPM, Classification, and Compliance

Data security has shifted from a secondary concern to the primary objective of cybersecurity programs. Eighty-two percent of breaches involve data stored in the cloud, and the explosion of SaaS applications, cloud storage, and AI training datasets means sensitive data is scattered across more locations than ever. Meanwhile, over 80% of the global population is now covered by data privacy law, with 8 new US state privacy laws taking effect in 2025 alone and the Texas Attorney General securing a $1.375 billion settlement from Google, one of the largest privacy settlements ever obtained by a single state.

Data Security Posture Management (DSPM) has emerged as the technology category addressing this challenge, with the market projected to grow at a 34% CAGR through 2034. This guide covers how to discover, classify, protect, and monitor sensitive data across cloud, SaaS, and on-premises environments.

Where Data Security Gets Hard

Organizations face three interconnected challenges when it comes to protecting their data.

You Don’t Know Where Your Data Is

Shadow data, meaning copies, exports, and derivatives of sensitive data that exist outside managed repositories, grows constantly. Developers copy production data to staging environments for testing. Analysts download sensitive datasets to local machines or personal cloud storage. SaaS applications create data stores that IT doesn’t manage. AI training pipelines aggregate data from multiple sources without classification, and backup systems create additional copies of sensitive data across storage tiers.

Traditional DLP Falls Short

Legacy DLP focused on known data flows (email, web upload, USB) and relied on predefined patterns like regex for credit card numbers. This approach fails when data sits in cloud storage you don’t monitor, when data formats are unstructured (documents, images, chat messages), when data is processed by AI systems that create derivatives, or when employees use approved SaaS tools in unapproved ways.

Privacy Regulations Keep Multiplying

Every new regulation adds requirements for data discovery, classification, access control, retention, and deletion:

Regulation	Scope	Key Requirements
GDPR	EU/EEA personal data	Lawful basis, data minimization, right to erasure, 72-hour breach notification
CCPA/CPRA	California consumers	Right to know, delete, opt-out of sale, data minimization
State privacy laws	20+ US states by 2026	Varying requirements for consent, access, deletion, and data protection assessments
HIPAA	US health information	Security Rule, Privacy Rule, breach notification, BAAs
PCI DSS 4.0	Payment card data	Encryption, access controls, logging, vulnerability management
EU NIS2	Essential and important entities	Risk management, incident reporting, supply chain security
EU CRA	Products with digital elements	Security by design, vulnerability handling, SBOM requirements

Data Security Posture Management (DSPM)

DSPM provides automated discovery, classification, and risk assessment of sensitive data across cloud environments. Since May 2023, seven DSPM startups have been acquired by major security vendors (IBM, Rubrik, Palo Alto Networks, CrowdStrike, Tenable, Netskope, and Proofpoint), signaling that DSPM is becoming an expected platform capability rather than a standalone category.

How DSPM Works

DSPM platforms start with discovery, using agentless scanning of cloud storage (S3, Azure Blob, GCS), databases, SaaS applications, and data warehouses to find all data stores. Next comes classification, where automated systems categorize data by sensitivity, type (PII, PHI, PCI, credentials, IP), and regulatory applicability. Risk assessment identifies misconfigurations, excessive access, unencrypted storage, and compliance violations. Finally, monitoring continuously tracks data access patterns, movement, and exposure changes.
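The risk assessment stage can be illustrated with a small sketch. It assumes discovery and classification have already produced an inventory; the DataStore fields, the store names, and the 25-principal threshold for "excessive access" are all hypothetical and do not reflect any vendor's data model.

```python
from dataclasses import dataclass, field


@dataclass
class DataStore:
    """One discovered data store. All fields are hypothetical inventory attributes."""
    name: str
    data_types: set[str] = field(default_factory=set)  # e.g. {"PII", "PCI"}
    encrypted: bool = True
    publicly_accessible: bool = False
    principals_with_access: int = 0


SENSITIVE_TYPES = {"PII", "PHI", "PCI", "credentials", "IP"}
MAX_PRINCIPALS = 25  # assumed threshold for "excessive access"


def assess(store: DataStore) -> list[str]:
    """Return risk findings for a store; stores without sensitive data yield none."""
    if not (store.data_types & SENSITIVE_TYPES):
        return []
    findings = []
    if not store.encrypted:
        findings.append("unencrypted sensitive data")
    if store.publicly_accessible:
        findings.append("sensitive data publicly accessible")
    if store.principals_with_access > MAX_PRINCIPALS:
        findings.append("excessive access")
    return findings


# Hypothetical inventory, as it might look after discovery and classification.
inventory = [
    DataStore("s3://prod-exports", {"PII"}, encrypted=False, publicly_accessible=True),
    DataStore("gcs://marketing-assets", set(), encrypted=False, publicly_accessible=True),
    DataStore("warehouse.billing", {"PCI"}, principals_with_access=140),
]

# Keep only stores that have at least one finding.
report = {s.name: assess(s) for s in inventory if assess(s)}
```

The public marketing bucket produces no findings because it holds no sensitive data types, which is the point of classifying before assessing risk: the same misconfiguration matters only where sensitive data is present.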

DSPM Vendor Landscape

Vendor	Strength	Context
Cyera	Largest standalone DSPM, seeking $400M at $9B valuation	Acquired Trail Security ($162M) for DLP expansion
Varonis	Deepest SaaS and on-premises coverage, access governance	Top Forrester data security platform scores
Wiz (Google)	CNAPP context correlates data exposure with attack paths	Part of the $32B Google acquisition
Securiti	AI-driven data intelligence with privacy automation	Multi-cloud and SaaS data discovery

If you need CNAPP integration with data exposure analysis, look at Wiz. For fast discovery across hybrid environments, Cyera fits well. A need for deep access governance and SaaS coverage points to Varonis. And if you need privacy automation alongside data security, Securiti covers that ground.

Data Classification

Effective data security requires knowing what data you have and how sensitive it is. Manual classification doesn’t scale, so automated classification is the only viable approach at enterprise scale.

Classification Framework

Level	Label	Examples	Required Controls
1	Public	Marketing materials, public docs	Basic access controls
2	Internal	Business documents, internal communications	Authentication, no public sharing
3	Confidential	Customer PII, financial records, HR data	Encryption, access logging, DLP, retention controls
4	Restricted	Credentials, encryption keys, trade secrets, PHI	HSM/vault storage, strict access control, audit trails, data loss prevention
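One way to make the framework enforceable is to encode it as data and check each store against it. The sketch below is illustrative only: the control identifiers are hypothetical shorthand for the "Required Controls" column, and each level lists just the controls named in its own row.

```python
from enum import IntEnum


class Level(IntEnum):
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    RESTRICTED = 4


# Hypothetical control identifiers mirroring the table's "Required Controls" column.
REQUIRED_CONTROLS = {
    Level.PUBLIC: {"basic_access_control"},
    Level.INTERNAL: {"authentication", "no_public_sharing"},
    Level.CONFIDENTIAL: {"encryption", "access_logging", "dlp", "retention"},
    Level.RESTRICTED: {"vault_storage", "strict_access_control", "audit_trail", "dlp"},
}


def missing_controls(level: Level, applied: set[str]) -> set[str]:
    """Controls required for the level that are not yet applied."""
    return REQUIRED_CONTROLS[level] - applied


# A Confidential store that has encryption and DLP but nothing else.
gaps = missing_controls(Level.CONFIDENTIAL, {"encryption", "dlp"})
```

Here gaps contains access_logging and retention, the two Confidential-level controls the store still lacks.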

Automated Classification Approaches

Pattern-based classification uses regex and keyword matching for structured data like credit card numbers, SSNs, and email addresses. NLP-based classification applies natural language processing for unstructured data such as contracts, medical records, and legal documents. ML-based classification trains machine learning classifiers on labeled datasets for complex classification decisions. Context-based classification looks at location, access patterns, and metadata, so a file in the HR folder is likely HR-sensitive.
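A minimal sketch of the pattern-based approach follows. The regular expressions and label names are simplified assumptions for illustration, not production-grade detectors. The Luhn checksum is added to cut false positives on card-like digit runs, which regex alone cannot do.

```python
import re

# Simplified, illustrative patterns; real detectors need more context and validation.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")   # 13 to 19 digits, optional separators
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")


def luhn_valid(digits: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, check mod 10."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def classify_text(text: str) -> set[str]:
    """Return the set of (hypothetical) labels whose patterns appear in the text."""
    labels = set()
    for match in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            labels.add("PCI")
    if SSN_RE.search(text):
        labels.add("PII:ssn")
    if EMAIL_RE.search(text):
        labels.add("PII:email")
    return labels


labels = classify_text("Card 4111 1111 1111 1111, contact jane@example.com")
```

The card number used here is a widely published test number that passes the Luhn check. A sixteen-digit order number that fails the checksum would not be labeled PCI, which is the kind of tuning that keeps false positives down.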

Cloud providers offer native classification tools worth considering. Amazon Macie handles automated S3 data discovery and classification. Microsoft Purview (formerly Azure Purview) provides cross-cloud data governance and classification. Google Cloud Sensitive Data Protection, which includes the Cloud DLP API, offers content inspection and classification for structured and unstructured data.

Making Classification Work

Classify data at creation or ingestion, because classification becomes harder as data proliferates. Apply classification labels that persist with the data using tools like Microsoft Purview Information Protection or Google DLP. Reassess classification periodically since data sensitivity can change over time. And integrate classification into CI/CD pipelines for developer-created data stores.
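For the CI/CD point, one possible shape is a pipeline step that rejects any declared data store without a recognized classification tag. This is a sketch under assumptions: the resource dictionaries, the data_classification tag name, and the store names are hypothetical, and a real pipeline would parse them from its infrastructure-as-code definitions.

```python
ALLOWED_LABELS = {"public", "internal", "confidential", "restricted"}


def unlabeled_stores(resources: list[dict]) -> list[str]:
    """Names of declared stores with a missing or unrecognized classification tag."""
    bad = []
    for resource in resources:
        label = resource.get("tags", {}).get("data_classification", "")
        if label.lower() not in ALLOWED_LABELS:
            bad.append(resource["name"])
    return bad


# Hypothetical resources as they might be extracted from infrastructure definitions.
declared = [
    {"name": "orders-db", "tags": {"data_classification": "confidential"}},
    {"name": "tmp-export-bucket", "tags": {}},
]

failures = unlabeled_stores(declared)
```

A pipeline step could fail the build whenever failures is non-empty, so that a store cannot be created without someone deciding how sensitive its contents are.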

Data Loss Prevention (DLP)

DLP enforces data handling policies by monitoring, detecting, and blocking unauthorized data movement.

DLP Deployment Points

Layer	What It Protects	Examples
Endpoint DLP	Local copies, USB transfers, screenshots, clipboard	Microsoft Purview DLP, Symantec DLP, Forcepoint
Network DLP	Outbound transfers, email attachments, web uploads	Zscaler DLP, Netskope DLP, Palo Alto DLP
Cloud DLP	Cloud storage, SaaS applications, databases	Amazon Macie, Microsoft Purview, Google Cloud DLP, CASB-based DLP
Email DLP	Outbound email with sensitive data	Proofpoint DLP, Mimecast, Microsoft DLP

DLP Implementation Strategy

Start in monitor-only mode to observe data flows and refine policies before blocking. Focus on high-sensitivity data first, particularly credentials, payment card data, and health records. Tune aggressively because DLP false positives cause users to find workarounds, undermining the program. Integrate with classification so DLP policies reference classification labels, not just content patterns. And measure effectiveness by tracking incidents prevented, false positive rate, and policy exception requests.
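The following sketch shows how monitor-only mode, label-based policies, and a false positive metric might fit together. The Policy structure, the action names, and the incident verdict values are hypothetical and not taken from any DLP product.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    label: str              # classification label the policy applies to
    mode: str = "monitor"   # "monitor" only logs; "block" stops the transfer


def evaluate(policy: Policy, file_label: str) -> str:
    """Decide what to do with a transfer based on the file's classification label."""
    if file_label != policy.label:
        return "allow"
    return "block" if policy.mode == "block" else "log"


def false_positive_rate(incidents: list[dict]) -> float:
    """Share of reviewed incidents that analysts marked as false positives."""
    reviewed = [i for i in incidents if i["verdict"] in ("true_positive", "false_positive")]
    if not reviewed:
        return 0.0
    false_positives = sum(1 for i in reviewed if i["verdict"] == "false_positive")
    return false_positives / len(reviewed)


policy = Policy(label="restricted")           # starts in monitor mode
action = evaluate(policy, "restricted")       # logged, not blocked

incidents = [
    {"verdict": "false_positive"},
    {"verdict": "false_positive"},
    {"verdict": "false_positive"},
    {"verdict": "true_positive"},
    {"verdict": "pending"},                   # not yet reviewed, so excluded
]
rate = false_positive_rate(incidents)
```

A rate this high (three of four reviewed incidents) would suggest the policy needs tuning before its mode is switched to block.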

Privacy Engineering

Data Minimization

Collect and retain only the data you need. Audit data collection across all applications to understand what is collected versus what is actually used. Implement purpose limitation so that data collected for one purpose is not repurposed without consent. Set retention policies and enforce automatic deletion because data that doesn’t exist can’t be breached. Use anonymization or pseudonymization for analytics and testing instead of real data.
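Two of these practices lend themselves to a short sketch: keyed pseudonymization for test or analytics datasets, and a retention check that flags records for deletion. The function names, the key value, and the 365-day period are illustrative assumptions. Keyed hashing with HMAC-SHA256 is one common pseudonymization technique, not the only one.

```python
import hashlib
import hmac
from datetime import date, timedelta


def pseudonymize(value: str, key: bytes) -> str:
    """Replace an identifier with a keyed hash. The same input and key always give
    the same token, so joins still work, but the key must be stored separately."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def expired(created: date, retention_days: int, today: date) -> bool:
    """True when a record is older than its retention period."""
    return today - created > timedelta(days=retention_days)


key = b"example-key-kept-in-a-vault"   # placeholder; load from a secrets manager in practice
token = pseudonymize("jane@example.com", key)

should_delete = expired(date(2024, 1, 1), retention_days=365, today=date(2025, 6, 1))
```

Pseudonymized data can generally still be linked back to a person by whoever holds the key, so it reduces exposure in test and analytics environments but does not make the data anonymous.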

Privacy by Design

Integrate privacy into system architecture from the beginning. Implement granular consent collection and enforcement for personal data. Build automated processes for access requests, deletion requests, and data portability. Conduct privacy impact assessments (PIAs) for new systems processing personal data. For international data transfers, rely on an adequacy decision where one exists, and otherwise implement appropriate safeguards such as Standard Contractual Clauses or Binding Corporate Rules.

US State Privacy Law Compliance

With 20+ state privacy laws in effect or taking effect by 2026, organizations need a scalable approach. Map to the strictest requirements because if you comply with CPRA (California), you likely meet most other states’ requirements. Implement universal opt-out mechanisms by supporting the Global Privacy Control (GPC) signal. Maintain a data processing inventory documenting what data you collect, why, where it’s stored, and who has access. Automate data subject requests since manual processing doesn’t scale as regulations multiply. And monitor for new laws because state privacy legislation is introduced every legislative session.
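Honoring the Global Privacy Control signal starts with detecting it. Browsers that have GPC enabled send the request header Sec-GPC with the value 1. The helper below is a minimal sketch, assuming the request headers are available as a plain dictionary; the function name is hypothetical, and what the application does with the result (for example, suppressing sale or sharing of data) depends on the applicable law.

```python
def gpc_opt_out(headers: dict[str, str]) -> bool:
    """True when the request carries the Global Privacy Control signal (Sec-GPC: 1)."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    return normalized.get("sec-gpc") == "1"


opted_out = gpc_opt_out({"User-Agent": "ExampleBrowser", "Sec-GPC": "1"})
```

Treating the signal as an opt-out at the request layer means every downstream service can read a single flag instead of re-implementing the header check.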

Encryption and Key Management

Encryption Strategy

Data State	Encryption Method	Key Consideration
At rest	AES-256, cloud-native encryption	Who controls the encryption keys? (Service-managed vs. customer-managed)
In transit	TLS 1.3	Disable TLS 1.0/1.1; enforce certificate validation
In use	Confidential computing, homomorphic encryption	Emerging technology, use for highest-sensitivity workloads
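For the in-transit row, the sketch below shows one way to enforce a minimum TLS version and certificate validation in a Python client using the standard ssl module. The function name is hypothetical. Requiring TLS 1.3 outright is an assumption that every peer supports it; where some still need TLS 1.2, ssl.TLSVersion.TLSv1_2 is the usual fallback floor.

```python
import ssl


def strict_client_context() -> ssl.SSLContext:
    """Client-side TLS context that refuses anything older than TLS 1.3."""
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_3
    return context


context = strict_client_context()
```

The resulting context can be passed to standard-library clients that accept one, for example through the context parameter of urllib.request.urlopen.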

Key Management

Use Hardware Security Modules (HSMs) for the most sensitive keys, such as those for root CAs and payment processing. Use cloud KMS (AWS KMS, Azure Key Vault, GCP Cloud KMS) with customer-managed keys for cloud data. Implement key rotation on a defined schedule, annually at minimum for encryption keys. Separate key management from data storage so that the team managing keys does not have access to the encrypted data. Maintain key escrow procedures for business continuity because lost keys mean lost data.
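A rotation schedule is easy to state and easy to let slip, so it helps to check it automatically. The sketch below flags keys that have reached the annual rotation limit. The key names and the inventory format are hypothetical; in practice the creation dates would come from the KMS or HSM in use.

```python
from datetime import datetime, timedelta, timezone

ROTATION_PERIOD = timedelta(days=365)  # the "annually at minimum" schedule


def keys_due_for_rotation(keys: dict[str, datetime], now: datetime) -> list[str]:
    """Names of keys whose age has reached the rotation period, sorted for stable output."""
    return sorted(name for name, created in keys.items() if now - created >= ROTATION_PERIOD)


# Hypothetical key inventory: name -> creation time (UTC).
key_inventory = {
    "orders-db-cmk": datetime(2023, 11, 1, tzinfo=timezone.utc),
    "payments-cmk": datetime(2025, 2, 1, tzinfo=timezone.utc),
    "hr-archive-cmk": datetime(2024, 3, 15, tzinfo=timezone.utc),
}

due = keys_due_for_rotation(key_inventory, now=datetime(2025, 6, 1, tzinfo=timezone.utc))
```

Passing the current time in as a parameter keeps the check deterministic and testable; a scheduled job would call it with datetime.now(timezone.utc).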

Getting Started

Start by discovering your data. Deploy DSPM or cloud-native classification tools to find sensitive data across cloud, SaaS, and on-premises environments. Then classify by sensitivity using automated classification with a 4-level framework. Assess risk by identifying unencrypted sensitive data, excessive access, and misconfigured storage. Implement DLP starting in monitor mode on the highest-sensitivity data categories. Enforce encryption using customer-managed keys for confidential and restricted data. Automate privacy by implementing consent management, data subject request automation, and retention enforcement. Finally, run DSPM and DLP as ongoing processes rather than periodic scans, because shadow data and access changes accumulate between scans.