Threat hunting is the proactive, iterative search for adversaries that have evaded existing security controls. Unlike reactive alert-driven investigation, hunting assumes the adversary is already present and seeks to find evidence of their activity. CrowdStrike’s 2025 reports documented a sharp rise in cloud intrusions, with China-linked adversaries responsible for a significant share of the increased activity, reinforcing that automated detection alone is insufficient.

Why Threat Hunting Matters

Traditional security operations rely on alerts generated by detection rules. This approach has fundamental limitations.

  • Alert fatigue: false positives overwhelm SOC teams, with the average SOC processing 11,000+ alerts per day.
  • Detection gaps: novel techniques and living-off-the-land tactics bypass signature and rule-based detection.
  • Dwell time: attackers operate undetected for a median of 10 days before discovery (CrowdStrike 2025 Global Threat Report).
  • Reactive posture: defenders are always playing catch-up against adversaries who innovate faster than detection rules are updated.
  • Cloud blind spots: traditional hunting focused on endpoints and the network, while cloud control plane and identity attacks require new approaches.

Threat hunting addresses these limitations by proactively searching for indicators of compromise and adversary behaviors that automated systems miss.

Hunting Process

1. Hypothesis Generation

Every hunt begins with a hypothesis about adversary behavior. Strong hypotheses are specific, testable, and based on evidence or intelligence.

Sources for hypotheses include:

  • Threat intelligence: reports on active campaigns, actor TTPs, and industry-specific targeting
  • MITRE ATT&CK v18: technique coverage gaps in your detection rules; ATT&CK v18 (late 2025) expanded coverage for Kubernetes, CI/CD pipelines, cloud databases, and supply chain attacks
  • Environmental knowledge: understanding of your crown jewels, attack surface, and architectural weaknesses
  • Anomaly analysis: unusual patterns in baseline data such as new processes, unexpected network connections, or authentication anomalies
  • Incident lessons: post-incident findings from your organization or peers that indicate techniques not currently detected
  • AI agent activity: MITRE ATLAS (October 2025) added 14 new attack techniques targeting AI agents and generative AI systems

Example hypotheses:

  • “An adversary is using valid cloud credentials obtained through phishing to enumerate our S3 buckets from an unfamiliar IP range”
  • “Attackers are persisting in our Kubernetes environment through backdoored container images in our private registry”
  • “A compromised service account is being used for lateral movement between our production AWS accounts”

2. Tool and Data Selection

Identify the data sources and tools needed to test your hypothesis. The quality of your hunt is limited by the quality and completeness of your telemetry.

Data sources:

| Domain | Data Sources | Key Fields |
| --- | --- | --- |
| Endpoint | EDR telemetry, Sysmon, Windows Event Logs | Process creation, file modification, registry changes, network connections |
| Network | NetFlow, Zeek, DNS logs, proxy logs | Source/dest IP, ports, protocols, domains, bytes transferred |
| Identity | Active Directory, Entra ID, Okta, cloud IAM logs | Authentication events, role changes, MFA status, session tokens |
| Cloud | CloudTrail, Azure Activity Log, GCP Audit Logs | API calls, resource modifications, IAM changes, data access |
| Application | Web server logs, database audit logs, API gateway logs | Request patterns, authentication failures, query anomalies |
| Email | Mail flow logs, phishing detection platforms | Sender analysis, URL/attachment indicators, delivery status |

Tools include:

  • SIEM platforms (Splunk, Microsoft Sentinel, Google Chronicle, Elastic Security) for log analysis and correlation
  • EDR/XDR (CrowdStrike Falcon, SentinelOne, Microsoft Defender) for endpoint investigation
  • Network analysis tools (Zeek, Wireshark, Stratoshark from Sysdig 2025) for packet-level and cloud-native network analysis
  • Threat intelligence platforms (MISP, OpenCTI, Recorded Future, Mandiant Advantage)
  • AI-assisted hunting tools (Charlotte AI, Purple AI, Copilot for Security) for natural language querying

3. Investigation

Execute your hunt using appropriate techniques.

Query-based hunting searches for specific indicators or behavioral patterns:

# Example: Hunt for suspicious cloud API calls from unusual locations
index=cloudtrail (eventName=GetSecretValue OR eventName=GetParameter)
| iplocation sourceIPAddress
| stats count by userIdentity.arn, sourceIPAddress, Country, City
| where Country != "United States"
| sort -count

Stack counting identifies outliers by counting occurrences of attribute combinations; rare values in high-volume data often indicate adversary activity:

# Find rare parent-child process relationships
index=edr
| stats count by parent_process_name, process_name
| where count < 5
| sort count
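The same stack-counting idea applies to exported EDR records outside the SIEM. A minimal sketch in Python, assuming each event is a dict with `parent_process_name` and `process_name` keys (field names taken from the query above):

```python
from collections import Counter

def rare_pairs(events, threshold=5):
    """Count (parent, child) process pairs and return the rare ones, rarest first."""
    counts = Counter(
        (e["parent_process_name"], e["process_name"]) for e in events
    )
    # Rare parent-child relationships are hunting leads, not verdicts.
    return sorted(
        (pair for pair, n in counts.items() if n < threshold),
        key=lambda pair: counts[pair],
    )

events = (
    [{"parent_process_name": "explorer.exe", "process_name": "chrome.exe"}] * 500
    + [{"parent_process_name": "winword.exe", "process_name": "powershell.exe"}] * 2
)
print(rare_pairs(events))  # [('winword.exe', 'powershell.exe')]
```

The threshold is a starting point; tune it against your own event volume so that the output stays reviewable.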

Frequency analysis detects beaconing and periodic communication patterns:

# Identify potential C2 beaconing by connection regularity
index=proxy
| sort 0 dest_domain _time
| streamstats current=f last(_time) as prev_time by dest_domain
| eval delta=_time-prev_time
| stats count avg(delta) as interval stdev(delta) as jitter by dest_domain
| where count > 20 AND jitter < 5
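The same inter-arrival analysis can be sketched in Python. This is illustrative only: machine-driven beacons show a low standard deviation (jitter) relative to their mean interval, and the cutoffs must be tuned against your own proxy baseline:

```python
from statistics import mean, stdev

def beacon_score(timestamps, min_events=10):
    """Return (interval, jitter) for a sorted series of connection times.

    Low jitter relative to the mean interval suggests machine-driven beaconing.
    Returns None when there are too few events to judge.
    """
    if len(timestamps) < min_events:
        return None
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return mean(deltas), stdev(deltas)

# A 60-second beacon with ~1s of jitter vs. bursty human-driven browsing.
beacon = [i * 60 + (1 if i % 2 else -1) for i in range(20)]
browsing = [0, 5, 9, 140, 150, 155, 600, 610, 3600, 3700, 3705, 4000]
print(beacon_score(beacon))    # interval ≈ 60, jitter ≈ 2
print(beacon_score(browsing))  # much larger jitter
```

In practice you would run this per destination domain and rank domains by jitter-to-interval ratio rather than applying a single hard threshold.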

Visualization graphs relationships to identify suspicious patterns. Network connection graphs identify unusual communication paths. Process trees map execution chains from initial access to objectives. Authentication flow graphs detect lateral movement patterns. Cloud resource access graphs identify privilege escalation paths.

4. Documentation

Document findings regardless of whether you find malicious activity. A hunt that finds nothing is still valuable because it either confirms the hypothesis is false or reveals visibility gaps.

Hunt report template:

  • Hypothesis: What adversary behavior were you looking for?
  • Data sources used: What telemetry was queried?
  • Queries executed: Exact queries for reproducibility
  • Results and conclusions: What was found? Was the hypothesis confirmed or refuted?
  • Visibility gaps: What data was missing or insufficient?
  • Recommendations: New detections, logging improvements, or configuration changes
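One lightweight way to keep reports consistent across hunters is a small structured record. This dataclass simply mirrors the template fields above; the class and field names are illustrative, not a standard schema:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class HuntReport:
    hypothesis: str
    data_sources: List[str]
    queries: List[str]            # exact queries, for reproducibility
    results: str
    hypothesis_confirmed: Optional[bool]  # None means inconclusive
    visibility_gaps: List[str] = field(default_factory=list)
    recommendations: List[str] = field(default_factory=list)

    def summary(self) -> str:
        verdict = {True: "confirmed", False: "refuted", None: "inconclusive"}
        return f"{self.hypothesis} -> {verdict[self.hypothesis_confirmed]}"

report = HuntReport(
    hypothesis="Backdoored images in the private registry",
    data_sources=["registry audit logs"],
    queries=["registry-image-diff-query"],
    results="No unauthorized image modifications found",
    hypothesis_confirmed=False,
    visibility_gaps=["No image signing events collected"],
)
print(report.summary())
```

Structured records make it trivial to aggregate hunt outcomes later for the program metrics discussed below.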

Hunting Use Cases

Persistence Mechanisms

Hunt for unauthorized persistence since adversaries establish multiple persistence methods to survive remediation.

Endpoint persistence indicators include:

  • Scheduled tasks created by unusual parent processes (not svchost.exe or schtasks.exe launched by administrators)
  • Services with suspicious executable paths (temp directories, user profiles, unusual locations)
  • Registry run keys modified by non-standard processes
  • WMI event subscriptions (a favorite of APT groups)
  • Startup folder additions and modifications

Cloud persistence indicators include:

  • IAM users or access keys created outside normal provisioning workflows
  • Lambda/Cloud Functions created or modified by unexpected principals
  • Backdoor roles with trust policies allowing cross-account access
  • OAuth applications with excessive permissions added to identity providers
  • Container images modified in private registries
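The first cloud indicator, credential material created outside provisioning, reduces to a simple allowlist filter over CloudTrail. A hedged sketch, assuming simplified event dicts and a hypothetical provisioning role ARN; real CloudTrail records carry the same `eventName` and `userIdentity.arn` fields:

```python
# Principals allowed to create identities; the ARN here is a made-up example.
PROVISIONING_ARNS = {"arn:aws:iam::111122223333:role/IdentityProvisioning"}
# CloudTrail event names commonly associated with identity persistence.
PERSISTENCE_EVENTS = {"CreateAccessKey", "CreateUser", "CreateLoginProfile"}

def suspicious_iam_events(events):
    """Return identity-creation events whose caller is not a provisioning principal."""
    return [
        e for e in events
        if e["eventName"] in PERSISTENCE_EVENTS
        and e["userIdentity"]["arn"] not in PROVISIONING_ARNS
    ]

events = [
    {"eventName": "CreateAccessKey",
     "userIdentity": {"arn": "arn:aws:iam::111122223333:role/IdentityProvisioning"}},
    {"eventName": "CreateAccessKey",
     "userIdentity": {"arn": "arn:aws:sts::111122223333:assumed-role/WebAppRole/i-0abc"}},
]
print(suspicious_iam_events(events))  # only the WebAppRole event survives the filter
```

An application role minting access keys, as in the second event, is exactly the kind of lead this hunt is meant to surface.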

Lateral Movement

Detect movement between systems since adversaries rarely achieve their objectives from the initial foothold.

Traditional lateral movement indicators include:

  • Remote service creation (PsExec patterns with a service EXE in Windows\Temp)
  • WMI remote process execution (wmiprvse.exe spawning unexpected children)
  • Windows Remote Management / PowerShell remoting from unexpected sources
  • RDP connections from servers or non-standard workstations
  • Pass-the-hash and pass-the-ticket indicators (NTLM type 3 logons from unexpected sources)

Cloud lateral movement indicators include:

  • Role assumption chains (AssumeRole calls creating unexpected cross-account access)
  • Service account token usage from unexpected IP addresses
  • Kubernetes pod-to-pod lateral movement via service account tokens
  • Cross-cloud credential usage (AWS credentials used from GCP, or vice versa)
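Role assumption chains are naturally a graph problem: each AssumeRole call is an edge, and a hunt walks outward from a principal of interest. A sketch under simplifying assumptions — the `source`/`target` field names are invented here, whereas real CloudTrail nests the caller under `userIdentity` and the role under `requestParameters`:

```python
from collections import defaultdict

def assume_role_chains(events, start, max_depth=5):
    """Enumerate maximal AssumeRole chains reachable from a starting principal."""
    edges = defaultdict(set)
    for e in events:
        edges[e["source"]].add(e["target"])

    chains = []
    def dfs(path):
        # Only follow roles not already in the path, to avoid cycles.
        nexts = [t for t in sorted(edges[path[-1]]) if t not in path]
        if not nexts or len(path) > max_depth:
            chains.append(path)  # maximal chain: review any that cross accounts
            return
        for t in nexts:
            dfs(path + [t])
    dfs([start])
    return chains

events = [
    {"source": "arn:aws:iam::111111111111:user/dev",
     "target": "arn:aws:iam::111111111111:role/ci"},
    {"source": "arn:aws:iam::111111111111:role/ci",
     "target": "arn:aws:iam::999999999999:role/prod-admin"},
]
chains = assume_role_chains(events, "arn:aws:iam::111111111111:user/dev")
print(chains)  # one chain: dev -> ci -> cross-account prod-admin
```

Chains whose final hop lands in a different account number than the start, as here, are the ones worth investigating first.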

Data Exfiltration

Identify data theft before it’s complete.

Look for:

  • Large outbound transfers to uncommon destinations (baseline normal egress first)
  • Cloud storage downloads from new or unexpected IP addresses
  • DNS tunneling patterns (high-entropy subdomains, excessive DNS query volume)
  • Encrypted channels to recently registered domains
  • Cloud storage replication to external accounts
  • Git repository cloning of sensitive repositories by service accounts
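The DNS tunneling indicator is directly computable: tunneled data shows up as long, high-entropy leftmost labels. A minimal sketch using Shannon entropy; the length and entropy thresholds are illustrative starting points, not tuned values:

```python
import math
from collections import Counter

def shannon_entropy(s):
    """Shannon entropy of a string, in bits per character."""
    counts = Counter(s)
    return -sum(n / len(s) * math.log2(n / len(s)) for n in counts.values())

def tunneling_candidates(queries, min_len=30, min_entropy=3.5):
    """Flag DNS names whose first label is both long and high-entropy."""
    flagged = []
    for q in queries:
        label = q.split(".")[0]
        if len(label) >= min_len and shannon_entropy(label) >= min_entropy:
            flagged.append(q)
    return flagged

queries = [
    "www.example.com",
    "a9f3c1e8b2d47765e0ffa61b3c9d2e84a7b1c0d9.tunnel.example.net",
]
print(tunneling_candidates(queries))  # only the long high-entropy label is flagged
```

Entropy alone produces false positives (CDN hostnames, DKIM lookups), so pair it with the per-domain query-volume check from the list above.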

Identity-Based Hunting

Identity attacks are the fastest-growing threat vector, so hunt for credential abuse and identity manipulation.

Watch for:

  • Impossible travel: authentication from geographically distant locations within impossible timeframes
  • MFA fatigue: repeated MFA push notifications followed by an approval
  • Token theft: session cookies used from different IP addresses or user agents
  • Privilege escalation: IAM policy changes, role assignments, or group membership modifications outside change management
  • Service account abuse: service accounts authenticating interactively or from unexpected systems
  • Password spray: low-volume authentication failures distributed across many accounts
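Impossible travel is the easiest of these to express in code: compute the great-circle distance between two login geolocations and compare the implied speed against what an airliner can do. A sketch assuming logins as `(epoch_seconds, lat, lon)` tuples; the 900 km/h cutoff is a common but arbitrary choice:

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def impossible_travel(login_a, login_b, max_kmh=900):
    """True if the implied speed between two logins exceeds airliner speed."""
    (t1, lat1, lon1), (t2, lat2, lon2) = sorted([login_a, login_b])
    hours = max((t2 - t1) / 3600, 1e-9)  # guard against identical timestamps
    return haversine_km(lat1, lon1, lat2, lon2) / hours > max_kmh

# New York login, then Tokyo one hour later: ~10,800 km implies >10,000 km/h.
ny = (1700000000, 40.71, -74.01)
tokyo = (1700003600, 35.68, 139.69)
print(impossible_travel(ny, tokyo))  # True
```

GeoIP error and VPN egress points cause false positives, so treat a hit as a lead to correlate with MFA and device telemetry, not a verdict.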

Building a Hunting Program

Team Structure

Three models exist, each with trade-offs:

| Model | Pros | Cons |
| --- | --- | --- |
| Dedicated team | Deep expertise, consistent output | Expensive, limited headcount |
| Rotation | Broader coverage, skill development for all analysts | Inconsistent quality, training overhead |
| Hybrid | Core expertise + broad participation | Requires strong leadership to coordinate |

For most organizations, the hybrid model works best: 1-3 dedicated hunters supported by SOC analysts who rotate through hunting assignments quarterly.

AI-Augmented Hunting

AI investigation assistants are transforming threat hunting in 2025-2026.

Natural language querying through Charlotte AI, Purple AI, and Copilot for Security allows hunters to ask questions in plain English rather than writing complex queries. Automated hypothesis generation lets AI suggest hunt hypotheses based on threat intelligence, environmental context, and detection gap analysis. Pattern recognition through AI excels at identifying subtle statistical anomalies across large datasets that human analysts miss. Investigation acceleration allows AI to summarize large result sets, correlate findings across data sources, and draft hunt reports.

There are important limitations to understand. AI assistants can hallucinate findings, so always verify AI-generated analysis against raw data. AI lacks the contextual understanding of your environment that experienced hunters have. Adversaries are beginning to use AI to generate more sophisticated evasion techniques. AI tools work best as force multipliers for skilled hunters, not replacements.

Measuring Success

Track hunting program metrics to justify investment and improve quality:

| Metric | Description | What It Tells You |
| --- | --- | --- |
| Hunts completed per period | Volume of hunting activity | Program activity level |
| Threats discovered | Confirmed malicious activity found | Direct program value |
| Detection rules created | Hunts converted to automated detections | Lasting security improvements |
| Coverage improvements | MITRE ATT&CK techniques with detection coverage gained | Reduced detection gaps |
| Visibility gaps identified | Missing or insufficient data sources discovered | Logging investment priorities |
| Mean time to hunt | Average duration from hypothesis to conclusion | Operational efficiency |
| False positive reduction | Detection rules refined through hunting insights | SOC efficiency improvement |

Operationalizing Findings

Turn hunting discoveries into lasting improvements.

Create detection rules for discovered techniques so that every successful hunt produces at least one new detection. Improve logging coverage for blind spots identified during hunts. Update incident response playbooks with new attack patterns. Share intelligence with peers, ISACs, and the community (Sigma rules, YARA rules, blog posts). Feed findings into threat modeling for development teams.

Tools of the Trade

Commercial

CrowdStrike Falcon OverWatch provides managed hunting with Charlotte AI, covering endpoint, cloud, and identity. Microsoft Defender Threat Hunting offers KQL-based hunting with Copilot for Security natural language support. SentinelOne Singularity uses Purple AI-powered hunting across the Singularity Data Lake. Elastic Security provides open detection rules, EQL hunting, and ML-powered anomaly detection. Google Chronicle offers YARA-L hunting with Gemini AI at Google-scale data volumes.

Open Source

Sigma rules provide a vendor-agnostic detection format convertible to any SIEM query language. YARA rules enable pattern matching for file and memory analysis. Atomic Red Team provides technique-level adversary simulation for validating hunts and detections. HELK (Hunting ELK) is an Elasticsearch-based hunting platform with Jupyter notebooks. Velociraptor is an endpoint visibility and forensic collection tool. Falco provides runtime security for Kubernetes and cloud workloads (eBPF-based). Stratoshark from Sysdig (2025) is essentially Wireshark for cloud, offering packet-level cloud traffic analysis.

Getting Started

  1. Assess your data by inventorying available telemetry and identifying gaps, since you can’t hunt in data you don’t collect
  2. Start with known-bad by hunting for publicly documented threat actor TTPs relevant to your industry
  3. Build foundational queries by developing reusable queries for common techniques (persistence, lateral movement, credential access)
  4. Map to ATT&CK to track which techniques you can hunt for and which have gaps
  5. Hunt regularly by scheduling hunts even when there’s no indication of compromise, since consistency builds skill and coverage
  6. Operationalize everything by converting every finding into a detection rule, logging improvement, or configuration change