Building a Security Operations Center (SOC) is one of the most impactful and expensive security investments an organization can make. A 24x7 in-house SOC costs $2-7 million per year in staffing, technology, and facilities, while managed detection and response (MDR) services offer comparable outcomes starting at around $50K/year for 100 endpoints. The WEF 2025 Global Cybersecurity Outlook found that 67% of organizations report a moderate-to-critical cybersecurity skill gap, making the build-vs-buy decision more consequential than ever.

This guide covers SOC design, technology stack, staffing, and the shift toward AI-augmented operations that is redefining what a SOC looks like in 2026.

SOC Models

In-House SOC

A fully staffed, organization-owned security operations function makes sense for large enterprises with complex, heterogeneous environments, for organizations with strict regulatory requirements demanding internal control (defense, finance, healthcare), when there’s enough scale to justify the investment (typically 5,000+ endpoints), and when the organization-specific threat landscape requires deep institutional knowledge.

The cost structure includes 24x7 coverage requiring a minimum of 8-12 analysts (three shifts plus coverage for PTO and turnover); SIEM, EDR/XDR, SOAR, and threat intelligence platform licensing; plus facility costs, training, and continuous professional development. Total: $2-7M/year depending on location, scale, and tooling.

MDR (Managed Detection and Response)

An external provider handles monitoring, detection, and response. This works well for mid-market organizations (500-5,000 endpoints), when there’s limited security hiring capacity, when you need 24x7 coverage without building a full team, or when there’s willingness to trust an external party with detection and containment decisions.

The cost structure runs around $50K/year for 100 endpoints and approximately $500K/year for 10,000 endpoints. This typically includes EDR/XDR technology, 24x7 monitoring, and response. Gartner estimated that 50% of organizations use MDR as of 2025.
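The build-versus-buy math is largely a function of endpoint count. Here is a rough sketch using only the ballpark figures above; the linear interpolation between the two quoted MDR price points is a simplifying assumption, not vendor pricing:

```python
# Rough build-vs-buy comparison using the ballpark figures in this guide.
# Assumption: MDR cost interpolated linearly between the two quoted price points
# ($50K/yr at 100 endpoints, $500K/yr at 10,000 endpoints); real pricing is tiered.

def mdr_annual_cost(endpoints: int) -> float:
    e1, c1 = 100, 50_000
    e2, c2 = 10_000, 500_000
    endpoints = max(e1, min(e2, endpoints))           # clamp to the quoted range
    return c1 + (c2 - c1) * (endpoints - e1) / (e2 - e1)

IN_HOUSE_LOW, IN_HOUSE_HIGH = 2_000_000, 7_000_000    # annual in-house range from above

for n in (500, 2_000, 5_000, 10_000):
    print(f"{n:>6} endpoints: MDR ~${mdr_annual_cost(n):,.0f}/yr "
          f"vs in-house ${IN_HOUSE_LOW:,}-${IN_HOUSE_HIGH:,}/yr")
```

Even at 10,000 endpoints the MDR estimate stays well below the in-house floor, which is why in-house builds are usually justified by control, regulation, or institutional knowledge rather than raw cost.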

Hybrid SOC

An internal security team augmented by MDR for off-hours coverage, specialized capabilities, or overflow. This makes sense when an organization wants internal control during business hours with 24x7 external coverage, when a core team handles tuning, threat hunting, and escalation while MDR handles alert triage and initial response, or when budget supports a small internal team but not full 24x7 staffing.

This is the most common model for organizations with 2-5 internal security staff.

Technology Stack

Core Components

| Component | Function | Examples |
| --- | --- | --- |
| SIEM | Log aggregation, correlation, detection | Microsoft Sentinel, Splunk, Google Chronicle, Elastic Security |
| EDR/XDR | Endpoint/cross-domain detection and response | CrowdStrike Falcon, SentinelOne, Microsoft Defender, Cortex XDR |
| SOAR | Automation, orchestration, case management | Palo Alto XSOAR, Splunk SOAR, Tines, Torq |
| Threat Intelligence | IOCs, TTPs, campaign tracking | Recorded Future, Mandiant Advantage, MISP, OpenCTI |
| Ticketing | Incident tracking, workflow, metrics | ServiceNow SecOps, Jira, PagerDuty |

SIEM Selection Criteria

The SIEM is the SOC’s central nervous system. Key decision factors include the data ingestion cost model (per-GB like traditional Splunk, per-endpoint, workload-based like the new Splunk/Cisco model, or fixed-price like Google Chronicle); detection engineering (quality of built-in rules, custom rule flexibility, MITRE ATT&CK mapping); investigation workflow (query language, dashboard quality, case management integration); AI capabilities (natural language investigation, automated triage, suggested remediation); and integration (data source connectors, SOAR integration, threat intelligence feeds).

SOAR and Automation

SOAR platforms are evolving from script-heavy playbook engines to AI-driven orchestration.

Generation 1 (mid-2010s) featured rule-based automation with rigid playbooks. Generation 2 (2020-2023) added GenAI-augmented playbooks with natural language interfaces. Generation 3 (2024+) brings agentic AI platforms where AI agents autonomously investigate, triage, and take containment actions.

Leading agentic SOC platforms include Exaforce, Dropzone AI, Radiant Security, and D3 Security Morpheus. These platforms can investigate alerts end-to-end, correlate data across sources, and recommend or execute containment, reducing the analyst role from alert triage to automation oversight and exception handling.
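Whatever the generation, the skeleton of an automated triage flow is the same: enrich the alert, decide, and only involve a human for the ambiguous or high-stakes cases. Below is a minimal sketch; every function, field, and threshold is illustrative rather than any vendor’s API.

```python
# Minimal sketch of an automated alert-triage flow of the kind a SOAR playbook or
# agentic platform runs before a human ever sees the alert. All names and thresholds
# are illustrative, not a vendor API.
from dataclasses import dataclass

@dataclass
class Alert:
    id: str
    host: str
    user: str
    indicator: str   # e.g. a file hash or domain from the detection
    severity: str    # "low" | "medium" | "high"

# --- Stub integrations; a real playbook calls TIP, CMDB, SIEM, and EDR APIs here ---
def lookup_threat_intel(indicator: str) -> str:
    return "malicious" if indicator.endswith(".bad") else "unknown"

def lookup_asset_criticality(host: str) -> str:
    return "crown_jewel" if host.startswith("dc-") else "standard"

def isolate_host(host: str) -> None:
    print(f"[containment] isolating {host} via EDR")

def triage(alert: Alert) -> str:
    """Enrich, then decide: auto-close, contain automatically, or escalate to a human."""
    verdict = lookup_threat_intel(alert.indicator)
    asset_tier = lookup_asset_criticality(alert.host)
    if verdict == "unknown" and alert.severity == "low":
        return "auto_close"                    # low-risk noise never reaches an analyst
    if verdict == "malicious" and asset_tier != "crown_jewel":
        isolate_host(alert.host)               # containment within pre-approved authority
        return "contained_pending_review"
    return "escalate_to_l2"                    # humans keep the ambiguous and high-value cases

print(triage(Alert("A-1042", "ws-finance-07", "jdoe", "payload.bad", "high")))
```

Agentic platforms effectively replace the hand-written branching with an AI-driven investigation, but the containment authority and escalation boundaries still have to be defined by the SOC.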

Data Sources

A SOC is only as effective as its telemetry. Prioritize these data sources:

Tier 1 (Essential) includes endpoint telemetry (EDR/XDR), authentication logs (Active Directory, Entra ID, Okta, cloud IAM), email security alerts, and cloud audit logs (CloudTrail, Azure Activity Log, GCP Audit Logs).

Tier 2 (Important) includes network flow data (NetFlow, VPC Flow Logs), DNS logs, web proxy / CASB logs, and firewall logs.

Tier 3 (Valuable) includes application security logs (WAF, API gateway), DLP alerts, physical security (badge access), and vulnerability scanner findings.

SOC Staffing

Roles and Responsibilities

| Role | Responsibility | Experience Level |
| --- | --- | --- |
| SOC Analyst (L1) | Alert triage, initial classification, escalation | 0-2 years |
| SOC Analyst (L2) | Deep investigation, containment, forensics | 2-5 years |
| SOC Analyst (L3) | Advanced analysis, malware reverse engineering, threat hunting | 5+ years |
| Detection Engineer | Write and tune detection rules, reduce false positives | 3-5 years |
| Automation Engineer | Build and maintain SOAR playbooks and integrations | 3-5 years |
| SOC Manager | Operations management, metrics, reporting, hiring | 7+ years |
| Threat Hunter | Proactive hypothesis-driven hunting | 5+ years |

AI Impact on Staffing

AI copilots and agentic platforms are reshaping SOC staffing in 2026. L1 triage is automatable, with AI agents handling alert triage and reducing the need for dedicated L1 analysts. L2 analysts become AI supervisors, reviewing and validating AI-generated investigations rather than building them from scratch. Detection engineering grows, with demand for engineers who can write detection logic, tune AI models, and manage automation. The net effect is that SOC teams are getting smaller but more skilled, as junior alert-triage positions decline while senior engineering roles expand.

A Singaporean healthcare organization reported a 70% reduction in response times after deploying SOC AI copilots, while maintaining the same team size.

Shift Schedules

24x7 coverage options:

| Model | Staff Required | Pros | Cons |
| --- | --- | --- | --- |
| 4x12 (four 12-hour shifts) | 8-10 analysts | Fewer handoffs, simpler scheduling | Long shifts cause fatigue |
| Follow-the-sun (geographically distributed) | 6-9 analysts | Normal working hours for everyone | Coordination complexity |
| Hybrid (MDR off-hours) | 3-5 analysts + MDR | Most cost-effective 24x7 coverage | Dependency on external provider |
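The headcounts in the table follow from simple coverage arithmetic: each always-staffed seat represents 168 hours per week, and each analyst covers roughly 40 scheduled hours minus absences. A quick sketch of that math (the 25% absence buffer is an assumption, not a benchmark):

```python
# Back-of-the-envelope headcount for 24x7 coverage. The absence buffer (PTO, training,
# sick leave, turnover) is an assumed 25%, not an industry benchmark.
HOURS_PER_WEEK = 24 * 7      # 168 seat-hours per always-staffed position
SCHEDULED_HOURS = 40         # hours one analyst is scheduled per week
ABSENCE_BUFFER = 0.25        # fraction of scheduled time lost to PTO, training, turnover

def analysts_needed(concurrent_seats: int) -> float:
    effective_hours = SCHEDULED_HOURS * (1 - ABSENCE_BUFFER)   # ~30 productive hours/week
    return concurrent_seats * HOURS_PER_WEEK / effective_hours

for seats in (1, 2):
    print(f"{seats} analyst(s) on shift around the clock -> ~{analysts_needed(seats):.1f} heads")
# One seat works out to ~5.6 analysts and two seats to ~11.2, which is where the
# 8-12 analyst range for continuous single-to-double coverage comes from.
```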

Detection Engineering

Detection-as-Code

Treat detection rules as software: version controlled, tested, and reviewed. Store detection rules in Git alongside infrastructure code. Use the Sigma format for vendor-agnostic rule writing. Implement CI/CD for detections by linting, testing against sample data, and deploying to the SIEM. Map every detection to MITRE ATT&CK techniques and track coverage against the ATT&CK matrix to identify gaps.
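A minimal CI gate for a Sigma-style rule repository might look like the sketch below: load each rule, check the fields the repo’s policy requires, and confirm a MITRE ATT&CK technique tag is present. The tag convention follows public Sigma practice, but the script itself is an illustration, not a drop-in pipeline.

```python
# Simplified CI gate for a detection-as-code repo: lint Sigma-style YAML rules for
# required fields and a MITRE ATT&CK technique tag before deployment to the SIEM.
import glob
import re
import sys
import yaml   # pip install pyyaml

REQUIRED_FIELDS = {"title", "logsource", "detection", "level", "tags"}   # this repo's policy
ATTACK_TAG = re.compile(r"^attack\.t\d{4}(\.\d{3})?$")   # e.g. attack.t1059.001

def lint_rule(path: str) -> list[str]:
    errors = []
    with open(path) as f:
        rule = yaml.safe_load(f) or {}
    missing = REQUIRED_FIELDS - set(rule)
    if missing:
        errors.append(f"{path}: missing fields {sorted(missing)}")
    if not any(ATTACK_TAG.match(tag) for tag in rule.get("tags", [])):
        errors.append(f"{path}: no MITRE ATT&CK technique tag (attack.tXXXX)")
    return errors

if __name__ == "__main__":
    problems = [e for p in glob.glob("rules/**/*.yml", recursive=True) for e in lint_rule(p)]
    print("\n".join(problems))
    sys.exit(1 if problems else 0)   # a non-zero exit fails the CI job
```

A real pipeline would typically also convert rules to the SIEM’s native query language (for example with the Sigma toolchain) and run them against sample log data before deployment.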

Tuning and False Positive Management

False positives are the primary driver of SOC inefficiency and analyst burnout. Track the false positive rate per detection rule; rules exceeding an 80% false positive rate should be tuned or disabled. Use allowlists judiciously, since every allowlist entry is a potential detection blind spot. Review tuning decisions quarterly, because environmental changes may invalidate previous tuning. AI-based noise reduction is also arriving in adjacent domains: application security tools like Semgrep and Fortify Aviator report reducing SAST false positives by up to 98%.
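Tracking this requires nothing more than closed-alert dispositions tagged by rule. A sketch of the weekly report, assuming an export with rule_id and disposition columns (both field names are illustrative):

```python
# Weekly false-positive report per detection rule. Assumes a closed-alert CSV export
# with "rule_id" and "disposition" columns; field names are illustrative.
import csv
from collections import Counter

FP_THRESHOLD = 0.80   # rules above this false positive rate get tuned or disabled

def fp_rates(csv_path: str) -> dict[str, tuple[float, int]]:
    totals, false_positives = Counter(), Counter()
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            totals[row["rule_id"]] += 1
            if row["disposition"] in {"false_positive", "benign"}:
                false_positives[row["rule_id"]] += 1
    return {rule: (false_positives[rule] / totals[rule], totals[rule]) for rule in totals}

for rule, (rate, volume) in sorted(fp_rates("closed_alerts.csv").items(),
                                   key=lambda item: item[1][0], reverse=True):
    flag = "TUNE OR DISABLE" if rate > FP_THRESHOLD else "ok"
    print(f"{rule:<40} {rate:6.1%}  ({volume} alerts)  {flag}")
```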

SOC Metrics

Operational Metrics

| Metric | Description | Target |
| --- | --- | --- |
| MTTD (Mean Time to Detect) | Time from compromise to detection | Under 24 hours |
| MTTR (Mean Time to Respond) | Time from detection to containment | Under 1 hour (critical) |
| Alert volume | Total alerts per day | Trending down through tuning |
| False positive rate | Percentage of alerts that are benign | Under 30% |
| Escalation rate | Percentage of alerts requiring L2+ investigation | 10-20% |
| SLA compliance | Percentage of incidents resolved within SLA | Over 95% |
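MTTD and MTTR fall out of three timestamps per incident: estimated compromise time, detection time, and containment time. A sketch of the calculation, assuming those fields exist in the case management export (the sample records are illustrative):

```python
# Compute MTTD and MTTR from incident records. Assumes each incident carries
# compromise, detection, and containment timestamps; field names are illustrative.
from datetime import datetime
from statistics import mean

incidents = [
    {"compromised": "2026-01-04T02:10", "detected": "2026-01-04T09:45", "contained": "2026-01-04T10:20"},
    {"compromised": "2026-01-11T18:30", "detected": "2026-01-12T06:05", "contained": "2026-01-12T07:40"},
]

def hours_between(start: str, end: str) -> float:
    fmt = "%Y-%m-%dT%H:%M"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds() / 3600

mttd = mean(hours_between(i["compromised"], i["detected"]) for i in incidents)
mttr = mean(hours_between(i["detected"], i["contained"]) for i in incidents)
print(f"MTTD: {mttd:.1f} h (target < 24 h)   MTTR: {mttr:.1f} h (target < 1 h for critical)")
```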

Maturity Metrics

| Metric | Description | Indicates |
| --- | --- | --- |
| ATT&CK coverage | Percentage of ATT&CK techniques with detection rules | Detection breadth |
| Automation rate | Percentage of alerts handled without human intervention | Operational efficiency |
| Hunting cadence | Number of completed hunts per month | Proactive maturity |
| Detection-to-production time | Time from threat intel to deployed detection | Engineering agility |
| Analyst retention | Average tenure and turnover rate | Team health |
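ATT&CK coverage and automation rate can be derived from the same rule repository and alert export used earlier. A rough sketch with illustrative data; note that the denominator should be the techniques in your threat model, not all of ATT&CK:

```python
# Rough maturity metrics: ATT&CK coverage from rule-to-technique mappings and the
# automation rate from alert dispositions. All values here are illustrative.
detected_techniques = {"T1059.001", "T1078", "T1110", "T1566.001"}   # tagged on deployed rules

# Scope the denominator to the techniques your threat model cares about, not all of ATT&CK:
in_scope_techniques = {"T1059.001", "T1078", "T1110", "T1566.001",
                       "T1486", "T1021.001", "T1003"}

coverage = len(detected_techniques & in_scope_techniques) / len(in_scope_techniques)

alerts_total = 12_400                   # alerts this quarter
alerts_closed_by_automation = 9_050     # closed or contained with no analyst touch

print(f"ATT&CK coverage: {coverage:.0%} of in-scope techniques")
print(f"Automation rate: {alerts_closed_by_automation / alerts_total:.0%} of alerts")
```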

Incident Response Integration

The SOC is the first line of defense, but major incidents require escalation to a broader IR team. Define clear escalation criteria that establish when a SOC alert becomes an IR incident. Establish handoff procedures between SOC analysts and the IR team. Pre-define containment authorities specifying what SOC analysts can do without approval. Integrate SOC ticketing with IR case management. Conduct joint exercises between the SOC and IR teams quarterly.
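Escalation criteria work best when written down as explicit, testable conditions rather than tribal knowledge. One way to express them is as logic a playbook can evaluate; the categories and thresholds below are examples, not a standard:

```python
# Example of escalation criteria expressed as explicit, testable conditions rather than
# tribal knowledge. All categories and thresholds are illustrative, not a standard.
EXFIL_BYTES_LIMIT = 100 * 1024**2   # confirmed exfiltration above ~100 MB
HOSTS_AFFECTED_LIMIT = 5            # spread beyond a handful of endpoints

def becomes_ir_incident(alert: dict) -> bool:
    """Return True when a SOC alert should be handed off as an IR incident."""
    return bool(
        (alert.get("confirmed_malware") and alert.get("crown_jewel"))   # malware on a critical asset
        or alert.get("lateral_movement", False)                         # active lateral movement
        or alert.get("exfil_bytes", 0) > EXFIL_BYTES_LIMIT
        or alert.get("hosts_affected", 0) > HOSTS_AFFECTED_LIMIT
    )

print(becomes_ir_incident({"lateral_movement": True, "hosts_affected": 2}))   # True -> hand off to IR
```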

Getting Started

Phase 1: Foundation (Months 1-3)

Select and deploy SIEM and EDR/XDR platforms. Onboard Tier 1 data sources (endpoint, identity, email, cloud). Hire an initial team (a minimum of 3 analysts for business-hours coverage) or engage an MDR provider. Establish basic detection rules and alert triage procedures.

Phase 2: Operationalization (Months 4-8)

Deploy SOAR for repetitive triage and response automation. Onboard Tier 2 data sources (network, DNS, proxy). Implement detection-as-code workflow. Establish metrics tracking and regular reporting. Expand to 24x7 coverage (MDR hybrid if not fully staffed).

Phase 3: Maturation (Months 9-14)

Launch threat hunting program. Deploy AI copilots for analyst augmentation. Map and track MITRE ATT&CK detection coverage. Implement advanced automation (AI-assisted triage, automated containment). Optimize based on metrics by tuning detections, reducing false positives, and improving MTTD/MTTR.