A critical vulnerability in vLLM, the popular open-source library for serving large language models, allows attackers to achieve remote code execution by submitting a malicious video link to the API. The flaw, tracked as CVE-2026-22778 (GHSA-4r2x-xpjr-7cvv), puts millions of AI inference servers at risk.

Vulnerability overview

| Attribute | Value |
| --- | --- |
| CVE | CVE-2026-22778 |
| GHSA | GHSA-4r2x-xpjr-7cvv |
| Severity | Critical |
| Affected versions | 0.8.3 through 0.14.0 |
| Fixed version | 0.14.1 |
| Attack vector | Network (API request with video URL) |
| Authentication | None required (if API exposed) |
| User interaction | None required |

vLLM deployment scale

| Metric | Value |
| --- | --- |
| Monthly downloads | 3+ million |
| Typical use | Production LLM serving |
| Supported models | LLaMA, Mistral, multimodal models |
| Deployment environments | GPU clusters, cloud inference |

The exploit chain

CVE-2026-22778 chains two separate vulnerabilities to achieve RCE:

Stage 1: ASLR bypass via PIL information leak

When vLLM processes media for multimodal inference, error messages from PIL (Python Imaging Library) can expose memory addresses.

| Step | Action |
| --- | --- |
| 1 | Attacker sends an invalid image to the multimodal endpoint |
| 2 | PIL throws an error containing a heap address |
| 3 | vLLM returns the error to the client, leaking the address |
| 4 | The leaked address sits ~10.33 GB before libc in memory |
| 5 | ASLR is reduced from ~4 billion guesses to ~8 guesses |
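One stopgap against this information leak, independent of the vLLM patch, is to make sure raw library errors are never echoed back to API clients. The snippet below is a minimal, hypothetical redaction filter (the function names are illustrative, not part of vLLM) that strips pointer-sized hex values from error text before it leaves the server.

```python
import re

# Hex values long enough to be 64-bit heap or library addresses
# (e.g. 0x7f3a9c2b4d10); shorter hex literals are left untouched.
_ADDRESS_RE = re.compile(r"0x[0-9a-fA-F]{8,16}")


def redact_addresses(message: str) -> str:
    """Replace anything resembling a pointer with a fixed placeholder."""
    return _ADDRESS_RE.sub("0x[redacted]", message)


def safe_error_response(exc: Exception) -> dict:
    """Build an API error body that never echoes raw library errors."""
    return {
        "error": "media processing failed",
        "detail": redact_addresses(str(exc)),
    }
```

Log the raw exception server-side for debugging; only the redacted form should reach remote callers, which removes the cheap ASLR bypass even if PIL still raises verbose errors.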

Stage 2: Heap overflow in JPEG2000 decoder

vLLM uses OpenCV for video decoding, which bundles FFmpeg 5.1.x containing a heap overflow in the JPEG2000 decoder.
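Because the vulnerable decoder ships inside OpenCV's bundled FFmpeg rather than as a separately installed package, pip-level dependency audits can miss it. One quick diagnostic is to inspect cv2.getBuildInformation(), which reports the FFmpeg components OpenCV was built against; the sketch below simply filters out those lines.

```python
import re

import cv2


def bundled_ffmpeg_report() -> str:
    """Return the FFmpeg-related lines from OpenCV's build information."""
    info = cv2.getBuildInformation()
    relevant = [line.strip() for line in info.splitlines()
                if re.search(r"ffmpeg|avcodec|avformat|avutil", line, re.IGNORECASE)]
    return "\n".join(relevant) or "FFmpeg not reported in build information"


if __name__ == "__main__":
    print("OpenCV", cv2.__version__)
    print(bundled_ffmpeg_report())
```

If the reported libraries correspond to the FFmpeg 5.1.x line, the JPEG2000 overflow described in this stage remains in scope until OpenCV itself is updated.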

| Step | Action |
| --- | --- |
| 1 | Attacker crafts a malicious .mov file with JPEG2000 frames |
| 2 | A malicious cdef box remaps the color channels |
| 3 | The Y (luma) plane is mapped into the smaller U (chroma) buffer |
| 4 | The decoder writes the large Y plane into the undersized U buffer |
| 5 | The heap overflow overwrites the AVBuffer.free pointer |
| 6 | The pointer is overwritten with the address of system() |
| 7 | When the buffer is released, system("attacker command") executes |

Complete attack flow

| Phase | Action |
| --- | --- |
| 1 | Attacker sends a request with video_url pointing to a malicious .mov |
| 2 | vLLM fetches the video from the URL |
| 3 | vLLM passes the video bytes to cv2.VideoCapture() |
| 4 | OpenCV's bundled FFmpeg decodes the JPEG2000 frames |
| 5 | The malicious cdef box triggers the heap overflow |
| 6 | The AVBuffer.free pointer is overwritten with system() |
| 7 | The buffer release executes the attacker's command |
| 8 | Full server compromise achieved |
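The critical link in this chain is the point where untrusted bytes, fetched from a caller-supplied URL, reach the bundled decoder. The sketch below illustrates that general data path (download, then hand the file to cv2.VideoCapture); it is a simplified illustration of the pattern, not vLLM's actual code.

```python
import tempfile
import urllib.request

import cv2


def decode_first_frame(video_url: str):
    """Fetch a caller-supplied video URL and decode it with OpenCV.

    Everything past the download runs attacker-controlled bytes through
    OpenCV's bundled FFmpeg, which is where the JPEG2000 decoder lives.
    """
    with tempfile.NamedTemporaryFile(suffix=".mov") as tmp:
        urllib.request.urlretrieve(video_url, tmp.name)  # untrusted input lands on disk
        capture = cv2.VideoCapture(tmp.name)             # bundled FFmpeg parses the container
        ok, frame = capture.read()                       # frames are decoded here
        capture.release()
        return frame if ok else None
```

Any service that follows this pattern inherits the decoder's memory-safety bugs, which is why restricting where video URLs may point, and which codecs are accepted, matters even after patching.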

Attack requirements

| Requirement | Details |
| --- | --- |
| API access | Must be able to reach the vLLM API endpoint |
| Authentication | None (default vLLM has no auth) |
| Multimodal enabled | Video or image input capability |
| Vulnerable dependencies | vLLM + media processing libraries |

Authentication bypass note

Even in non-default configurations with an API key enabled, the exploit remains feasible through the invocations route, which processes the payload before authentication is enforced.

Impact assessment

Successful exploitation grants arbitrary command execution on the underlying server:

| Impact | Risk |
| --- | --- |
| Model weight exfiltration | Stealing valuable IP |
| Training data access | Sensitive data exposure |
| Inference log access | Prompts and responses visible |
| Lateral movement | Pivot to other systems |
| Cryptomining | Resource hijacking |
| Model output manipulation | Inject malicious content |
| Ransomware | Encrypt AI infrastructure |

Blast radius considerations

| Factor | Impact |
| --- | --- |
| Clustered deployments | A single exploit may affect multiple nodes |
| GPU infrastructure | High-value compute resources |
| Connected services | API keys, databases accessible |
| Network position | Often in sensitive internal segments |

Scope limitation

| Deployment type | Vulnerable? |
| --- | --- |
| Multimodal (video/image) | Yes |
| Text-only models | No |
| Default pip/docker install | Yes (if multimodal enabled) |

Deployments not serving a video model are not affected.

Remediation

Primary recommendation

Update to vLLM 0.14.1 immediately. This version includes an updated OpenCV release addressing the JPEG2000 decoder flaw.

If immediate upgrade isn’t possible

| Priority | Mitigation |
| --- | --- |
| Critical | Disable video/image input capabilities |
| Critical | Restrict API access to trusted clients only |
| High | Never expose vLLM directly to the internet |
| High | Network segmentation for AI infrastructure |
| High | Update OpenCV and Pillow to the latest versions |
| Medium | Monitor for exploitation indicators |

Dependency updates

| Library | Action |
| --- | --- |
| vLLM | Upgrade to 0.14.1+ |
| OpenCV | Update to latest (fixes the JPEG2000 decoder) |
| Pillow | Update to latest (reduces info leak risk) |
| FFmpeg | Ensure the vulnerable 5.1.x line is not in use |
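To confirm the upgrades above actually landed in a given environment, the installed package versions can be read directly. The sketch below checks vLLM against the fixed 0.14.1 release named in this advisory and prints the OpenCV and Pillow versions for manual review (their exact safe minimums are not pinned here and should be taken from upstream); it assumes the packaging library is available.

```python
from importlib.metadata import PackageNotFoundError, version

from packaging.version import Version

# 0.14.1 is the fixed vLLM release named in the advisory. The other two
# packages are printed for manual review against their latest upstream
# releases rather than compared to a hard-coded minimum.
FIXED_VLLM = Version("0.14.1")

for pkg in ("vllm", "opencv-python", "pillow"):
    try:
        installed = version(pkg)
    except PackageNotFoundError:
        print(f"{pkg}: not installed")
        continue
    if pkg == "vllm":
        status = "OK" if Version(installed) >= FIXED_VLLM else "VULNERABLE (< 0.14.1)"
        print(f"{pkg}: {installed} -> {status}")
    else:
        print(f"{pkg}: {installed} (compare against the latest upstream release)")
```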

Detection

Log monitoring

| Indicator | Meaning |
| --- | --- |
| Unusual PIL/OpenCV error messages | Possible ASLR bypass attempts |
| Unexpected child processes from vLLM | Post-exploitation activity |
| Outbound connections from inference servers | C2 or exfiltration |
| High memory usage during media processing | Overflow exploitation |
| Crashes during video processing | Exploitation attempts |
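Several of these indicators can be surfaced with a simple log scan. The sketch below flags lines whose PIL/OpenCV error text contains pointer-sized hex values (possible Stage 1 probing) and lines with decoder crash signatures; the patterns and log format are assumptions to adapt to your own logging setup, and matches are leads, not proof of compromise.

```python
import re
import sys

# Heuristic patterns only; tune them to your own log format.
POINTER_IN_MEDIA_ERROR = re.compile(
    r"(PIL|Pillow|cv2|OpenCV).{0,200}0x[0-9a-fA-F]{8,16}")
DECODER_CRASH = re.compile(
    r"SIGSEGV|segmentation fault|double free|heap corruption", re.IGNORECASE)


def scan(log_path: str) -> None:
    """Print log lines that match possible exploitation indicators."""
    with open(log_path, errors="replace") as handle:
        for lineno, line in enumerate(handle, start=1):
            if POINTER_IN_MEDIA_ERROR.search(line):
                print(f"{log_path}:{lineno}: possible address-leak probe: {line.strip()}")
            elif DECODER_CRASH.search(line):
                print(f"{log_path}:{lineno}: decoder crash signature: {line.strip()}")


if __name__ == "__main__":
    for path in sys.argv[1:]:
        scan(path)
```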

Network indicators

| Pattern | Concern |
| --- | --- |
| Video URLs to unknown hosts | Malicious payload delivery |
| Large outbound transfers | Data exfiltration |
| Connections to crypto pools | Mining malware |

AI infrastructure security context

This vulnerability highlights the expanding attack surface of AI infrastructure:

| Component | Risk source |
| --- | --- |
| Model serving | Core application vulnerabilities |
| Media processing | PIL, OpenCV, FFmpeg dependencies |
| Model loading | Pickle deserialization, config parsing |
| API layer | Authentication, input validation |
| Dependencies | Transitive vulnerability inheritance |

Attack surface comparison

| Traditional web app | AI inference server |
| --- | --- |
| HTTP parsing | HTTP parsing + model inference |
| Database queries | Model loading, GPU operations |
| File uploads | Media processing (images, video, audio) |
| Template rendering | Output generation |

AI systems inherit vulnerabilities from the entire media processing stack in addition to traditional web application risks.

Recommendations

For AI infrastructure operators

| Priority | Action |
| --- | --- |
| Critical | Upgrade vLLM to 0.14.1 |
| Critical | Audit all internet-exposed AI endpoints |
| High | Implement API authentication |
| High | Segment AI infrastructure networks |
| High | Monitor for anomalous behavior |
| Ongoing | Maintain a dependency update schedule |

For security teams

| Priority | Action |
| --- | --- |
| High | Inventory all vLLM deployments |
| High | Add vLLM to vulnerability scanning |
| High | Review AI infrastructure network access |
| Ongoing | Track AI security advisories |

For developers

| Priority | Action |
| --- | --- |
| High | Pin dependency versions |
| High | Implement input validation for media |
| Medium | Consider disabling multimodal input if it is not needed |
| Ongoing | Security testing for AI pipelines |
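For the media input validation item above, one pragmatic control is to validate caller-supplied media URLs before anything is fetched: restrict the scheme, restrict hosts to an allowlist, and reject unexpected file types. The sketch below is a minimal example of that idea; the allowlist values are placeholders, not recommendations.

```python
from urllib.parse import urlparse

# Placeholder policy values; tighten these for your own environment.
ALLOWED_SCHEMES = {"https"}
ALLOWED_HOSTS = {"media.example.internal"}
ALLOWED_SUFFIXES = (".mp4", ".webm")


def validate_media_url(url: str) -> str:
    """Reject media URLs that fall outside an explicit allowlist."""
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"disallowed scheme: {parsed.scheme!r}")
    if parsed.hostname not in ALLOWED_HOSTS:
        raise ValueError(f"disallowed host: {parsed.hostname!r}")
    if not parsed.path.lower().endswith(ALLOWED_SUFFIXES):
        raise ValueError(f"disallowed file type: {parsed.path!r}")
    return url
```

Validation of this kind does not remove the underlying decoder bug, but it sharply narrows who can deliver a payload to it.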

Context

CVE-2026-22778 demonstrates that AI security requires the same rigor as traditional application security: minimize exposure, authenticate all access, update dependencies, and assume compromise until proven otherwise.

As organizations deploy multimodal models that process images, video, and audio, they inherit vulnerabilities from PIL, OpenCV, FFmpeg, and their dependencies. The attack surface of AI infrastructure extends far beyond the model itself.

Treat any internet-exposed vLLM instance as potentially compromised until patched.