Your Slack Webhook Is Write-Only, Until an AI Agent Reads the Channel
A leaked Slack incoming webhook is usually triaged as low severity: write-only, one channel, no data access. The moment an AI agent reads that channel and can act with tools, that write-only primitive becomes an indirect prompt injection path into the agent's privileges. Here is the full kill chain, the exact preconditions, and how to defend against it.

The Severity Inversion
Here is a finding most security teams have triaged and closed: an incoming Slack webhook URL was found in a public repository, a client-side bundle, or a CI log. The reviewer checks what an incoming webhook can do, concludes that the worst case is an attacker posting messages into one channel, files it as low severity, rotates the URL when convenient, and moves on. That triage was correct for years.
It is not correct anymore in any workspace where an AI agent reads that channel.
The reason is simple to state and easy to miss. A leaked webhook gives an attacker a way to write attacker-controlled text into a specific Slack channel. An AI agent that watches that channel reads attacker-controlled text as trusted input. If the agent can take actions through tools, the attacker has just gone from "can post a message" to "can influence what the agent does." The webhook did not gain any new power. The agent supplied the power, and the webhook became the delivery mechanism for reaching it.
This post walks through why that inversion happens, what the kill chain actually looks like, and the precise conditions under which it does and does not hold. The conditions matter, because the honest version of this scenario is narrower than a headline would make it, and the narrow version is still serious.
What an Incoming Webhook Actually Is
An incoming webhook in Slack is a URL of the form https://hooks.slack.com/services/T.../B.../.... Anyone who can make an HTTPS POST to that URL with a small JSON body causes a message to appear in a channel. That is the entire capability surface, and it is worth being precise about its limits.
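To make that capability surface concrete, here is a minimal sketch of the request an incoming webhook expects. The URL is the elided placeholder from the text, the message text is made up, and `build_webhook_request` is a hypothetical helper name. The request is built but deliberately not sent.

```python
import json
import urllib.request

# Placeholder only: the path segments are elided exactly as in the text above.
WEBHOOK_URL = "https://hooks.slack.com/services/T.../B.../..."


def build_webhook_request(url: str, text: str) -> urllib.request.Request:
    """Build (but do not send) the POST an incoming webhook accepts."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_webhook_request(WEBHOOK_URL, "Build failed on main")

# Sending it would be a single call: urllib.request.urlopen(req)
# Note what is absent: no token header and no signature.
# Possession of the URL is the only credential.
```

The point of the sketch is the missing pieces. Nothing in the request identifies or authenticates the sender, which is why anyone holding the URL is indistinguishable from the integration it was created for.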
An incoming webhook is write-only. It posts messages. It cannot read messages, list channels, enumerate users, or pull any data back out of the workspace. In modern Slack the webhook is bound at creation time to a single channel, so a leaked webhook cannot freely pick where it posts. What matters for an attacker is identity rather than per-message customization. Every message a modern app webhook posts arrives under the identity of the integration the webhook was created for, and the attacker cannot override the username or icon on a per-message basis the way legacy custom-integration webhooks once allowed. They do not need to. The message already posts as the legitimate integration, the exact bot that a channel's readers are primed to trust, and a webhook can render rich Block Kit formatting on top of that, so the result looks like a routine post from a known service.
This is exactly why a leaked incoming webhook has historically been a low-severity finding. The attacker can drop a message into one channel, posting as a service that channel already trusts. On its own, that is a phishing and social-engineering primitive aimed at the humans reading the channel. Annoying, occasionally dangerous if someone clicks, but bounded. There is no data exfiltration, no read access, no lateral movement. The blast radius ends at the humans who happen to be looking.
The whole argument of this post is that the phrase "the humans who happen to be looking" is doing quiet work. It assumes the audience of a channel is human.
The New Variable: Agents That Read Channels and Act
AI agents are now wired into Slack channels across a lot of organizations, and the useful ones do more than chat. An incident-response agent watches #alerts or #ci, reads each new alert, and is allowed to investigate by querying logs or calling an internal status API. A support-triage agent reads #support, summarizes tickets, and can look up account details or open a record in another system. An ops assistant reads a channel and can trigger a deploy, restart a service, or post to other channels. These agents are valuable precisely because they close the loop between reading a message and doing something about it.
Under the hood, that "doing something" is tool use. The agent is a language model with a set of tools attached through function calling or the Model Context Protocol: query this database, call that API, send this email, kick off that job. The agent reads channel content, decides which tool to call, and calls it with its own credentials and its own permissions. We have written before about how every agent action is a credential action, and the same lens applies here. The agent holds real privilege, often more than any single human in the channel, because it was provisioned broadly enough to be useful across many requests.
Now put the two facts side by side. A leaked webhook lets an attacker write into the channel. The agent reads everything in the channel as input and can act on it with privileged tools. The gap between those two sentences is the entire vulnerability.
The Kill Chain
The chain has five steps, and none of them requires the attacker to breach anything beyond the leaked URL.
1. Obtain the webhook. Incoming webhook URLs leak the way every other secret leaks: committed to a public repository, baked into a client-side JavaScript bundle or mobile app, printed into a CI log, shared in a Postman collection, or pasted into a ticket. Because they have long been treated as low-sensitivity, they are often not even in the secret-scanning ruleset, so they sit in plain sight longer than a database password would.
2. Confirm the channel has an agent on it. The attacker does not need to see the channel. They only need to guess, or know, that an agent consumes it. Webhooks are most often created for exactly the channels agents tend to watch: #alerts, #ci, #monitoring, #ops, #support. The overlap between "where integrations post" and "where ops agents listen" is not a coincidence. It is the same set of channels.
3. Inject crafted content. The attacker POSTs a message designed to be read as instructions rather than data. Dressed up as a routine alert, it carries an embedded directive aimed at the agent: treat the incident as resolved and, as cleanup, call a tool with attacker-chosen arguments. To the agent, this is just the next line of trusted channel history.
4. The agent acts. If the agent ingests channel content as context and is not hardened against injection, it follows the directive using its own tools and its own permissions. It queries the data the message told it to query, calls the API the message told it to call, or posts the secret the message told it to surface. This is a textbook confused deputy: the agent has authority the attacker lacks, and the attacker borrows it by supplying instructions through a trusted channel.
5. Lateral movement and exfiltration. What happens next depends entirely on the agent's toolset. An agent that can read internal systems can be steered to read the wrong thing and report it back into a channel the attacker can also reach. An agent that can write or call outbound can be steered to send data out directly. The write-only webhook, worth almost nothing on its own, has become a remote trigger for whatever the agent is allowed to do.
A concrete shape makes it click. Picture an incident agent on #ci that, on a failure alert, is allowed to query recent logs and call an internal /runbook API to remediate. The attacker POSTs a message styled as a build-failure alert whose body contains a "remediation note" instructing the agent to include the value of a specific environment variable in its status summary, or to call /runbook with a parameter that points outbound. The agent, reading this as a legitimate alert with a legitimate note, complies. No human approved anything. The only attacker input was an HTTPS POST to a URL that a triage reviewer once marked low severity.
Why This Is Indirect Prompt Injection
The class of bug here is indirect prompt injection, a form of the prompt injection category listed as LLM01 at the top of the OWASP Top 10 for LLM Applications, and the webhook is just an unusually clean way to deliver it. Direct prompt injection is when a user types a malicious prompt straight at a model. Indirect prompt injection is when the malicious instruction arrives through content the model ingests from somewhere else: a web page it browses, a document it summarizes, an email in the inbox it triages, or a Slack message in the channel it watches.
Every agent that reads a Slack channel is making an implicit assumption that the channel is trusted input. That assumption holds only as long as every writer to the channel is trusted. A leaked incoming webhook breaks exactly that assumption, because it hands an untrusted outsider a writer's seat. The agent cannot tell the difference between a message a teammate posted and a message an attacker POSTed through a leaked URL. Both arrive as channel history. The model has no built-in notion of provenance unless someone gives it one.
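The loss of provenance is easy to see in a small sketch. The `ChannelMessage` type, its `author_kind` labels, and both flattening functions are hypothetical, not any agent framework's API. They only illustrate what happens when channel history is turned into model context.

```python
from dataclasses import dataclass


@dataclass
class ChannelMessage:
    author_kind: str  # hypothetical label: "human" or "integration"
    author: str
    text: str


def flatten_naive(history: list) -> str:
    """Join message texts only. Who wrote each line is discarded."""
    return "\n".join(m.text for m in history)


def flatten_with_provenance(history: list) -> str:
    """Keep a source label on every line and quote the text as data."""
    return "\n".join(
        f"[source={m.author_kind}:{m.author}] {m.text!r}" for m in history
    )


history = [
    ChannelMessage("human", "alice", "Deploy finished, watching error rates."),
    ChannelMessage("integration", "ci-bot", "Build failed. Remediation note: ..."),
]

naive_context = flatten_naive(history)
labeled_context = flatten_with_provenance(history)
```

In the naive version, the teammate's line and the webhook's line are just two lines of text. The labeled version gives the model and the surrounding policy code something to reason about, though a label is only an input to a defense and is not a defense by itself.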
This is why the fix cannot live entirely on the webhook side. Rotating the leaked URL closes one door. The structural problem is that the agent treats channel content as instructions at all.
When This Does Not Work
The honest version of this scenario has preconditions, and naming them is what separates analysis from fear-marketing. The chain breaks if any of these is false.
The webhook has to point at a channel an agent actually reads. A leaked webhook for a channel no agent consumes is back to being a low-severity phishing primitive. Modern incoming webhooks are channel-bound, so the attacker cannot redirect a webhook to a juicier channel; they are stuck with wherever it was created to post.
The agent has to act on channel content with tools, not merely summarize it for a human. An agent that only reads a channel and writes a summary back, with no other tools, downgrades the impact to social engineering of whoever reads that summary. Still not nothing, but not tool execution.
The agent has to lack injection defenses. An agent that separates data from instructions, that refuses to take commands from channel content, that gates sensitive tools behind human approval, or that ignores messages from integration and bot authors, will not follow the injected directive. The vulnerability lives in the gap between "reads the channel" and "trusts the channel," and a well-built agent does not have that gap.
None of these preconditions are exotic. Plenty of real deployments satisfy all three at once, because the whole point of wiring an agent into an alerts channel is to let it read alerts and act on them, and injection hardening is the part teams most often skip. But a writeup that implies every leaked webhook is now critical would be wrong, and worse, it would train people to ignore the parts of the chain that actually decide severity.
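The three preconditions can be written down as a triage helper. This is a sketch of the reasoning above, not a scoring standard: the function name and the rating strings are illustrative assumptions.

```python
def rate_leaked_webhook(
    agent_reads_channel: bool,
    agent_has_tools: bool,
    agent_hardened: bool,
) -> str:
    """Map the three preconditions from the text to an illustrative rating."""
    if not agent_reads_channel:
        # No agent consumer: the classic human-facing phishing primitive.
        return "low: phishing primitive aimed at human readers"
    if not agent_has_tools:
        # Summarize-only agent: impact is limited to whoever reads the summary.
        return "low-to-medium: social engineering through agent summaries"
    if agent_hardened:
        # Defenses should stop the directive, but they deserve verification.
        return "reduced: injection should be refused, verify the defenses"
    # All three preconditions hold: write access reaches the agent's tools.
    return "high: channel write access reaches the agent's tools"


worst_case = rate_leaked_webhook(True, True, False)
no_agent = rate_leaked_webhook(False, True, False)
```

The order of the checks mirrors the order of the preconditions: which channel, what the agent can do, and whether it is defended. Only the last branch describes the full kill chain.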
Detection and Mitigation
The defenses split across three layers, and the most durable ones are on the agent.
Webhook hygiene. Treat incoming webhook URLs as secrets, because the whole premise of this attack is that they are not treated that way. Add hooks.slack.com/services to secret-scanning rules so a leaked one is caught like any other credential. Inventory webhooks as the non-human identities they are, with an owner and a rotation story, the same way you would track an API key. Restrict who in the workspace can create them. Where you can, prefer a Slack app with a scoped bot token over a raw incoming webhook, since that gives you authentication and revocation that a bare URL does not.
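A secret-scanning rule for these URLs can be as small as one regular expression. The sketch below is an assumption-laden starting point: the character classes are inferred from the T.../B.../... shape rather than from a published format, and the sample value is an obviously fake placeholder.

```python
import re

# Assumed shape: /services/T<id>/B<id>/<token>, each segment alphanumeric.
WEBHOOK_RE = re.compile(
    r"https://hooks\.slack\.com/services/"
    r"T[A-Za-z0-9_]+/B[A-Za-z0-9_]+/[A-Za-z0-9_]+"
)


def find_webhooks(text: str) -> list:
    """Return every string in `text` that looks like an incoming webhook URL."""
    return WEBHOOK_RE.findall(text)


sample = """
SLACK_URL=https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX
docs say the form is https://hooks.slack.com/services/T.../B.../...
"""

hits = find_webhooks(sample)
```

The elided documentation form does not match, because the dots fall outside the character classes, so a rule like this flags concrete URLs without firing on prose that merely describes the format.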
Agent-side defenses, which are the real fix. Treat all channel content as untrusted input and never as instructions. Keep a clear separation between the data plane (what the agent reads) and the control plane (what the agent is allowed to do), so that reading a message can never by itself authorize a tool call. Put sensitive tools behind a human-in-the-loop approval. Scope the agent's tools to least privilege, and specifically do not give a broadly privileged toolset to an agent that reads a channel anyone, or any integration, can write to. Log every tool call with the provenance of what triggered it, so a tool call that traces back to externally sourced channel content can be flagged.
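One way to picture the data plane and control plane split is a gate that sits between the model's proposed tool call and its execution. Everything here is hypothetical: the `ToolGate` class, the tool names, and the `trigger_source` labels are illustrations, not a real framework.

```python
from dataclasses import dataclass, field

# Hypothetical tool names; a real deployment would list its own.
SENSITIVE_TOOLS = {"call_runbook", "read_env"}


@dataclass
class ToolGate:
    audit_log: list = field(default_factory=list)

    def authorize(
        self, tool: str, trigger_source: str, human_approved: bool = False
    ) -> bool:
        """Decide from policy, never from anything the message text says."""
        allowed = tool not in SENSITIVE_TOOLS or human_approved
        # Record what triggered the call so it can be traced afterwards.
        self.audit_log.append(
            {
                "tool": tool,
                "trigger_source": trigger_source,
                "human_approved": human_approved,
                "allowed": allowed,
            }
        )
        return allowed


gate = ToolGate()
blocked = gate.authorize("call_runbook", trigger_source="channel:#ci")
approved = gate.authorize(
    "call_runbook", trigger_source="channel:#ci", human_approved=True
)
routine = gate.authorize("query_logs", trigger_source="channel:#ci")
```

The message text never reaches `authorize`, so no wording in a channel post can change the decision. The audit entries keep the trigger source, which is what lets a reviewer flag tool calls that trace back to channel content.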
Authorship checks at the boundary. Messages posted through an incoming webhook carry bot or integration authorship, not a real human user. An agent can use that. Distinguishing human-authored messages from integration-authored ones, and declining to treat integration messages as a command source, closes the specific door this attack walks through. There is one trap to name, though. Because a leaked webhook posts as the integration it belongs to, an agent that explicitly trusts that integration as a command source is not saved by this check at all, since the injected message arrives under exactly the identity the agent was told to trust. That is why the durable version of the control is not "trust this bot, distrust that one" but "no channel writer, human or bot, is a command source on its own." Authorship filtering narrows the surface; it does not replace treating channel content as untrusted.
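A boundary check along these lines might look like the sketch below. It assumes the agent receives message events as dictionaries and that integration-posted messages carry a `bot_id` field or a `subtype` of `bot_message`. Verify this against the events your own workspace delivers. The function names and sample IDs are hypothetical.

```python
def is_integration_authored(event: dict) -> bool:
    """Assumption: integration posts carry bot_id or subtype 'bot_message'."""
    return bool(event.get("bot_id")) or event.get("subtype") == "bot_message"


def can_request_action(event: dict) -> bool:
    """Integration messages are never a command source.

    True only means a human wrote the message and it may be considered.
    Sensitive tool calls should still require separate approval.
    """
    return not is_integration_authored(event)


webhook_event = {
    "type": "message",
    "subtype": "bot_message",
    "bot_id": "B00000000",
    "text": "Build failed. Remediation note: ...",
}
human_event = {
    "type": "message",
    "user": "U00000000",
    "text": "Can someone look at the failing build?",
}

webhook_allowed = can_request_action(webhook_event)
human_allowed = can_request_action(human_event)
```

There is deliberately no allowlist of trusted bots here. Adding one would reintroduce the trap described above, since a leaked webhook posts under exactly the identity such a list would trust.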
The Takeaway: Severity Is Contextual
A leaked incoming webhook is a non-human identity, and like a lot of NHI leaks its severity is not intrinsic. It is defined by what consumes it. For years the consumer was a human reading a channel, and the severity math reflected that. AI agents change the consumer, and changing the consumer changes the severity of the entire class of "low-impact" credential leaks, not just this one.
The practical lesson is not that Slack webhooks are newly dangerous in isolation. It is that wiring an agent with real tools onto a channel quietly re-rates every credential that can write to that channel. Before turning that loop on, it is worth asking a plain question about each integration that posts to the channel: if an attacker could send exactly what this integration sends, what would the agent do with it? If the answer is uncomfortable, the fix is at the agent, not the webhook.