Copilot Studio - Indirect prompt-injection markers in agent response

Agent Indirect Injection In Response

Query

let lookback = 7d;
let injectionMarkers = dynamic([
    "ignore previous instructions", "ignore all previous", "disregard previous",
    "you are now", "act as", "developer mode", "do anything now", "dan mode",
    "system prompt", "your instructions", "reveal your prompt", "bypass your rules",
    "without any restrictions", "from now on you", "new instructions:"
]);
let inbound =
    AppEvents
    | where TimeGenerated > ago(lookback)
    | where Name == "BotMessageReceived"
    | extend ConvId = tostring(Properties["conversationId"]),
             Text = tolower(tostring(Properties["text"]))
    | summarize UserMarkerHits = countif(Text has_any (injectionMarkers)),
                UserMsgs = count() by ConvId;
let outbound =
    AppEvents
    | where TimeGenerated > ago(lookback)
    | where Name == "BotMessageSend"
    | extend ConvId = tostring(Properties["conversationId"]),
             Output = tolower(tostring(Properties["text"]))
    | where isnotempty(Output)
    | where Output has_any (injectionMarkers)
    | summarize BotMarkerHits = count(),
                Samples = make_set(substring(tostring(Properties["text"]), 0, 240), 5),
                FirstSeen = min(TimeGenerated), LastSeen = max(TimeGenerated),
                UserId = take_any(UserId),
                ChannelId = take_any(tostring(Properties["channelId"])),
                ClientIP = take_any(ClientIP)
      by ConvId;
outbound
| join kind=leftouter inbound on ConvId
| where coalesce(UserMarkerHits, 0) == 0
| extend AccountName = iff(isempty(UserId), "unknown-agent", UserId)
| project FirstSeen, LastSeen, AccountName, ConvId, ChannelId, ClientIP,
          BotMarkerHits, UserMsgs, Samples
| order by BotMarkerHits desc

Explanation

This query is designed to detect potential indirect prompt injection attacks in conversations with a bot. Here's a simplified breakdown:

Purpose: The query identifies cases where certain suspicious phrases (like "ignore previous instructions" or "do anything now") appear in the bot's responses but were not present in the user's messages. This suggests that the bot might have picked up these phrases from external content it accessed, such as documents or web pages, rather than directly from the user.
Data Source: It analyzes application events from the bot's logs, specifically looking at messages received by the bot (inbound) and messages sent by the bot (outbound).
Process:
- It looks back over the last 7 days of data.
- It checks for specific phrases (injection markers) in both inbound and outbound messages.
- It counts how many times these markers appear in the bot's responses and checks if they were absent in the user's messages.
- If the markers are found in the bot's output but not in the user's input, it suggests a potential indirect prompt injection.
Output: The query provides details such as when the suspicious activity was first and last seen, the account involved, conversation ID, channel ID, client IP, and samples of the bot's responses. It orders the results by the number of suspicious markers found in the bot's responses.
Use Case: This is useful for identifying and investigating potential security issues where a bot might be manipulated through indirect means, helping to ensure the integrity of bot interactions.
Tags and Techniques: The query is associated with tactics like Initial Access and Execution, and techniques such as Phishing (T1566) and Command and Scripting Interpreter (T1059). It is tagged for use with Sentinel, AI, and custom security monitoring scenarios.

Details

David Alonso

Released: June 8, 2026

Tables

AppEvents

Keywords

AppEvents

Operators

letdynamictolowertostringagocountifsummarizehas_anyisnotemptymake_setsubstringminmaxtake_anyjoinkind=leftoutercoalesceiffisemptyprojectorder by

Tactics

InitialAccessExecution

MITRE Techniques

T1566 T1059

Actions

GitHub

KQL Search

Copilot Studio - Indirect prompt-injection markers in agent response

Query

Explanation

Details

Actions