Query Details

Agent Indirect Injection In Response

Query

id: b2c3d4e5-2011-4b22-9d01-0123456789d1
name: Copilot Studio - Indirect prompt-injection markers in agent response
description: |
  Surfaces likely indirect (data / workflow) prompt injection: a
  conversation where instruction-override markers ("ignore previous
  instructions", "you are now", "system prompt", "do anything now") appear
  in the agent's OUTPUT (BotMessageSend) while the user's inbound messages
  in that same conversation contain none. When the markers were never
  typed by the user, the most likely source is content the agent ingested
  - a SharePoint document, email, web page, or connector payload - i.e.
  the poisoned-data variant of prompt injection.

  Reads AppEvents and requires "Log sensitive properties" on the agent's
  Application Insights settings so Properties.text is populated for both
  inbound and outbound turns. Review the Samples column to confirm whether
  the marker is genuinely adversarial or benign echoed content.
query: |
  let lookback = 7d;
  let injectionMarkers = dynamic([
      "ignore previous instructions", "ignore all previous", "disregard previous",
      "you are now", "act as", "developer mode", "do anything now", "dan mode",
      "system prompt", "your instructions", "reveal your prompt", "bypass your rules",
      "without any restrictions", "from now on you", "new instructions:"
  ]);
  let inbound =
      AppEvents
      | where TimeGenerated > ago(lookback)
      | where Name == "BotMessageReceived"
      | extend ConvId = tostring(Properties["conversationId"]),
               Text = tolower(tostring(Properties["text"]))
      | summarize UserMarkerHits = countif(Text has_any (injectionMarkers)),
                  UserMsgs = count() by ConvId;
  let outbound =
      AppEvents
      | where TimeGenerated > ago(lookback)
      | where Name == "BotMessageSend"
      | extend ConvId = tostring(Properties["conversationId"]),
               Output = tolower(tostring(Properties["text"]))
      | where isnotempty(Output)
      | where Output has_any (injectionMarkers)
      | summarize BotMarkerHits = count(),
                  Samples = make_set(substring(tostring(Properties["text"]), 0, 240), 5),
                  FirstSeen = min(TimeGenerated), LastSeen = max(TimeGenerated),
                  UserId = take_any(UserId),
                  ChannelId = take_any(tostring(Properties["channelId"])),
                  ClientIP = take_any(ClientIP)
        by ConvId;
  outbound
  | join kind=leftouter inbound on ConvId
  | where coalesce(UserMarkerHits, 0) == 0
  | extend AccountName = iff(isempty(UserId), "unknown-agent", UserId)
  | project FirstSeen, LastSeen, AccountName, ConvId, ChannelId, ClientIP,
            BotMarkerHits, UserMsgs, Samples
  | order by BotMarkerHits desc
tactics:
  - InitialAccess
  - Execution
techniques:
  - T1566
  - T1059
tags:
  - Sentinel-As-Code
  - Custom
  - CopilotStudio
  - AI
  - IndirectInjection

Explanation

This query is designed to detect potential indirect prompt injection attacks in conversations with a bot. Here's a simplified breakdown:

  1. Purpose: The query identifies cases where certain suspicious phrases (like "ignore previous instructions" or "do anything now") appear in the bot's responses but were not present in the user's messages. This suggests that the bot might have picked up these phrases from external content it accessed, such as documents or web pages, rather than directly from the user.

  2. Data Source: It analyzes application events from the bot's logs, specifically looking at messages received by the bot (inbound) and messages sent by the bot (outbound).

  3. Process:

    • It looks back over the last 7 days of data.
    • It checks for specific phrases (injection markers) in both inbound and outbound messages.
    • It counts how many times these markers appear in the bot's responses and checks if they were absent in the user's messages.
    • If the markers are found in the bot's output but not in the user's input, it suggests a potential indirect prompt injection.
  4. Output: The query provides details such as when the suspicious activity was first and last seen, the account involved, conversation ID, channel ID, client IP, and samples of the bot's responses. It orders the results by the number of suspicious markers found in the bot's responses.

  5. Use Case: This is useful for identifying and investigating potential security issues where a bot might be manipulated through indirect means, helping to ensure the integrity of bot interactions.

  6. Tags and Techniques: The query is associated with tactics like Initial Access and Execution, and techniques such as Phishing (T1566) and Command and Scripting Interpreter (T1059). It is tagged for use with Sentinel, AI, and custom security monitoring scenarios.

Details

David Alonso profile picture

David Alonso

Released: June 8, 2026

Tables

AppEvents

Keywords

AppEvents

Operators

letdynamictolowertostringagocountifsummarizehas_anyisnotemptymake_setsubstringminmaxtake_anyjoinkind=leftoutercoalesceiffisemptyprojectorder by

Actions