Query Details

Agent Prompt Injection Signals

Query

id: b2c3d4e5-2005-4b22-9d01-0123456789c5
name: Copilot Studio - Prompt-injection signals in user messages
description: |
  Broad sweep of inbound user messages for prompt-injection / instruction-
  override / tool-coercion phrasing, ranked by how many distinct markers a
  conversation triggers. Companion to the analytic rule, surfacing weaker
  single-marker hits for tuning and proactive review. Requires "Log
  sensitive properties".
query: |
  let lookback = 1d;
  let injectionMarkers = dynamic([
      "ignore previous instructions", "ignore all previous", "ignore your instructions",
      "disregard the above", "disregard previous", "you are now", "act as",
      "developer mode", "do anything now", "dan mode", "jailbreak",
      "reveal your system prompt", "show your system prompt", "print your instructions",
      "repeat the words above", "what are your instructions", "bypass your rules",
      "without any restrictions", "pretend you are", "from now on you",
      "new instructions:", "system:", "override", "you must now"
  ]);
  AppEvents
  | where TimeGenerated > ago(lookback)
  | where Name == "BotMessageReceived"
  | extend
      ConvId    = tostring(Properties["conversationId"]),
      ChannelId = tostring(Properties["channelId"]),
      Text      = tolower(tostring(Properties["text"]))
  | where isnotempty(Text)
  | mv-apply Marker = injectionMarkers to typeof(string) on (
        where Text contains Marker
        | summarize Markers = make_set(Marker)
    )
  | extend MarkerCount = array_length(Markers)
  | project TimeGenerated, UserId, ConvId, ChannelId, MarkerCount, Markers,
            Text = substring(tostring(Properties["text"]), 0, 1024), ClientIP
  | order by MarkerCount desc, TimeGenerated desc
tactics:
  - InitialAccess
  - DefenseEvasion
techniques:
  - T1566
  - T1562
tags:
  - Sentinel-As-Code
  - Custom
  - CopilotStudio
  - AI

Explanation

This query is designed to monitor incoming user messages for signs of prompt-injection or attempts to override instructions, which could indicate potential security risks or misuse. Here's a simple breakdown:

  1. Time Frame: It looks at messages from the past day (lookback = 1d).

  2. Markers: It checks for specific phrases (like "ignore previous instructions" or "jailbreak") that might suggest an attempt to manipulate or bypass the system's intended behavior.

  3. Data Source: It examines events where a bot received a message (Name == "BotMessageReceived").

  4. Processing:

    • Extracts conversation and channel IDs, and converts the message text to lowercase for uniformity.
    • Filters out messages that are empty.
    • Checks each message for the presence of any of the specified markers.
    • Counts how many different markers are found in each message.
  5. Output: It lists the messages, showing when they were received, user and conversation details, the number of markers found, the specific markers, a snippet of the message text, and the client's IP address.

  6. Sorting: Results are sorted by the number of markers found (highest first), then by the time they were generated (most recent first).

  7. Purpose: This helps identify conversations that might need further review for potential security threats or misuse, aiding in tuning and proactive monitoring.

  8. Security Context: The query is associated with tactics like Initial Access and Defense Evasion, and techniques such as Phishing (T1566) and Impair Defenses (T1562). It is tagged for use with Sentinel-As-Code, custom monitoring, and AI-related activities.

Details

David Alonso profile picture

David Alonso

Released: June 8, 2026

Tables

AppEvents

Keywords

AppEvents

Operators

letdynamicwhereagoextendtostringtolowerisnotemptymv-applycontainssummarizemake_setarray_lengthprojectsubstringorder by

Actions