Microsoft 365 Copilot - Prompt injection patterns in AI agent prompts

Copilot Prompt Injection Patterns

Query

let injectionPhrases = dynamic([
    "ignore previous instructions",
    "ignore prior instructions",
    "disregard the above",
    "you are now",
    "act as system",
    "system prompt:",
    "developer mode",
    "bypass safety",
    "reveal your prompt",
    "print your instructions",
    "exfiltrate",
    "send to attacker"
]);
let toolCoercion = dynamic([
    "regardless of restrictions",
    "without confirming",
    "skip approval",
    "use admin privileges",
    "elevate to"
]);
CopilotActivity
| where TimeGenerated > ago(7d)
| extend
    Prompt = tostring(LLMEventData.Prompt),
    ToolInput = tostring(LLMEventData.ToolInput),
    ToolName = tostring(LLMEventData.ToolName),
    ConversationId = tostring(LLMEventData.ConversationId)
| extend LowerPrompt = tolower(strcat(Prompt, " ", ToolInput))
| where LowerPrompt has_any (injectionPhrases)
    or LowerPrompt has_any (toolCoercion)
    or LowerPrompt matches regex @"data:[a-z/+.\-]+;base64,[A-Za-z0-9+/=]{200,}"
| extend MatchedInjection = set_intersect(split(LowerPrompt, " "), injectionPhrases)
| project
    TimeGenerated, AgentId, AgentName, ActorName, ActorUserId,
    ToolName, ConversationId, Prompt, ToolInput, SrcIpAddr,
    TenantId
| order by TimeGenerated desc

Explanation

This query is designed to detect potential security threats in Microsoft 365 Copilot by identifying prompt injection patterns. It searches through recent Copilot activities (from the past 7 days) for specific phrases or patterns that suggest someone might be trying to manipulate the AI agent. These include instructions to ignore previous guidance, override roles, or use coercive language to bypass restrictions. It also looks for encoded data that could be used to hide malicious content.

The query checks for:

Known phrases that indicate prompt injection attempts.
Attempts to coerce tools into performing unauthorized actions.
Base64 or data-URI encoded content, which might be used to hide malicious data.

If any of these patterns are found, the query collects relevant information such as the time of the event, agent and actor details, tool names, and IP addresses. This information is then sorted by the time the event occurred, with the most recent events listed first.

The query is part of a broader security strategy, linked with tactics like Initial Access and Execution, and techniques such as Command and Scripting Interpreter (T1059) and User Execution (T1204). It is tagged for use in Sentinel-As-Code, custom security rules, and AI-related monitoring.

Details

David Alonso

Released: May 20, 2026

Tables

CopilotActivity

Keywords

MicrosoftCopilotPromptsToolInputsUserContentAgentSystemPromptSensitiveDataExposureInjectionExfiltrationActivityTimeGeneratedInputNameConversationIdLowerMatchedActorSrcIpAddrTenantInitialAccessExecutionSentinelAsCodeCustomAI

Operators

letdynamictostringtolowerstrcathas_anymatches regexset_intersectsplitprojectorder bydescwhereextendago

Tactics

InitialAccessExecution

MITRE Techniques

T1059 T1204

Actions

GitHub

KQL Search