Query Details
id: 1a02932a-90d3-a1e4-f3d5-628293a2b2c7
name: Microsoft 365 Copilot - Multi-turn jailbreak escalation hunting
description: |
  Hunts for Microsoft 365 Copilot conversations where a single user gradually
  escalates prompts against an agent: starting with benign
  questions, then adding role-play / persona language, then asking
  for policy-bypass output. Multi-turn jailbreaks evade single-turn
  detectors and are described in the OWASP Top 10 for Agents 2026
  multi-turn-attack category.
  Surfaces threads with two or more jailbreak-flagged messages over
  a one-day window so an analyst can review the full transcript.
query: |
  // Confirmed schema: per-message JailbreakDetected aggregated by ThreadId.
  // A thread with two or more flagged messages is a multi-turn jailbreak
  // attempt (single-turn jailbreaks are already covered by the analytic).
  let window = 1d;
  CopilotActivity
  | where TimeGenerated > ago(window)
  | where RecordType == "CopilotInteraction"
  | extend ThreadId = tostring(LLMEventData.ThreadId)
  | mv-expand m = LLMEventData.Messages
  | extend
      MessageId = tostring(m.Id),
      IsPrompt = tobool(m.isPrompt),
      JbDetected = tobool(m.JailbreakDetected)
  | summarize
      Messages = count(),
      Prompts = countif(IsPrompt),
      JailbreakHits = countif(JbDetected),
      PromptJailbreakHits = countif(JbDetected and IsPrompt),
      Agents = make_set(AgentName, 4),
      Actors = make_set(ActorName, 4),
      JbMessageIds = make_set_if(MessageId, JbDetected, 32),
      FirstSeen = min(TimeGenerated),
      LastSeen = max(TimeGenerated)
      by ThreadId, TenantId
  | where JailbreakHits >= 2
  | extend EscalationRatio = todouble(JailbreakHits) / todouble(Messages)
  | order by JailbreakHits desc, EscalationRatio desc, LastSeen desc
tactics:
- DefenseEvasion
- InitialAccess
techniques:
- T1562
- T1059
tags:
- Sentinel-As-Code
- Custom
- Copilot
- AI
This query is designed to identify and analyze conversations involving Microsoft 365 Copilot where a user gradually escalates their prompts in an attempt to bypass security measures. Here's a simplified breakdown of what the query does:
Purpose: The query hunts for "multi-turn jailbreaks": conversations in which a user starts with harmless questions, adds role-play or persona language, and eventually asks for policy-bypass output. It does not parse the prompt text itself; it relies on the per-message JailbreakDetected flag and counts how many flagged messages occur in the same thread.
Time Frame: It examines interactions within a one-day window.
Data Source: It looks at records from the CopilotActivity table, specifically those labeled as "CopilotInteraction."
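Process: Each interaction's message array (LLMEventData.Messages) is expanded into one row per message. Every message is marked as a prompt or a response (isPrompt) and as flagged or not flagged (JailbreakDetected), and the rows are then aggregated by conversation thread (ThreadId) and tenant (TenantId).
Criteria for Detection: A thread is returned only if two or more of its messages were flagged as jailbreak attempts (JailbreakHits >= 2).
Output: For each thread, the query returns the message and prompt counts, the jailbreak hits (overall and prompt-only), up to four agents and four actors, up to 32 flagged message IDs, the first-seen and last-seen timestamps, and an EscalationRatio (flagged messages divided by total messages). Results are sorted by JailbreakHits, then EscalationRatio, then LastSeen, all descending.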
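The aggregation and threshold logic can be tried on synthetic data before running against a real workspace. The following is a minimal sketch, not part of the published query: the SampleCopilotActivity table, its rows, and all tenant, agent, and actor values are hypothetical, and the time filter is omitted because the sample timestamps are fixed.

```kql
// Hypothetical sample rows; column and field names mirror those the hunting query reads.
let SampleCopilotActivity = datatable(
    TimeGenerated: datetime, TenantId: string, RecordType: string,
    AgentName: string, ActorName: string, LLMEventData: dynamic)
[
    // thread-1, turn 1: benign prompt and response
    datetime(2026-05-20 09:00:00), "tenant-a", "CopilotInteraction", "HelpdeskAgent", "user1@contoso.example",
        dynamic({"ThreadId": "thread-1", "Messages": [
            {"Id": "m1", "isPrompt": true, "JailbreakDetected": false},
            {"Id": "m2", "isPrompt": false, "JailbreakDetected": false}]}),
    // thread-1, turn 2: first flagged prompt
    datetime(2026-05-20 09:05:00), "tenant-a", "CopilotInteraction", "HelpdeskAgent", "user1@contoso.example",
        dynamic({"ThreadId": "thread-1", "Messages": [
            {"Id": "m3", "isPrompt": true, "JailbreakDetected": true},
            {"Id": "m4", "isPrompt": false, "JailbreakDetected": false}]}),
    // thread-1, turn 3: second flagged prompt
    datetime(2026-05-20 09:10:00), "tenant-a", "CopilotInteraction", "HelpdeskAgent", "user1@contoso.example",
        dynamic({"ThreadId": "thread-1", "Messages": [
            {"Id": "m5", "isPrompt": true, "JailbreakDetected": true},
            {"Id": "m6", "isPrompt": false, "JailbreakDetected": false}]}),
    // thread-2: a single flagged prompt (single-turn, below the threshold)
    datetime(2026-05-20 10:00:00), "tenant-a", "CopilotInteraction", "HelpdeskAgent", "user2@contoso.example",
        dynamic({"ThreadId": "thread-2", "Messages": [
            {"Id": "m7", "isPrompt": true, "JailbreakDetected": true},
            {"Id": "m8", "isPrompt": false, "JailbreakDetected": false}]})
];
SampleCopilotActivity
| where RecordType == "CopilotInteraction"
| extend ThreadId = tostring(LLMEventData.ThreadId)
| mv-expand m = LLMEventData.Messages          // one row per message
| extend
    MessageId = tostring(m.Id),
    JbDetected = tobool(m.JailbreakDetected)
| summarize
    Messages = count(),
    JailbreakHits = countif(JbDetected),
    JbMessageIds = make_set_if(MessageId, JbDetected, 32)
    by ThreadId, TenantId
| where JailbreakHits >= 2                     // same threshold as the hunting query
| extend EscalationRatio = todouble(JailbreakHits) / todouble(Messages)
```

With these rows, only thread-1 should pass the threshold (two flagged messages out of six), while thread-2 is dropped because it contains a single flagged message.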
Objective: The goal is to surface these potentially malicious conversations for further review by an analyst, allowing them to examine the full transcript of the interaction.
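Once a thread has been surfaced, an analyst may want to list its messages in order. The following is a hedged sketch of such a follow-up query. It uses only the columns and fields that appear in the hunting query, and the targetThread value is a placeholder to be replaced with a ThreadId from the hunt results. Whether message text is available in this table is not stated in the source, so the sketch returns only message metadata.

```kql
// Placeholder: paste a ThreadId returned by the hunting query.
let targetThread = "<thread-id-from-hunt-results>";
CopilotActivity
| where TimeGenerated > ago(1d)                // match the hunt's one-day window
| where RecordType == "CopilotInteraction"
| where tostring(LLMEventData.ThreadId) == targetThread
| mv-expand m = LLMEventData.Messages          // one row per message
| project
    TimeGenerated,
    ActorName,
    AgentName,
    MessageId = tostring(m.Id),
    IsPrompt = tobool(m.isPrompt),
    JbDetected = tobool(m.JailbreakDetected)
| order by TimeGenerated asc
```

Sorting by TimeGenerated ascending shows where in the conversation the flagged messages first appear.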
Security Context: The query is mapped to the MITRE ATT&CK tactics Defense Evasion and Initial Access, and to the techniques T1562 (Impair Defenses) and T1059 (Command and Scripting Interpreter).
Tags: The query is tagged for use with Sentinel-As-Code, Custom solutions, Copilot, and AI-related activities.

David Alonso
Released: May 20, 2026