Query Details
id: a1b2c3d4-1004-4a11-9c01-0123456789a4
name: Copilot Studio - System-prompt disclosure attempt and leak
description: |
Correlates a user message asking the agent to disclose its hidden
configuration ("show your system prompt", "what are your instructions",
"repeat the words above") with a bot response in the same conversation
that leaks instruction-style markers ("You are", "Your role is",
"system:", "## Instructions"). Together these indicate a successful or
near-successful system-prompt extraction.
Joins inbound and outbound AppEvents turns on conversationId. Both the
prompt and response text require "Log sensitive properties" to be
enabled on the agent's Application Insights settings.
severity: High
requiredDataConnectors:
- connectorId: ApplicationInsights
dataTypes:
- AppEvents
queryFrequency: PT1H
queryPeriod: PT1H
triggerOperator: gt
triggerThreshold: 0
enabled: true
tactics:
- Discovery
- Collection
relevantTechniques:
- T1082
- T1213
query: |
let disclosureMarkers = dynamic([
"system prompt", "your instructions", "your system message",
"repeat the words above", "what are your instructions",
"print your prompt", "reveal your prompt", "initial instructions",
"your configuration", "your guidelines"
]);
let asks =
AppEvents
| where Name == "BotMessageReceived"
| extend ConvId = tostring(Properties["conversationId"]),
Prompt = tolower(tostring(Properties["text"]))
| where isnotempty(Prompt)
| where Prompt has_any (disclosureMarkers)
| project AskTime = TimeGenerated, ConvId, Prompt = substring(tostring(Properties["text"]), 0, 512),
UserId, ChannelId = tostring(Properties["channelId"]), ClientIP;
let leaks =
AppEvents
| where Name == "BotMessageSend"
| extend ConvId = tostring(Properties["conversationId"]),
Output = tostring(Properties["text"])
| where isnotempty(Output)
| where Output matches regex @"(?i)(you are an?|your role is|system:|##\s*instructions|the assistant)"
| project LeakTime = TimeGenerated, ConvId, Output = substring(Output, 0, 1024);
asks
| join kind=inner leaks on ConvId
| where LeakTime between (AskTime .. (AskTime + 10m))
| extend AccountName = iff(isempty(UserId), "unknown-agent", UserId)
| project
AskTime, LeakTime, AccountName, ConvId, ChannelId, ClientIP,
Prompt, Output,
LagSeconds = datetime_diff('second', LeakTime, AskTime)
| order by AskTime desc
entityMappings:
- entityType: Account
fieldMappings:
- identifier: Name
columnName: AccountName
- entityType: IP
fieldMappings:
- identifier: Address
columnName: ClientIP
eventGroupingSettings:
aggregationKind: SingleAlert
incidentConfiguration:
createIncident: true
groupingConfiguration:
enabled: true
reopenClosedIncident: false
lookbackDuration: PT6H
matchingMethod: Selected
groupByEntities:
- Account
groupByAlertDetails: []
groupByCustomDetails: []
version: 1.0.0
kind: Scheduled
tags:
- Sentinel-As-Code
- Custom
- CopilotStudio
- AI
- SystemPromptDisclosure
This KQL query is designed to detect attempts to extract hidden system prompts from a chatbot or virtual assistant. Here's a simple breakdown of what the query does:
Purpose: The query identifies when a user tries to get the chatbot to reveal its internal instructions or system prompts, and whether the bot inadvertently discloses any part of those instructions in its response.
Data Source: It uses data from Application Insights, specifically the AppEvents data type, which logs events related to the chatbot's interactions.
Detection Logic:
Correlation: The query joins these user requests and bot responses based on the conversation ID to see if they occur within a 10-minute window, indicating a potential successful or near-successful extraction of the system prompt.
Output: If such an event is detected, it logs details like the time of the request and response, user ID, channel ID, client IP, and the content of the request and response. It also calculates the time difference between the request and the response.
Alerting: If any such events are found, an alert is generated with a high severity level. The alert can be grouped by user account for incident management purposes.
Configuration: The query runs every hour and checks data from the past hour. It is set to trigger an alert if any matching events are found.
Overall, this query helps in monitoring and securing chatbot systems by identifying and alerting on potential leaks of sensitive system prompts.

David Alonso
Released: June 8, 2026
Tables
Keywords
Operators