Foundry - Bulk data-exfiltration intent in agent input

Foundry Bulk Data Exfiltration Intent

Query

let exfilMarkers = dynamic([
    "show all", "list all", "list every", "export all", "export the entire",
    "give me all", "give me every", "give me the full list", "full list of",
    "all records", "all customers", "all users", "all bookings", "all orders",
    "all employees", "every record", "every customer", "every user",
    "dump the", "dump all", "entire database", "entire table",
    "complete list of", "everything you have on", "all the data",
    "without any filter", "no limit", "select *"
]);
AppDependencies
| where isnotempty(Properties["gen_ai.input.messages"])
| extend
    Agent     = tostring(Properties["gen_ai.agent.name"]),
    Model     = tostring(Properties["gen_ai.request.model"]),
    ConvId    = tostring(Properties["gen_ai.conversation.id"]),
    ProjectId = tostring(Properties["microsoft.foundry.project.id"]),
    Prompt    = tostring(Properties["gen_ai.input.messages"]),
    SrcIp     = tostring(column_ifexists("ClientIP", ""))
| extend Text = tolower(Prompt)
| where isnotempty(Text)
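| mv-apply Marker = exfilMarkers to typeof(string) on (
      where Text contains Marker
      | summarize Markers = make_set(Marker)
  )
// summarize without a by-clause can emit a row with an empty set when nothing matches; drop those rows
| where array_length(Markers) > 0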
| extend AccountName = iff(isempty(Agent), "unknown-agent", Agent)
| project
    TimeGenerated, AccountName, Agent, Model, ProjectId, ConvId,
    Markers, Prompt = substring(Prompt, 0, 1024), SrcIp
| order by TimeGenerated desc

Explanation

This query is designed to detect potential data exfiltration attempts in a system using AI agents. Here's a simplified breakdown:

Purpose: The query identifies when an AI agent is asked to return large amounts of data, which could indicate an attempt to extract data in bulk. This is considered suspicious because the agent is typically supposed to provide only specific, limited information.
How it Works:
- It looks for specific phrases in the input messages to the AI agent that suggest a request for bulk data, such as "show all records" or "select *".
- These phrases are stored in a list called exfilMarkers.
Data Source: The query examines data from the AppDependencies table, specifically looking at the gen_ai.input.messages property to find these suspicious phrases.
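Before relying on the detection, it can help to confirm that the property is populated at all. The following is a minimal sanity-check sketch, not part of the published rule; the one-hour lookback is an assumption chosen to match the rule's period.

```kql
// Sanity check (assumed 1h lookback): how many dependency spans carry prompt content?
AppDependencies
| where TimeGenerated > ago(1h)
| summarize
    TotalSpans       = count(),
    SpansWithPrompts = countif(isnotempty(Properties["gen_ai.input.messages"]))
```

If SpansWithPrompts stays at zero while TotalSpans is non-zero, the detection has nothing to inspect and will stay silent regardless of what users ask the agent.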
Conditions:
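- The query only returns results when content recording is enabled (controlled by an environment variable on the instrumented application). Without it, gen_ai.input.messages is empty and the first filter discards the row.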
- It checks if any of the suspicious phrases are present in the input messages.
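The matching step can be tried in isolation against a few made-up prompts. This is a sketch under stated assumptions: the three sample prompts and the shortened marker list are hypothetical, and the datatable stands in for real telemetry.

```kql
// Hypothetical sample prompts; not real telemetry.
let exfilMarkers = dynamic(["list all", "all customers", "select *"]);
datatable(Prompt: string) [
    "List all customers in the EU region",
    "What is the status of booking 4711?",
    "Run SELECT * FROM orders"
]
| extend Text = tolower(Prompt)
// Expand the marker list per prompt and keep only the markers found in the text.
| mv-apply Marker = exfilMarkers to typeof(string) on (
      where Text contains Marker
      | summarize Markers = make_set(Marker)
  )
// Drop prompts that matched no marker.
| where array_length(Markers) > 0
| project Prompt, Markers
```

Under these assumptions, the first and third prompts should be returned with their matched markers, while the booking-status question should be filtered out.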
Output: If such phrases are found, the query returns one row per matching prompt with the timestamp, the agent's name, model, project ID, conversation ID, and the source IP address. It also returns the set of suspicious phrases found and the first 1,024 characters of the prompt.
Alerting: If any such instances are detected, an alert is generated. These alerts can be grouped by the account (agent) involved, and incidents are created for further investigation.
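For triage or tuning, the same grouping by account can be approximated ad hoc. The sketch below is illustrative only: DetectionOutput and its rows are hypothetical stand-ins for the rule's results, and in practice the datatable would be replaced by the query itself.

```kql
// Hypothetical stand-in for the detection output; replace with the real query.
let DetectionOutput = datatable(TimeGenerated: datetime, AccountName: string, ConvId: string, Markers: dynamic) [
    datetime(2026-06-08 10:00:00), "booking-agent", "conv-1", dynamic(["list all", "all customers"]),
    datetime(2026-06-08 10:05:00), "booking-agent", "conv-1", dynamic(["export all"]),
    datetime(2026-06-08 10:20:00), "hr-agent",      "conv-2", dynamic(["all employees"])
];
DetectionOutput
// One row per matched marker, so marker frequency can be counted per agent.
| mv-expand Marker = Markers to typeof(string)
| summarize MarkerHits = count(), Conversations = dcount(ConvId) by AccountName, Marker
| order by AccountName asc, MarkerHits desc
```

A view like this may show which markers fire most often for each agent, which is useful when deciding whether a noisy phrase should be removed from exfilMarkers.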
Frequency: The query runs every hour and checks data from the past hour.
Severity and Tactics: The alert is marked with medium severity and is associated with tactics like data collection and exfiltration, which are common in cybersecurity threat scenarios.

Overall, this query is part of a security monitoring system to detect and alert on potential unauthorized data access or extraction attempts using AI agents.

Details

David Alonso

Released: June 8, 2026

Tables

AppDependencies

Keywords

AppDependencies, Properties, Agent, Model, ConvId, ProjectId, Prompt, SrcIp, AccountName, TimeGenerated, Markers

Operators

let, dynamic, isnotempty, tostring, column_ifexists, tolower, mv-apply, contains, summarize, make_set, iff, isempty, project, substring, order by

Severity

Medium

Tactics

Collection, Exfiltration

MITRE Techniques

T1213 T1530

Frequency: PT1H

Period: PT1H
