Query Details

Agent Grounding Source Enumeration

Query

id: 7b8c9d0e-5555-4d11-9105-0123456789c5
name: Agent - Grounding / retrieval source enumeration (RAG recon)
description: |
  Hunts Foundry / Agent Service agents that touch an unusually diverse set
  of retrieval / grounding sources (URLs / hosts pulled by tools) in a
  short window - the RAG-equivalent of port-scanning and a common
  precursor to data discovery and staged exfiltration. The Foundry
  equivalent of the Copilot grounding-source-enumeration hunt.

  Source hosts are extracted from gen_ai.tool.call.arguments and
  gen_ai.tool.call.result in the AppDependencies span property bag
  (Properties), so AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED must be
  set for the arguments to be present. Pivots per agent
  (gen_ai.agent.name).
query: |
  let recentWindow = 1h;
  let baselineWindow = 14d;
  let sources =
      AppDependencies
      | where isnotempty(Properties["gen_ai.tool.call.arguments"])
          or isnotempty(Properties["gen_ai.tool.call.result"])
      | extend
          Agent      = tostring(Properties["gen_ai.agent.name"]),
          ToolArgs   = tostring(Properties["gen_ai.tool.call.arguments"]),
          ToolResult = tostring(Properties["gen_ai.tool.call.result"])
      // extract() returns only the first URL match per span, so one Host is counted per tool call.
      | extend Host = tolower(extract(@"https?://([A-Za-z0-9.\-]+)", 1, strcat(ToolArgs, " ", ToolResult)))
      | where isnotempty(Host);
  let recent =
      sources
      | where TimeGenerated > ago(recentWindow)
      | summarize
          RecentDistinctSources = dcount(Host),
          RecentSampleSources   = make_set(Host, 25),
          RecentCalls           = count()
          by Agent;
  let baseline =
      sources
      | where TimeGenerated between (ago(baselineWindow) .. ago(recentWindow))
      | summarize BaselineDistinctSources = dcount(Host) by Agent;
  recent
  | join kind=leftouter baseline on Agent
  | extend BaselineDistinctSources = coalesce(BaselineDistinctSources, 0)
  | extend SpikeRatio = iff(BaselineDistinctSources > 0,
                            todouble(RecentDistinctSources) / todouble(BaselineDistinctSources),
                            todouble(RecentDistinctSources))
  | where RecentDistinctSources >= 15 and (BaselineDistinctSources == 0 or SpikeRatio >= 5.0)
  | project Agent, RecentDistinctSources, BaselineDistinctSources, SpikeRatio,
            RecentCalls, RecentSampleSources
  | order by SpikeRatio desc, RecentDistinctSources desc
tactics:
  - Discovery
  - Collection
techniques:
  - T1083
  - T1213
tags:
  - Sentinel-As-Code
  - Custom
  - Foundry
  - AI

Explanation

This query is designed to identify unusual behavior by agents within the Foundry/Agent Service that might indicate potential data discovery or exfiltration activities. Here's a simplified breakdown of what the query does:

  1. Purpose: The query looks for agents that access a wide variety of URLs or hosts in a short period, which could be a sign of reconnaissance activity similar to port scanning.

  2. Data Source: It analyzes the AppDependencies table, reading gen_ai.tool.call.arguments and gen_ai.tool.call.result from the span property bag (Properties) and extracting the host of the first URL found in their combined text. The AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED setting must be enabled, otherwise the tool arguments are not recorded and no hosts can be extracted.

  3. Time Windows:

    • Recent Window: The last hour (1h).
    • Baseline Window: From 14 days ago (14d) up to the start of the recent window, so the two windows do not overlap.
  4. Analysis:

    • Recent Activity: Counts distinct hosts accessed by each agent in the recent window and collects a sample of these hosts.
    • Baseline Activity: Counts distinct hosts accessed by each agent in the baseline window.
  5. Comparison:

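    • Left-joins recent and baseline data per agent, treating a missing baseline as 0 distinct hosts.
    • Calculates a "Spike Ratio": recent distinct hosts divided by baseline distinct hosts. When the baseline is 0, the ratio is set to the recent distinct host count. Note that the baseline is the total distinct count over the whole baseline window, not an hourly average.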
  6. Alert Criteria:

    • Flags an agent only when both conditions hold: it accessed at least 15 distinct hosts in the recent window, and it either has no baseline activity or has a spike ratio of 5 or more.
  7. Output:

    • Lists agents with their recent and baseline distinct host counts, spike ratio, number of recent calls, and a sample of recent hosts accessed.
    • Orders results by spike ratio and number of recent distinct sources.
  8. Security Context:

    • The query is associated with tactics like Discovery and Collection, and techniques such as T1083 (File and Directory Discovery) and T1213 (Data from Information Repositories).
    • Tagged for use in Sentinel-As-Code, Foundry, AI, and custom scenarios.
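
The host extraction in steps 2 and 4 can be illustrated outside KQL. The following is a minimal Python sketch, not part of the hunt itself: it reuses the query's regular expression, while the helper name first_host and the sample span payloads are hypothetical and invented for illustration. It assumes KQL's extract() returns only the first match, which is why the second URL in a span is not counted.

```python
import re

# Same pattern as the query's extract(): scheme, then a run of host characters.
HOST_PATTERN = re.compile(r"https?://([A-Za-z0-9.\-]+)")


def first_host(tool_args: str, tool_result: str) -> str:
    """Mimic tolower(extract(pattern, 1, strcat(ToolArgs, " ", ToolResult)))."""
    match = HOST_PATTERN.search(f"{tool_args} {tool_result}")
    return match.group(1).lower() if match else ""


# Hypothetical span payloads (arguments, result), invented for illustration only.
spans = [
    ('{"url": "https://Docs.Example.com/a"}', "ok"),
    ('{"q": "pricing"}', "see http://wiki.example.org/p and https://files.example.net/x"),
    ('{"q": "no urls here"}', "nothing"),
]

hosts = [first_host(args, result) for args, result in spans]

# Equivalent of `where isnotempty(Host)` followed by dcount(Host).
distinct_hosts = {h for h in hosts if h}

print(hosts)
print(len(distinct_hosts))
```

In this sketch the second span contributes only wiki.example.org, so files.example.net is never counted. The Python set gives an exact count, whereas dcount() in KQL returns an estimate, so counts for very large host sets may differ slightly.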

In essence, this query helps detect agents that suddenly reach out to many more retrieval or grounding sources than they normally do, which could be a precursor to data discovery, staged exfiltration, or other malicious activity.
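
The comparison and alert criteria in steps 5 and 6 can be worked through with a few numbers. This is a minimal Python sketch under stated assumptions: the thresholds (15 distinct sources, ratio 5.0) come from the query, while the function names and the example agents are hypothetical.

```python
def spike_ratio(recent: int, baseline: int) -> float:
    """Mirror the query's iff(): divide by the baseline, or fall back to the recent count."""
    return recent / baseline if baseline > 0 else float(recent)


def is_flagged(recent: int, baseline: int,
               min_sources: int = 15, min_ratio: float = 5.0) -> bool:
    """Mirror the final `where`: enough recent sources AND (no baseline OR a large spike)."""
    ratio = spike_ratio(recent, baseline)
    return recent >= min_sources and (baseline == 0 or ratio >= min_ratio)


# Hypothetical agents: (recent distinct sources, baseline distinct sources).
examples = {
    "agent-a": (40, 4),   # ratio 10.0, above both thresholds
    "agent-b": (20, 0),   # no baseline, enough recent sources
    "agent-c": (20, 10),  # ratio 2.0, spike too small
    "agent-d": (10, 1),   # ratio 10.0, but fewer than 15 recent sources
}

results = {name: is_flagged(recent, baseline)
           for name, (recent, baseline) in examples.items()}
print(results)
```

Here agent-a and agent-b would be returned by the hunt, while agent-c and agent-d would not. Because the baseline is a 14-day total rather than an hourly average, a ratio of 5 means the last hour alone touched five times as many distinct hosts as the whole baseline period.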

Details


David Alonso

Released: June 8, 2026

Tables

AppDependencies

Keywords

Agent, AppDependencies, Properties, Host, Sources, AgentService, Data, Discovery, Exfiltration, AI, ContentRecording

Operators

let, isnotempty, tostring, tolower, extract, strcat, where, summarize, dcount, make_set, count, ago, between, join, kind=leftouter, coalesce, iff, todouble, project, order by
