Query Details

Foundry Token Cost Spike

Query

id: b4c5d6e7-aaaa-4d04-9207-0123456789d7
name: Foundry - Anomalous token / cost spike per agent
description: |
  Detects a Foundry / Agent Service agent whose hourly token consumption
  in the most recent window reaches at least three times its 7-day
  per-hour median (or exceeds twice its 7-day P95). Useful for catching
  token abuse, runaway agent / tool-call loops, and cost-driven
  denial-of-wallet attacks.

  Unlike the Copilot equivalent, Foundry exposes real usage counters, so
  this rule sums gen_ai.usage.input_tokens + gen_ai.usage.output_tokens
  from the AppDependencies spans (property bag in Properties). The
  absolute floor (HourTokens > 50000) suppresses noise from low-traffic
  agents - tune it and the spike ratio to your tenant. Pair with the
  red-team pacing rule to confirm whether the spike is a tight tool loop.
severity: Medium
requiredDataConnectors:
- connectorId: ApplicationInsights
  dataTypes:
  - AppDependencies
queryFrequency: PT1H
queryPeriod: P7D
triggerOperator: gt
triggerThreshold: 0
enabled: true
tactics:
- Impact
relevantTechniques:
- T1496
- T1499
query: |
  let lookback = 7d;
  let recentWindow = 1h;
  let perHour =
      AppDependencies
      | where TimeGenerated > ago(lookback)
      | where isnotempty(Properties["gen_ai.agent.name"])
      | extend
          Agent  = tostring(Properties["gen_ai.agent.name"]),
          Model  = tostring(Properties["gen_ai.request.model"]),
          InTok  = tolong(Properties["gen_ai.usage.input_tokens"]),
          OutTok = tolong(Properties["gen_ai.usage.output_tokens"])
      | extend TotalTok = coalesce(InTok, 0) + coalesce(OutTok, 0)
      | summarize
          HourTokens = sum(TotalTok),
          HourRuns   = count(),
          AnyModel   = take_any(Model)
          by Agent, Hour = bin(TimeGenerated, 1h);
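  // Baseline: per-agent median and P95 of hourly tokens over every hour bin
  // older than the recent window. Hours with no spans produce no row, so
  // the statistics cover active hours only.
  let baseline =
      perHour
      | where Hour < bin(now(), 1h) - recentWindow
      | summarize
          MedianHourTokens = percentile(HourTokens, 50),
          P95HourTokens    = percentile(HourTokens, 95)
          by Agent;
  // Recent: the last completed hour bin plus the current, still-filling one.
  let recent =
      perHour
      | where Hour >= bin(now(), 1h) - recentWindow;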
  recent
  | join kind=leftouter baseline on Agent
  | extend
      MedianHourTokens = coalesce(todouble(MedianHourTokens), 0.0),
      P95HourTokens    = coalesce(todouble(P95HourTokens), 0.0)
  | extend SpikeRatio = iff(MedianHourTokens > 0, todouble(HourTokens) / MedianHourTokens, todouble(HourTokens))
  | where HourTokens > 50000
      and (SpikeRatio >= 3.0 or HourTokens > P95HourTokens * 2)
  | extend AccountName = iff(isempty(Agent), "unknown-agent", Agent)
  | extend Model = AnyModel
  | project
      Hour, AccountName, Agent, Model, HourRuns, HourTokens,
      MedianHourTokens, P95HourTokens, SpikeRatio
  | order by SpikeRatio desc
entityMappings:
- entityType: Account
  fieldMappings:
  - identifier: Name
    columnName: AccountName
- entityType: CloudApplication
  fieldMappings:
  - identifier: Name
    columnName: Model
eventGroupingSettings:
  aggregationKind: SingleAlert
incidentConfiguration:
  createIncident: true
  groupingConfiguration:
    enabled: true
    reopenClosedIncident: false
    lookbackDuration: PT6H
    matchingMethod: Selected
    groupByEntities:
    - Account
    groupByAlertDetails: []
    groupByCustomDetails: []
version: 1.0.0
kind: Scheduled
tags:
- Sentinel-As-Code
- Custom
- Foundry
- AI
- OWASP-LLM10

Explanation

This query detects unusual spikes in token usage by Foundry (Agent Service) agents. It sums input and output tokens per agent per hour from AppDependencies spans in Application Insights, then compares each agent's recent hourly total with that agent's baseline over the previous seven days. An agent is flagged when its hourly total exceeds 50,000 tokens and is either at least three times its median hourly usage or more than twice its 95th-percentile hourly usage. The 50,000-token floor applies to every agent and suppresses noise from low-traffic ones. An agent with no baseline is flagged on the floor alone, because its median and P95 default to zero. This helps identify token abuse, runaway agent or tool-call loops, and cost-driven denial-of-wallet attacks. The rule runs hourly, sorts results by spike ratio (largest first), and creates incidents grouped by agent (the Account entity) for further investigation.
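
The decision logic described above can be sketched outside Kusto to make tuning easier to reason about. The Python below is a minimal illustration, not part of the rule: the names `is_spike`, `nearest_rank_percentile`, and the three constants are hypothetical, and the nearest-rank percentile is an assumption standing in for Kusto's `percentile()`, which returns an estimate and may differ slightly on real data.

```python
import math

# Thresholds copied from the rule; tune them to your tenant.
TOKEN_FLOOR = 50_000      # HourTokens > 50000
SPIKE_RATIO = 3.0         # SpikeRatio >= 3.0
P95_MULTIPLIER = 2.0      # HourTokens > P95HourTokens * 2


def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile (assumption: approximates Kusto's percentile()).

    Returns 0.0 for an empty baseline, mirroring the coalesce(..., 0.0) in the query.
    """
    ordered = sorted(values)
    if not ordered:
        return 0.0
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return float(ordered[rank - 1])


def is_spike(hour_tokens, baseline_hours):
    """Return (flagged, spike_ratio) for one agent-hour.

    baseline_hours holds the agent's hourly token totals from the lookback
    window, excluding the recent window.
    """
    median_tokens = nearest_rank_percentile(baseline_hours, 50)
    p95_tokens = nearest_rank_percentile(baseline_hours, 95)
    # With no usable median the query falls back to the raw token count.
    if median_tokens > 0:
        ratio = hour_tokens / median_tokens
    else:
        ratio = float(hour_tokens)
    flagged = hour_tokens > TOKEN_FLOOR and (
        ratio >= SPIKE_RATIO or hour_tokens > p95_tokens * P95_MULTIPLIER
    )
    return flagged, ratio


if __name__ == "__main__":
    steady = [20_000] * 10
    print(is_spike(70_000, steady))        # well above floor, 3.5x the median
    print(is_spike(45_000, [10_000] * 10)) # 4.5x the median but under the floor
    print(is_spike(60_000, []))            # no baseline: floor alone decides
```

One consequence worth noting when tuning: because P95 is never below the median, the P95 branch is the looser of the two whenever an agent's P95 is under 1.5 times its median, so a very steady agent is flagged at roughly twice its usual hourly volume rather than three times.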

Details

David Alonso

Released: June 8, 2026

Tables

AppDependencies

Keywords

Foundry, Agent, Token, Consumption, AppDependencies, Properties, Model, Tokens, Hour, AccountName, CloudApplication, AI, OWASP, LLM

Operators

let, ago, isnotempty, tostring, tolong, coalesce, summarize, sum, count, take_any, bin, now, percentile, join, kind=leftouter, todouble, iff, isempty, project, order by
