OpenAI - Trace-level loop / long-duration anomalies

let window = 1d;
OpenAIChatCompletions
| where TimeGenerated > ago(window)
| extend ActorUser = tostring(AdditionalFields.input_user)
| extend DurationMs = iff(isnotempty(EventEndTime) and isnotempty(EventStartTime), todouble(datetime_diff('millisecond', EventEndTime, EventStartTime)), todouble(0))
| summarize
    Requests = count(),
    ToolCalls = countif(isnotempty(ToolName)),
    DistinctTools = dcount(ToolName),
    MaxDurationMs = max(DurationMs),
    AvgDurationMs = avg(DurationMs),
    TotalOutputTokens = sum(todouble(OutputTokensUsed))
    by ActorUser, ModelName, bin(TimeGenerated, 5m)
| where (ToolCalls >= 10 and DistinctTools <= 2) or MaxDurationMs > 120000
| project TimeGenerated, ActorUser, ModelName, Requests, ToolCalls, DistinctTools, MaxDurationMs, AvgDurationMs, TotalOutputTokens
| order by TimeGenerated desc

Explanation

This query flags unusual patterns in OpenAI chat completion traces that may indicate runaway loops or excessive resource usage. Here's a simplified explanation:

Purpose: The query looks for two types of anomalies:
- Repeated tool calls with low diversity: 10 or more tool calls within a 5-minute window, by the same user on the same model, using 2 or fewer distinct tools.
- Long-duration requests: any 5-minute window in which at least one request took more than two minutes (120,000 ms) to complete.
Data Source: It reads the OpenAIChatCompletions table over the past day (window = 1d).
Process:
- It calculates the duration of each request in milliseconds from EventStartTime and EventEndTime (0 when either timestamp is missing) and counts the number of tool calls and distinct tools used.
- It summarizes this information by user, model, and time, grouping data into 5-minute intervals.
- It filters the results to find cases where there are many tool calls with low diversity or where a request takes too long.
Output: The query outputs details such as the time, user, model, number of requests, tool calls, distinct tools, maximum and average duration, and total output tokens used.
Use Case: This helps in identifying potential issues like stuck loops, cost amplification, or denial-of-service-like behavior, which can impact system performance or costs.
Tags and Techniques: The query is tagged with the MITRE ATT&CK tactics Impact and Execution and the techniques T1499 (Endpoint Denial of Service) and T1059 (Command and Scripting Interpreter), and is part of a custom monitoring setup for OpenAI usage.
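
The grouping and filtering steps described above can be sketched outside KQL. The Python below is a minimal illustration, not the author's implementation: the row keys (user, model, time, tool, start, end, output_tokens) are hypothetical stand-ins for the table columns, and the sample rows are invented. It counts only non-empty tool names as distinct tools; how dcount treats empty ToolName values in the real table is not verified here.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def bucket_start(ts):
    """Floor a timestamp to its 5-minute bucket, like bin(TimeGenerated, 5m)."""
    return ts.replace(minute=ts.minute - ts.minute % 5, second=0, microsecond=0)


def duration_ms(start, end):
    """Elapsed milliseconds, or 0.0 when either timestamp is missing."""
    if start is None or end is None:
        return 0.0
    return (end - start).total_seconds() * 1000.0


def find_anomalies(rows):
    """Group rows by (user, model, bucket) and return the flagged groups."""
    groups = defaultdict(list)
    for row in rows:
        key = (row["user"], row["model"], bucket_start(row["time"]))
        groups[key].append(row)

    flagged = []
    for (user, model, bucket), items in groups.items():
        tools = [r["tool"] for r in items if r.get("tool")]
        durations = [duration_ms(r.get("start"), r.get("end")) for r in items]
        summary = {
            "bucket": bucket,
            "user": user,
            "model": model,
            "requests": len(items),
            "tool_calls": len(tools),
            "distinct_tools": len(set(tools)),  # exact; KQL dcount() is an estimate
            "max_ms": max(durations),
            "avg_ms": sum(durations) / len(durations),
            "output_tokens": sum(float(r.get("output_tokens") or 0) for r in items),
        }
        # Same two conditions as the query's final "where" clause.
        looping = summary["tool_calls"] >= 10 and summary["distinct_tools"] <= 2
        too_slow = summary["max_ms"] > 120_000
        if looping or too_slow:
            flagged.append(summary)
    # Newest bucket first, like "order by TimeGenerated desc".
    return sorted(flagged, key=lambda s: s["bucket"], reverse=True)


# Invented sample data: alice loops on one tool, bob has one slow request,
# carol makes a single ordinary call.
t0 = datetime(2026, 6, 8, 12, 0, 0)
one_second = timedelta(seconds=1)
rows = [
    {"user": "alice", "model": "m1", "time": t0 + timedelta(seconds=10 * i),
     "tool": "search", "start": t0, "end": t0 + one_second, "output_tokens": 50}
    for i in range(10)
]
rows.append({"user": "bob", "model": "m1", "time": t0 + timedelta(minutes=7),
             "tool": None, "start": t0, "end": t0 + timedelta(minutes=3),
             "output_tokens": 900})
rows.append({"user": "carol", "model": "m1", "time": t0,
             "tool": "search", "start": t0, "end": t0 + one_second,
             "output_tokens": 20})

results = find_anomalies(rows)
```

In this sample, alice's bucket is flagged by the loop condition and bob's by the duration condition, while carol's single call passes both checks. Because missing timestamps count as 0 ms, they lower the average duration without ever triggering the long-duration rule.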

Details

David Alonso

Released: June 8, 2026

Tables

OpenAIChatCompletions

Keywords

OpenAIChatCompletions, ActorUser, AdditionalFields, EventEndTime, EventStartTime, ASimAgentEventLogs, ToolName, ModelName, OutputTokensUsed, TimeGenerated

Operators

let, where, extend, iff, isnotempty, todouble, datetime_diff, summarize, count, countif, dcount, max, avg, sum, by, bin, or, project, order by, desc

Tactics

Impact, Execution

MITRE Techniques

T1499, T1059
