Foundry - Agent memory poisoning (write or recall to sensitive action)

Foundry Agent Memory Poisoning

Query

let injectionMarkers = dynamic([
    "ignore previous instructions","disregard previous","you are now",
    "developer mode","do anything now","dan mode","reveal your system prompt",
    "show your system prompt","bypass your rules","without any restrictions",
    "pretend you are","from now on you","store this and remember",
    "remember to always","next time the user asks","in future conversations",
    "save this rule","as a permanent instruction","keep this in memory"
]);
let sensitiveTools = dynamic([
    "code_interpreter","python","shell","bash","powershell","exec","run_code",
    "http_request","fetch","invoke_url","send_email","send_message",
    "execute_sql","query_database","file_write","write_file","upload",
    "create_resource","delete_resource","azure_write","deploy"
]);
let memoryEvents =
    AppDependencies
    | where TimeGenerated > ago(3h)
    | extend
        Agent  = tostring(Properties["gen_ai.agent.name"]),
        ConvId = tostring(Properties["gen_ai.conversation.id"]),
        OpType = tolower(tostring(coalesce(
                    Properties["gen_ai.memory.operation"],
                    Properties["gen_ai.thread.operation"],
                    Properties["microsoft.agent.memory.operation"], ""))),
        MemContent = tolower(tostring(coalesce(
                    Properties["gen_ai.memory.content"],
                    Properties["gen_ai.memory.value"],
                    Properties["gen_ai.thread.message.content"],
                    Properties["microsoft.agent.memory.content"], ""))),
        SpanName = tolower(coalesce(Name, ""))
    | where isnotempty(MemContent) or isnotempty(OpType)
            or SpanName has_any ("memory","thread.message","store","recall");
let writes =
    memoryEvents
    | where (OpType has_any ("write","store","add","upsert","persist"))
            or SpanName has_any ("memory.write","memory.store","memory.add","thread.message.create")
    | where MemContent has_any (injectionMarkers)
    | summarize Hits = count(),
                Samples = make_set(substring(MemContent, 0, 256), 3),
                FirstSeen = min(TimeGenerated),
                LastSeen  = max(TimeGenerated)
            by Agent, ConvId
    | extend Signal = "PoisonedMemoryWrite", Tools = dynamic([]);
let recalls =
    memoryEvents
    | where (OpType has_any ("read","recall","retrieve","get","fetch","search"))
            or SpanName has_any ("memory.read","memory.recall","memory.search","thread.message.list")
    | where MemContent has_any (injectionMarkers)
    | summarize Hits = count(),
                Samples = make_set(substring(MemContent, 0, 256), 3),
                FirstSeen = min(TimeGenerated),
                LastSeen  = max(TimeGenerated)
            by Agent, ConvId;
let sensitiveActs =
    AppDependencies
    | where TimeGenerated > ago(3h)
    | extend Agent    = tostring(Properties["gen_ai.agent.name"]),
             ConvId   = tostring(Properties["gen_ai.conversation.id"]),
             ToolName = tolower(tostring(Properties["gen_ai.tool.name"])),
             ToolType = tolower(tostring(Properties["gen_ai.tool.type"]))
    | where ToolName has_any (sensitiveTools) or ToolType has_any (sensitiveTools)
    | summarize ToolFires = count(), Tools = make_set(ToolName, 8),
                ToolFirst = min(TimeGenerated)
            by Agent, ConvId;
let recallChain =
    recalls
    | join kind=inner sensitiveActs on Agent, ConvId
    | where ToolFirst between (FirstSeen .. (LastSeen + 30m))
    | project Agent, ConvId, FirstSeen, LastSeen, Hits, Samples, Tools,
              Signal = "PoisonedRecallToSensitiveTool";
union writes, recallChain
| extend AccountName = iff(isempty(Agent), "unknown-agent", Agent)
| project LastSeen, AccountName, Agent, ConvId, Signal, Hits, Tools, Samples, FirstSeen
| order by LastSeen desc

Explanation

This query is designed to detect a specific type of security threat known as "memory poisoning" in AI agents. Here's a simplified breakdown of what it does:

Purpose: The query aims to identify when an AI agent's memory is tampered with (poisoned) by malicious instructions, which could later affect its behavior in future sessions.
Detection Mechanism: It uses two main signals to trigger an alert:
- PoisonedMemoryWrite: This signal is triggered when the AI's memory is written with content that matches known malicious patterns (e.g., instructions to ignore previous commands or operate without restrictions).
- PoisonedRecallToSensitiveTool: This signal is triggered when the AI recalls poisoned memory content and uses it with sensitive tools (like code interpreters, shell commands, or database queries) within the same conversation, within a 30-minute window.
Data Sources: The query analyzes data from Application Insights, specifically looking at memory operations and tool usage by the AI agent.
Severity and Frequency: The rule is set to a high severity level and runs every hour, looking back over the past three hours.
Output: If the conditions are met, the query generates an alert with details about the agent, the conversation, and the type of memory poisoning detected.
Incident Management: The query is configured to create incidents in a security monitoring system, grouping related alerts to manage them more effectively.

Overall, this query is part of a security monitoring strategy to detect and respond to potential threats involving AI agents' memory manipulation.

Details

David Alonso

Released: June 8, 2026

Tables

AppDependencies

Keywords

AgentMemoryPoisoningSensitiveToolThreadMessageStoreRecallInjectionMarkersCodeInterpreterShellHttpEmailSqlFileWriteResourceCreateDeleteApplicationInsightsAppDependenciesPersistenceInitialAccessExecution

Operators

letdynamictostringtolowercoalesceisnotemptyhas_anysummarizecountmake_setsubstringminmaxextendiffisemptyprojectorder byjoinbetweenunion

Severity

High

Tactics

PersistenceInitialAccessExecution

MITRE Techniques

T1546 T1566 T1059

Frequency: PT1H

Period: PT3H

Actions

GitHub

KQL Search