Query Details

Foundry Capability Disclosure Extended

Query

id: 6f1a2b3c-4d5e-4f16-9307-aaaaaaaaaaa7
name: Foundry - Extended capability / architecture disclosure
description: |
  Detects the v2.0 taxonomy's "capability / architecture disclosure"
  failure mode beyond plain system-prompt extraction: the user probes
  for the agent's tool list, function schemas, consent / HitL trigger
  logic or memory interface, and the agent's response leaks structural
  details (tool names, schemas, parameters, consent rules, memory
  primitives). v2.0 highlights this as a key enabler of follow-on
  attacks - turning black-box probing into a white-box exploit path.

  This complements FoundrySystemPromptDisclosure (which only looks at
  system-prompt phrasing) by:
    - covering tool / schema / consent / memory probes, and
    - intersecting the response against the project's actual recent
      tool inventory (last 7 days, tools with >= 5 calls) so the rule
      only fires when at least 3 real tool names from the project are
      reflected back to the user.

  The "real tool name" intersection is the false-positive control:
  generic answers ("I can summarise text") never reach the threshold.
severity: Medium
requiredDataConnectors:
- connectorId: ApplicationInsights
  dataTypes:
  - AppDependencies
queryFrequency: PT1H
queryPeriod: P7D
triggerOperator: gt
triggerThreshold: 0
enabled: true
tactics:
- Discovery
- Collection
relevantTechniques:
- T1213
- T1518
query: |
  let probes = dynamic([
      "what tools","list your tools","what functions","describe your functions",
      "what capabilities","what can you do","what actions","what plugins",
      "list your plugins","list your skills","what skills","describe your tools",
      "what is your schema","print your function","reveal your tools",
      "when do you ask for approval","what triggers consent","what triggers hitl",
      "what triggers your guardrails","describe your memory","what memories do you have",
      "what are your operations","what is your tool list","describe your hooks",
      "list your apis","what mcp servers"
  ]);
  let leakMarkers = dynamic([
      "function:","tool:","name:","schema:","parameters","arguments",
      "i have access to","i can use the","my available tools","my tools include",
      "the following tools","tools available","available functions",
      "i ask for approval","i require consent","my memory","memory store",
      "my operations are","my mcp server","my plugins are"
  ]);
  let known =
      toscalar(
          AppDependencies
          | where TimeGenerated between (ago(7d) .. ago(1h))
          | extend ToolName = tolower(tostring(Properties["gen_ai.tool.name"]))
          | where isnotempty(ToolName) and strlen(ToolName) >= 4
          | summarize Calls = count() by ToolName
          | where Calls >= 5
          | summarize KnownTools = make_set(ToolName, 200)
      );
  AppDependencies
  | where TimeGenerated > ago(1h)
  | where isnotempty(Properties["gen_ai.input.messages"])
          and isnotempty(Properties["gen_ai.output.messages"])
  | extend
      Agent     = tostring(Properties["gen_ai.agent.name"]),
      Model     = tostring(Properties["gen_ai.request.model"]),
      ConvId    = tostring(Properties["gen_ai.conversation.id"]),
      ProjectId = tostring(Properties["microsoft.foundry.project.id"]),
      Input     = tolower(tostring(Properties["gen_ai.input.messages"])),
      Output    = tolower(tostring(Properties["gen_ai.output.messages"]))
  | extend AskedAboutCapabilities = Input has_any (probes)
  | extend LeaksStructure         = Output has_any (leakMarkers)
  | where AskedAboutCapabilities and LeaksStructure
  | extend OutputTokens = extract_all(@"([a-z][a-z0-9_\-]{3,})", Output)
  | extend LeakedTools  = set_intersect(OutputTokens, known)
  | extend LeakedToolCount = array_length(LeakedTools)
  | where LeakedToolCount >= 3
  | extend AccountName = iff(isempty(Agent), "unknown-agent", Agent)
  | project TimeGenerated, AccountName, Agent, Model, ProjectId, ConvId,
            LeakedToolCount, LeakedTools,
            InputSample  = substring(Input, 0, 512),
            OutputSample = substring(Output, 0, 1024)
  | order by LeakedToolCount desc, TimeGenerated desc
entityMappings:
- entityType: Account
  fieldMappings:
  - identifier: Name
    columnName: AccountName
- entityType: CloudApplication
  fieldMappings:
  - identifier: Name
    columnName: Model
eventGroupingSettings:
  aggregationKind: SingleAlert
incidentConfiguration:
  createIncident: true
  groupingConfiguration:
    enabled: true
    reopenClosedIncident: false
    lookbackDuration: PT12H
    matchingMethod: Selected
    groupByEntities:
    - Account
    groupByAlertDetails: []
    groupByCustomDetails: []
version: 1.0.0
kind: Scheduled
tags:
- Sentinel-As-Code
- Custom
- Foundry
- AI
- CapabilityDisclosure
- AIRT-v2

Explanation

This query is designed to detect potential security risks related to the disclosure of an AI system's capabilities or architecture. Here's a simplified breakdown of what it does:

  1. Purpose: It identifies situations where a user tries to probe an AI system for details about its tools, functions, or internal logic, and the system inadvertently reveals this information in its response. This is considered a security risk because it can lead to further exploitation.

  2. Detection Mechanism:

    • The query looks for specific phrases in user inputs that suggest probing for system capabilities (e.g., "what tools", "list your tools").
    • It also checks the system's responses for markers that indicate it has leaked structural details (e.g., "function:", "tool:", "my available tools").
    • It extracts tokens from the response and intersects them with the set of tools the project actually called in the past week, so only responses that name real tools count. This is the false-positive control.
  3. Conditions for Alert:

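    • An alert fires only when all three checks pass for the same record: the input contains a probe phrase, the output contains a leak marker, and the output includes at least three real tool names from the project's recent tool inventory.
    • The query runs every hour and evaluates conversations from the last hour. The 7-day query period is used to build the tool inventory: tools called at least five times between 7 days and 1 hour ago.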
  4. Severity and Response:

    • The severity level is set to Medium, indicating a moderate risk.
    • If the conditions are met, an incident is created for further investigation.
  5. Technical Details:

    • It uses data from Application Insights, specifically the AppDependencies table.
    • The query keeps only records that contain both input and output messages, and lowercases them before matching.
    • It projects relevant details such as the time of the event, the agent involved, the model used, and samples of the input and output messages for context.
  6. Entity Mapping and Incident Configuration:

    • The query maps AccountName (the agent name, or "unknown-agent" when it is empty) to the Account entity and Model to the CloudApplication entity for better tracking.
    • Alerts that share the same Account entity within a 12-hour window are grouped into a single incident, and closed incidents are not reopened.
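
The detection steps described above can be sketched outside KQL. The following Python function is an illustrative approximation, not the rule itself: the phrase lists are abbreviated, the function name and the sample tool names are hypothetical, and plain substring matching stands in for KQL's has_any, which matches on whole terms.

```python
import re

# Abbreviated phrase lists; the rule's `probes` and `leakMarkers` arrays are longer.
PROBES = ["what tools", "list your tools", "what functions", "what triggers consent"]
LEAK_MARKERS = ["tool:", "my tools include", "the following tools", "parameters"]

# Same pattern the rule passes to extract_all: a letter, then 3+ of [a-z0-9_-].
TOKEN_RE = re.compile(r"[a-z][a-z0-9_\-]{3,}")


def leaked_tools(user_input, agent_output, known_tools, min_tools=3):
    """Return the known tool names echoed in the output, or [] if no alert.

    Assumption: substring checks approximate has_any; results may differ
    from the KQL rule at term boundaries.
    """
    text_in = user_input.lower()
    text_out = agent_output.lower()
    # Gate 1: the user asked about capabilities.
    if not any(phrase in text_in for phrase in PROBES):
        return []
    # Gate 2: the response contains a structural leak marker.
    if not any(marker in text_out for marker in LEAK_MARKERS):
        return []
    # Gate 3: at least `min_tools` real tool names appear in the response.
    tokens = set(TOKEN_RE.findall(text_out))
    hits = sorted(tokens & set(known_tools))
    return hits if len(hits) >= min_tools else []
```

With a hypothetical inventory of search_docs, send_email, create_ticket and get_weather, a reply that lists three of those names after a "what tools" question is returned as a leak, while a generic reply such as "I can summarise text" yields an empty list because it never passes the leak-marker gate.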

Overall, this query is part of a security strategy to detect when an AI system discloses sensitive details of its internal workings, before that information can be used in follow-on attacks.
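
The tool inventory that the rule compares responses against comes from the `known` let-statement in the query. A minimal Python sketch of that step follows, assuming call records are available as (timestamp, tool name) pairs; the function name, parameter names and sample data are hypothetical, and only the thresholds (7 days to 1 hour ago, names of at least 4 characters, at least 5 calls, at most 200 names) come from the query.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def known_tools(calls, now, min_calls=5, min_len=4, max_size=200):
    """Build the set of frequently used tool names.

    calls: iterable of (timestamp, tool_name) pairs (assumed input shape).
    Mirrors the query's window, name-length filter and call-count threshold.
    """
    start = now - timedelta(days=7)
    end = now - timedelta(hours=1)  # the query's window ends one hour ago
    counts = Counter(
        name.lower()
        for ts, name in calls
        if name and len(name) >= min_len and start <= ts <= end
    )
    frequent = [name for name, n in counts.items() if n >= min_calls]
    # make_set(ToolName, 200) caps the set size; which names survive the
    # cap is not specified, so this sketch keeps the first ones seen.
    return set(frequent[:max_size])


if __name__ == "__main__":
    now = datetime(2026, 6, 8, 12, 0, tzinfo=timezone.utc)
    sample = [(now - timedelta(days=2), "Search_Docs")] * 5
    print(known_tools(sample, now))
```

Tools called fewer than five times, tools with very short names, and calls made within the most recent hour do not enter the inventory, so a response has to echo established tool names before it can reach the three-name threshold.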

Details


David Alonso

Released: June 8, 2026

Tables

AppDependencies

Keywords

ApplicationInsights, AppDependencies, Account, CloudApplication, Model, ProjectId, ConvId, Agent, ToolName, Properties, TimeGenerated, Input, Output, KnownTools, LeakedTools, LeakedToolCount, InputSample, OutputSample

Operators

let, dynamic, toscalar, where, between, ago, extend, tolower, tostring, isnotempty, strlen, summarize, count, make_set, has_any, extract_all, set_intersect, array_length, iff, isempty, project, substring, order by
