Query Details

Copilot Cross Tenant Guest Leak

Query

id: a05c43b4-d18f-4476-5293-7f2041528394
name: Microsoft 365 Copilot - Cross-tenant data leak via guest-shared chats
description: |
  Hunts for Microsoft 365 Copilot conversations that mix internal grounding
  sources with guest participants from external tenants — the channel
  through which sensitive RAG content most often leaks across the tenant
  boundary.

  Triggers when an agent in a chat with at least one guest / external user:
    - retrieves grounding from internal SharePoint / OneDrive / mailboxes,
      AND
    - emits a response containing internal host patterns (URIs) or
      sensitivity-label terms.

  Pair with CopilotSensitivityLabelDowngrade for label-stripping cases.
query: |
  let internalHostHints = dynamic([
      ".sharepoint.com", "-my.sharepoint.com", "/personal/",
      "exchange.microsoft", "outlook.office", "graph.microsoft.com"
  ]);
  let highLabelHints = dynamic([
      "confidential", "highly confidential", "restricted",
      "internal only", "secret"
  ]);
  CopilotActivity
  | where TimeGenerated > ago(7d)
  | where RecordType == "CopilotInteraction"
  | extend
      // tostring() goes inside coalesce() so that every argument is a string,
      // whatever type the optional columns have
      Participants     = coalesce(tostring(column_ifexists('Participants', '')),
                                  tostring(column_ifexists('Recipients', '')),
                                  tostring(column_ifexists('ChatParticipants', ''))),
      RagSourcesText   = tolower(tostring(LLMEventData.RagSources)),
      ResponseText     = tostring(LLMEventData.Response),
      ResponseLower    = tolower(tostring(LLMEventData.Response)),
      ChatId           = coalesce(tostring(column_ifexists('ChatId', '')),
                                  tostring(LLMEventData.ConversationId)),
      HomeTenantId     = tostring(TenantId)
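  // Extract participant addresses. '#' is allowed in the local part so that
  // guest UPNs of the form user_domain#EXT#@tenant.onmicrosoft.com match whole.
  | extend GuestEmails = extract_all(@"([A-Za-z0-9._%+#-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,})", Participants)
  | mv-apply g = GuestEmails to typeof(string) on (
        extend
            GuestDomain   = tolower(tostring(split(g, '@')[1])),
            // Display-name markers such as "(guest)" cannot occur inside an
            // extracted address, so only the UPN marker is tested here.
            IsGuestMarker = tolower(g) contains "#ext#"
      | project GuestEmail = g, GuestDomain, IsGuestMarker
    )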
  // Pull tenant default-domain hint to decide "external"
  | extend
      HomeDomainHint = tolower(tostring(column_ifexists('TenantDefaultDomain', '')))
  | extend
      IsExternalGuest = IsGuestMarker
          or (isnotempty(HomeDomainHint) and GuestDomain != HomeDomainHint)
  | where IsExternalGuest
  // Confirm internal grounding present
  | extend RetrievedInternal = RagSourcesText has_any (internalHostHints)
  | where RetrievedInternal
  // And the response actually carries internal content back to the guest
  | extend
      ResponseEchoesInternalUri = ResponseLower has_any (internalHostHints),
      ResponseHasHighLabel      = ResponseLower has_any (highLabelHints)
  | where ResponseEchoesInternalUri or ResponseHasHighLabel
  | summarize
      Events             = count(),
      GuestEmails        = make_set(GuestEmail, 25),
      GuestDomains       = make_set(GuestDomain, 25),
      InternalUriSample  = make_set(RagSourcesText, 5),
      ResponseSample     = any(ResponseText)
      by TimeBucket = bin(TimeGenerated, 1h),
         AgentId, AgentName, ActorName = tostring(coalesce(ActorName, column_ifexists('ActorUPN', ''))),
         ChatId, HomeTenantId,
         ResponseEchoesInternalUri, ResponseHasHighLabel
  | order by TimeBucket desc, Events desc
tactics:
  - Exfiltration
  - Collection
techniques:
  - T1567
  - T1213
tags:
  - Sentinel-As-Code
  - Custom
  - Copilot
  - AI

Explanation

This hunting query looks for potential data leaks in Microsoft 365 Copilot conversations: chats in which Copilot grounds its answer on internal content while at least one participant is a guest from another tenant. In outline:

  1. Purpose: Identify Copilot interactions in which sensitive internal information may have been exposed to external guests.

  2. Conditions:

    • The chat involves at least one external guest.
    • Grounding data is retrieved from internal sources such as SharePoint, OneDrive, or mailboxes.
    • The response contains an internal host pattern (for example .sharepoint.com) or a high-sensitivity label term such as "confidential" or "restricted".

  3. Process:

    • It limits the search to CopilotInteraction records from the last 7 days.
    • It extracts email addresses from the participant list and treats an address as an external guest if it carries a guest marker (such as #EXT#) or, when the tenant default domain is available, if its domain differs from that domain.
    • It keeps only events whose RAG sources reference internal hosts.
    • It keeps only events whose response text contains an internal host pattern or a high-sensitivity label term.

  4. Output:

    • One row per hour, agent, actor, chat, and match type, with the event count, up to 25 guest emails and domains, up to 5 grounding-source samples, and one sample response.
    • Results are sorted with the most recent hour first, then by event count.

  5. Security Context:

    • The query supports investigation of data exfiltration and unauthorized collection of sensitive information.
    • It is tagged with the MITRE ATT&CK tactics Exfiltration and Collection and the techniques T1567 (Exfiltration Over Web Service) and T1213 (Data from Information Repositories).
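
The guest-classification step can be tried in isolation on synthetic data. The following is a minimal sketch, not the rule itself: the rows, the contoso.com and fabrikam.com domains, and the scalar HomeDomainHint (standing in for the optional TenantDefaultDomain column) are all assumptions made for illustration. The address pattern here allows '#' in the local part so that an #EXT# guest UPN is captured whole.

```kql
// Synthetic data only: HomeDomainHint stands in for the optional
// TenantDefaultDomain column, and the three chats are invented.
let HomeDomainHint = "contoso.com";
datatable(ChatId:string, Participants:string)
[
    "chat-1", "alice@contoso.com; bob@fabrikam.com",
    "chat-2", "alice@contoso.com; carol@contoso.com",
    "chat-3", "alice@contoso.com; dave_fabrikam.com#EXT#@contoso.onmicrosoft.com"
]
// Pull every address out of the participant string
| extend Emails = extract_all(@"([A-Za-z0-9._%+#-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,})", Participants)
// One row per address
| mv-expand Email = Emails to typeof(string)
| extend
    GuestDomain   = tolower(tostring(split(Email, "@")[1])),
    IsGuestMarker = tolower(Email) contains "#ext#"
// External if the UPN carries the guest marker or the domain is not the home domain
| extend IsExternalGuest = IsGuestMarker or GuestDomain != HomeDomainHint
| where IsExternalGuest
| project ChatId, Email, GuestDomain, IsGuestMarker
```

With these rows, only the fabrikam.com address in chat-1 and the #EXT# UPN in chat-3 should survive the filter; chat-2 has no external participant and drops out.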

Overall, the query helps organizations spot unintentional sharing of sensitive internal information with external parties through Microsoft 365 Copilot chats, so that the affected conversations can be reviewed.
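
The label check and the hourly roll-up can likewise be sketched on synthetic rows. Everything in the table below is invented for illustration, the label list is a shortened sample, and take_any() is used as the current name of the legacy any() aggregate that appears in the query.

```kql
// Synthetic data only: a shortened label list and four invented responses.
let highLabelHints = dynamic(["confidential", "restricted", "secret"]);
datatable(TimeGenerated:datetime, ChatId:string, GuestEmail:string, ResponseText:string)
[
    datetime(2026-05-20 10:05:00), "chat-1", "bob@fabrikam.com",  "The attached plan is marked Confidential.",
    datetime(2026-05-20 10:40:00), "chat-1", "erin@fabrikam.com", "This draft is Restricted to the finance team.",
    datetime(2026-05-20 10:50:00), "chat-1", "bob@fabrikam.com",  "The meeting is at 3 pm.",
    datetime(2026-05-20 11:15:00), "chat-1", "bob@fabrikam.com",  "Budget figures are confidential."
]
// Keep only responses that mention a high-sensitivity label term
| extend ResponseHasHighLabel = tolower(ResponseText) has_any (highLabelHints)
| where ResponseHasHighLabel
// Roll up per hour and chat, as the rule does
| summarize
    Events         = count(),
    GuestEmails    = make_set(GuestEmail, 25),
    ResponseSample = take_any(ResponseText)
    by TimeBucket = bin(TimeGenerated, 1h), ChatId
| order by TimeBucket desc
```

Here the 10:50 response carries no label term and is dropped, so the result should contain one event in the 11:00 bucket and two events, with both guest addresses, in the 10:00 bucket.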

Details

David Alonso

Released: May 20, 2026

Tables

CopilotActivity

Keywords

Microsoft, SharePoint, OneDrive, Exchange, Outlook, Graph, Copilot, Chat, Participants, Tenant, Email, Domain, Uri, Label, ActorName

Operators

let, dynamic, where, extend, tostring, coalesce, column_ifexists, tolower, extract_all, mv-apply, split, project, isnotempty, has_any, summarize, count, make_set, any, bin, order by, desc
