Query Details

Data Detect Anomalous Data Ingestion

Query

//Detect anomalies in the amount of data being ingested into your Sentinel workspace

//Data connector required for this query - Usage (generated automatically on a log analytics workspace)

//Sensitivity = the lower the number, the more sensitive the anomaly detection is, i.e. it will find more anomalies. The default is 1.5
let sensitivity = 1.5;
//Threshold = set a threshold, in GB per 3-hour bin, to filter out low-volume anomalies, e.g. moving from 1 GB of data to 2 GB. This example only reports data points where a table ingested more than 2 GB in 3 hours
let threshold = 2;
//First find the anomalies by creating a series of all the data ingestion and using series_decompose_anomalies
let outliers=
Usage
| where IsBillable == true
| make-series TableSize=sum(Quantity / 1024) default=0 on TimeGenerated from ago(7d) to now() step 3h by DataType
| extend outliers=series_decompose_anomalies(TableSize, sensitivity)
| mv-expand TimeGenerated, TableSize, outliers
| where outliers == 1 and TableSize > threshold
//Optionally visualize the anomalies - to retrieve the anomalous rows instead of a chart, remove the "let outliers=" line above and everything below this line
| distinct DataType;
Usage
| where IsBillable == true
| where DataType in (outliers)
| make-series TableSize=sum(Quantity / 1024) default=0 on TimeGenerated from ago(7d) to now() step 3h by DataType
| render timechart with (ytitle="Table Size",title="Anomalous data ingestion")

Explanation

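This query detects anomalies in the amount of data being ingested into a Sentinel workspace. It reads from the Usage table, which is generated automatically on a Log Analytics workspace, and has two parameters: sensitivity and threshold. Sensitivity is passed to series_decompose_anomalies and controls how readily a data point is flagged: lower values flag more anomalies, and the default is 1.5. Threshold is the minimum volume, in GB per 3-hour bin, that a flagged data point must exceed to be reported. It filters out low-volume noise such as a table moving from 1 GB to 2 GB.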
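To get a feel for the sensitivity parameter before choosing a value, it can help to count how many spikes are flagged at a few different settings. The following is a minimal sketch, not part of the original query: the comparison values 1.0 and 3.0 and every column name starting with Flag, Score, Base or Spikes are illustrative assumptions.

```kusto
// Sketch only: count flagged spikes per table at three sensitivity values.
// The values 1.0 and 3.0 and all Flag*/Score*/Base*/Spikes* names are illustrative.
Usage
| where IsBillable == true
| make-series TableSize=sum(Quantity / 1024) default=0 on TimeGenerated from ago(7d) to now() step 3h by DataType
// series_decompose_anomalies returns three series: anomaly flag, anomaly score, baseline
| extend (FlagLow, ScoreLow, BaseLow) = series_decompose_anomalies(TableSize, 1.0)
| extend (FlagDefault, ScoreDefault, BaseDefault) = series_decompose_anomalies(TableSize, 1.5)
| extend (FlagHigh, ScoreHigh, BaseHigh) = series_decompose_anomalies(TableSize, 3.0)
// Expand the three flag arrays in parallel, one row per 3-hour bin
| mv-expand FlagLow, FlagDefault, FlagHigh
| summarize SpikesAt1_0=countif(FlagLow == 1), SpikesAt1_5=countif(FlagDefault == 1), SpikesAt3_0=countif(FlagHigh == 1) by DataType
| sort by SpikesAt1_5 desc
```

If the lowest setting flags far more points than the others for most tables, the default or a higher value is probably a better starting point for that workspace.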

The query first keeps only the Usage rows where IsBillable is true. It then uses make-series to build one time series per DataType, summing Quantity (reported in MB) divided by 1024 to give GB, in 3-hour bins over the past 7 days. series_decompose_anomalies adds a matching series of anomaly flags, and mv-expand turns the arrays back into one row per bin. The final filter keeps rows where the flag equals 1 and TableSize is greater than the threshold. A flag of 1 marks a positive anomaly (a spike); drops in ingestion are flagged as -1 and are therefore not returned. The distinct operator then reduces the result to the list of affected DataType values.
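The detection steps described above can also be run on their own to list the anomalous data points as rows instead of a chart. The following is a minimal sketch under some assumptions: the column names Flag, Score, Baseline, TableSizeGB and BaselineGB are hypothetical, and the typed mv-expand and rounding are additions that are not in the original query.

```kusto
// Sketch only: same detection as the original query, returning rows instead of a chart.
let sensitivity = 1.5;
let threshold = 2;
Usage
| where IsBillable == true
| make-series TableSize=sum(Quantity / 1024) default=0 on TimeGenerated from ago(7d) to now() step 3h by DataType
// Name all three outputs: anomaly flag, anomaly score and expected baseline
| extend (Flag, Score, Baseline) = series_decompose_anomalies(TableSize, sensitivity)
// Cast the expanded values so later comparisons and rounding work on typed columns;
// Flag is left as-is and compared the same way the original query does
| mv-expand TimeGenerated to typeof(datetime), TableSize to typeof(real), Flag, Score to typeof(real), Baseline to typeof(real)
| where Flag == 1 and TableSize > threshold
| project TimeGenerated, DataType, TableSizeGB=round(TableSize, 2), BaselineGB=round(Baseline, 2), Score=round(Score, 2)
| sort by Score desc
```

Changing the filter to Flag != 0 would include drops in ingestion as well as spikes, although the TableSize threshold would still exclude drops that fall to very low volumes.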

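As written, the query then visualizes the result: it queries Usage a second time, keeps only the DataType values that had an anomaly, rebuilds the same 3-hour series for each, and renders them as a timechart. This part is optional and can be removed to return data instead of a chart.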

Details

Matt Zorich

Released: June 17, 2022

Tables

Usage

Keywords

Detect,Anomalies,Data,Ingested,Sentinel,Workspace,Connector,Usage,Sensitivity,Threshold,Series_decompose_anomalies,Outliers,IsBillable,TableSize,TimeGenerated,DataType,MV-expand,Render,Timechart

Operators

|,let,=,where,IsBillable,true,make-series,sum,default,on,from,ago,to,now,step,by,extend,mv-expand,==,and,>,distinct,in,render,timechart,with,ytitle,title
