简体   繁体   中英

KQL Query for Azure CDN Access Logs using the userAgent column to sort and count by device type

I am new to KQL, so bare with me. My goal is to search through the "Microsoft.Cdn/Profiles/AccessLog/Write" and first filter down to distinct IP address. I have figured out what values in the userAgent_s column that will tell me what devices are which. I am able to search for "macintosh", "ipad", and "iphone" to get the different device types.

I want to create a pie chart that shows the percentage by count of these three devices, but using only distinct IP addresses (only one of each IP address). Here are the three strings to search for in the userAgent_s column that will show which device is what: "macintosh", "ipad", and "iphone".

Here is what some of the data looks like.

TimeGenerated [Local Time]  OperationName   userAgent_s clientIp_s  Type
9/26/2022, 10:48:33.238 AM  Microsoft.Cdn/Profiles/AccessLog/Write  yourApplicationName/4.1.4 (Linux;Android 10) ExoPlayerLib/2.9.2 2405:201:f00c:2015:4fe:9d1f:f77a:c2ab   AzureDiagnostics
9/26/2022, 10:48:07.481 AM  Microsoft.Cdn/Profiles/AccessLog/Write  AppleCoreMedia/1.0.0.14G60 (iPhone; U; CPU OS 10_3_3 like Mac OS X; en_us)  2600:8801:42c:5400:f01f:d3dd:b55f:88de  AzureDiagnostics
9/26/2022, 10:48:56.714 AM  Microsoft.Cdn/Profiles/AccessLog/Write  iTunes/12.12 (Windows; Microsoft Windows 10 x64; x64) AppleWebKit/7613.2007 68.98.143.209   AzureDiagnostics
9/26/2022, 10:47:27.620 AM  Microsoft.Cdn/Profiles/AccessLog/Write  Mozilla/5.0 (Linux; Android 11; motorola one 5G ace) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.0.0 Mobile Safari/537.36   2600:387:15:1637::4 AzureDiagnostics
9/26/2022, 10:47:27.793 AM  Microsoft.Cdn/Profiles/AccessLog/Write  Mozilla/5.0 (Linux; Android 11; motorola one 5G ace) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.0.0 Mobile Safari/537.36   2600:387:15:1637::4 AzureDiagnostics

This is as close as I can get:

AzureDiagnostics
    | where OperationName == "Microsoft.Cdn/Profiles/AccessLog/Write" and Category == "AzureCdnAccessLog"
    | extend MacOS = userAgent_s has "macintosh"
    | extend iPhone = userAgent_s has "iphone"
    | extend iPad = userAgent_s has "iPad"
    | where MacOS == true or iPad == true or iPhone == true
    | summarize Total=dcount(clientIp_s) by MacOS, iPhone, iPad
    //| summarize MacOSTotal=countif(MacOS == true),iPadTotal=countif(iPad == true),iPhoneTotal=countif(iPhone == true)
    | render table

I have also tried something like this:

let MacOSX =
    AzureDiagnostics
    | where OperationName == "Microsoft.Cdn/Profiles/AccessLog/Write" and Category == "AzureCdnAccessLog"
    | where 
        userAgent_s has "macintosh" 
    | summarize MacOSX=dcount(clientIp_s) by bin(TimeGenerated,1h);
let iPhone =
    AzureDiagnostics
    | where OperationName == "Microsoft.Cdn/Profiles/AccessLog/Write" and Category == "AzureCdnAccessLog"
    | where 
        userAgent_s has "iphone"
    | summarize iPhone=dcount(clientIp_s) by bin(TimeGenerated,1h);
let iPad =
    AzureDiagnostics
    | where OperationName == "Microsoft.Cdn/Profiles/AccessLog/Write" and Category == "AzureCdnAccessLog"
    | where 
        userAgent_s has "ipad"
    | summarize iPad=dcount(clientIp_s) by bin(TimeGenerated,1h);
MacOSX
| join iPad on TimeGenerated
| render columnchart

This does it as well, but I want to count the three devices.

| where userAgent_s has "iphone" or userAgent_s has "ipad" or userAgent_s has "macintosh"
| summarize count() by userAgent_s
| render piechart 

Even closer but will not let me make a piechart from it.

AzureDiagnostics
| where userAgent_s has "iphone" or userAgent_s has "ipad" or userAgent_s has "macintosh"
| summarize MacOs=dcountif(clientIp_s, userAgent_s has "macintosh"),
            iPad=dcountif(clientIp_s, userAgent_s has "ipad"),
            iPhone=dcountif(clientIp_s, userAgent_s has "iphone")

I know I am missing something so basic, but I just don't know enough.

As I said in my comment, charts are built upon tabular data.
Pie chart required two columns, category and value .
Your query however results in a pivoted form, that is, each category is stored in a different column.

PS
Log Analytics uses client_ip_s (and not clientIp_s )

// Sample data generation. Not part of the solution. 
let prob = toscalar(range i from 0 to 2 step 1 | summarize make_list(repeat(i, tolong(bin(rand() * 1000, 1)))));
let prob_len = array_length(prob);
let devices = dynamic(["macintosh", "ipad", "iphone"]);
let AzureDiagnostics = range i from 1 to 10000 step 1
| extend userAgent_s = devices[toint(prob[toint(rand(prob_len))])]
        ,client_ip_s = format_ipv4(toint(rand(0x00FFFFFF)));
// Solution starts here
let search_terms = dynamic(["macintosh", "ipad", "iphone"]);
AzureDiagnostics
| where userAgent_s has_any (search_terms)
| mv-apply search_term = search_terms to typeof(string) on (where userAgent_s has search_term)
| summarize dcount(client_ip_s) by search_term
| render piechart 
search_term dcount_client_ip_s
macintosh 5318
ipad 3616
iphone 1030

饼形图

Fiddle

For the fun of it, here is another solution

// Sample data generation. Not part of the solution. 
let prob = toscalar(range i from 0 to 2 step 1 | summarize make_list(repeat(i, tolong(bin(rand() * 1000, 1)))));
let prob_len = array_length(prob);
let devices = dynamic(["macintosh", "ipad", "iphone"]);
let AzureDiagnostics = range i from 1 to 10000 step 1
| extend userAgent_s = tostring(devices[toint(prob[toint(rand(prob_len))])])
        ,client_ip_s = format_ipv4(toint(rand(0x00FFFFFF)));
// Solution starts here
let search_terms = dynamic(["macintosh", "ipad", "iphone"]);
AzureDiagnostics
| where userAgent_s has_any (search_terms)
| extend search_term = extract(strcat(@"\b(", array_strcat(search_terms, "|"), @")\b"), 1, userAgent_s)
| summarize dcount(client_ip_s) by search_term
| render piechart 
search_term dcount_client_ip_s
iphone 4652
macintosh 4494
ipad 827

饼形图

Fiddle

If we want the data on the client IP level (there might be multiple devices under the same client IP, maybe of different types), we might want something like this:

// Sample data generation. Not part of the solution. 
let prob = toscalar(range i from 0 to 2 step 1 | summarize make_list(repeat(i, tolong(bin(rand() * 1000, 1)))));
let prob_len = array_length(prob);
let devices = dynamic(["macintosh", "ipad", "iphone"]);
let AzureDiagnostics = range i from 1 to 10000 step 1
| extend userAgent_s = tostring(devices[toint(prob[toint(rand(prob_len))])])
        ,client_ip_s = format_ipv4(toint(rand(0x00000FFF)));
// Solution starts here
let search_terms = dynamic(["macintosh", "ipad", "iphone"]);
AzureDiagnostics
| where userAgent_s has_any (search_terms)
| extend search_term = extract(strcat(@"\b(", array_strcat(search_terms, "|"), @")\b"), 1, userAgent_s)
| summarize client_devices = array_strcat(array_sort_asc(make_set(search_term)), ",") by client_ip_s
| summarize count() by client_devices
| render piechart 
client_devices count_
ipad,macintosh 874
ipad,iphone 608
ipad,iphone,macintosh 493
iphone,macintosh 133
ipad 1188
iphone 195
macintosh 276

饼形图

Fiddle

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM