Querying a JSON file in Azure Synapse

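I have a JSON file in an Azure storage account that I need to query using a Synapse SQL serverless pool. Running the query below returns the first 10 results from the file. I have copied the sample output to show the content and schema. I need to write a query that returns only the entries whose log contains neither system:serviceaccount:internal-services:spinnaker nor system:serviceaccounts:internal-services, with a time between 2022-05-23T13:45:13.0000000Z and 2022-05-23T17:45:13.0000000Z.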

Can someone help me write this query? The query I ran to get the first 10 results is:

select top 10 *
from openrowset(
        bulk 'https://azdevogs.blob.core.windows.net/insights-logs-kube-audit/resourceId=/SUBSCRIPTIONS/533AEB/RESOURCEGROUPS/AZURE-TEST/PROVIDERS/MICROSOFT.CONTAINERSERVICE/MANAGEDCLUSTERS/AZURE-TEST/y=2022/m=05/d=23/h=13/m=00/PT1H.json',
        format = 'csv',
        fieldterminator ='0x0b',
        fieldquote = '0x0b'
    ) with (doc nvarchar(max)) as rows
go

Result:

[{"data":[["{ \"operationName\": \"Microsoft.ContainerService/managedClusters/diagnosticLogs/Read\", \"category\": \"kube-audit\", \"ccpNamespace\": \"5f40f\", \"resourceId\": \"/SUBSCRIPTIONS/531C3AEB/RESOURCEGROUPS/AZURE-DEV/PROVIDERS/MICROSOFT.CONTAINERSERVICE/MANAGEDCLUSTERS/AZURE-DEV\", \"properties\": {\"log\":\"{\\\"kind\\\":\\\"Event\\\",\\\"apiVersion\\\":\\\"audit.k8s.io/v1\\\",\\\"level\\\":\\\"Metadata\\\",\\\"auditID\\\":\\\"b7bca3\\\",\\\"stage\\\":\\\"ResponseComplete\\\",\\\"requestURI\\\":\\\"/apis/chaos-mesh.org/v1alpha1/namespaces/velero/httpchaos?limit=500\\\",\\\"verb\\\":\\\"list\\\",\\\"user\\\":{\\\"username\\\":\\\"system:serviceaccount:internal-services:spinnaker\\\",\\\"uid\\\":\\\"3feceb35e\\\",\\\"groups\\\":[\\\"system:serviceaccounts\\\",\\\"system:serviceaccounts:internal-services\\\",\\\"system:authenticated\\\"]},\\\"sourceIPs\\\":[\\\"35.205.140.108\\\"],\\\"userAgent\\\":\\\"kubectl/v1.18.10 (linux/amd64) kubernetes/62876fc\\\",\\\"objectRef\\\":{\\\"resource\\\":\\\"httpchaos\\\",\\\"namespace\\\":\\\"velero\\\",\\\"apiGroup\\\":\\\"chaos-mesh.org\\\",\\\"apiVersion\\\":\\\"v1alpha1\\\"},\\\"responseStatus\\\":{\\\"metadata\\\":{},\\\"code\\\":200},\\\"requestReceivedTimestamp\\\":\\\"2022-05-23T13:45:13.140759Z\\\",\\\"stageTimestamp\\\":\\\"2022-05-23T13:45:13.146101Z\\\",\\\"annotations\\\":{\\\"authentication.k8s.io/legacy-token\\\":\\\"system:serviceaccount:internal-services:spinnaker\\\",\\\"authorization.k8s.io/decision\\\":\\\"allow\\\",\\\"authorization.k8s.io/reason\\\":\\\"RBAC: allowed by ClusterRoleBinding \\\\\\\"spinnaker-cluster-admin\\\\\\\" of ClusterRole \\\\\\\"cluster-admin\\\\\\\" to ServiceAccount \\\\\\\"spinnaker/internal-services\\\\\\\"\\\"}}\\n\",\"stream\":\"stdout\",\"pod\":\"kube-apiserver-76d-q68\"}, \"time\": \"2022-05-23T13:45:13.0000000Z\", \"Cloud\": \"AzureCloud\", \"Environment\": \"prod\", \"UnderlayClass\": \"hcp-underlay\", \"UnderlayName\": \"hcp-underlay-westeurope-cx-624\"}"],["{ 
\"operationName\": \"Microsoft.ContainerService/managedClusters/diagnosticLogs/Read\", \"category\": \"kube-audit\", \"ccpNamespace\": \"5ff040f\", \"resourceId\": \"/SUBSCRIPTIONS/531B20C3AEB/RESOURCEGROUPS/AZURE-DEV/PROVIDERS/MICROSOFT.CONTAINERSERVICE/MANAGEDCLUSTERS/AZURE-DEV\", \"properties\": {\"log\":\"{\\\"kind\\\":\\\"Event\\\",\\\"apiVersion\\\":\\\"audit.k8s.io/v1\\\",\\\"level\\\":\\\"Metadata\\\",\\\"auditID\\\":\\\"f2b766d\\\",\\\"stage\\\":\\\"ResponseComplete\\\",\\\"requestURI\\\":\\\"/apis/chaos-mesh.org/v1alpha1/namespaces/velero/iochaos?limit=500\\\",\\\"verb\\\":\\\"list\\\",\\\"user\\\":{\\\"username\\\":\\\"system:serviceaccount:internal-services:spinnaker\\\",\\\"uid\\\":\\\"3fec72feb35e\\\",\\\"groups\\\":[\\\"system:serviceaccounts\\\",\\\"system:serviceaccounts:internal-services\\\",\\\"system:authenticated\\\"]},\\\"sourceIPs\\\":[\\\"35.205.140.108\\\"],\\\"userAgent\\\":\\\"kubectl/v1.18.10 (linux/amd64) kubernetes/62876fc\\\",\\\"objectRef\\\":{\\\"resource\\\":\\\"iochaos\\\",\\\"namespace\\\":\\\"velero\\\",\\\"apiGroup\\\":\\\"chaos-mesh.org\\\",\\\"apiVersion\\\":\\\"v1alpha1\\\"},\\\"responseStatus\\\":{\\\"metadata\\\":{},\\\"code\\\":200},\\\"requestReceivedTimestamp\\\":\\\"2022-05-23T13:45:13.156899Z\\\",\\\"stageTimestamp\\\":\\\"2022-05-23T13:45:13.162219Z\\\",\\\"annotations\\\":{\\\"authentication.k8s.io/legacy-token\\\":\\\"system:serviceaccount:internal-services:spinnaker\\\",\\\"authorization.k8s.io/decision\\\":\\\"allow\\\",\\\"authorization.k8s.io/reason\\\":\\\"RBAC: allowed by ClusterRoleBinding \\\\\\\"spinnaker-cluster-admin\\\\\\\" of ClusterRole \\\\\\\"cluster-admin\\\\\\\" to ServiceAccount \\\\\\\"spinnaker/internal-services\\\\\\\"\\\"}}\\n\",\"stream\":\"stdout\",\"pod\":\"kube-apiserver-768d-q68\"}, \"time\": \"2022-05-23T13:45:13.0000000Z\", \"Cloud\": \"AzureCloud\", \"Environment\": \"prod\", \"UnderlayClass\": \"hcp-underlay\", \"UnderlayName\": 
\"hcp-underlay-westeurope-cx-624\"}"],,"schema":[{"columnName":"doc","ordinal":0,"dataTypeName":"nvarchar"}]]}]

You can use the OPENJSON function to parse your JSON array into a table. That lets you extract the data from the JSON array into a relational format.
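As a minimal sketch of that idea, the snippet below runs OPENJSON over a small literal array instead of a file. The variable name @json and the two sample objects (including the second timestamp and the verb values) are made up for illustration and are not taken from the log file.

```sql
-- Hypothetical sample data: a JSON array with two objects.
DECLARE @json nvarchar(max) = N'[
    {"time": "2022-05-23T13:45:13.0000000Z", "verb": "list"},
    {"time": "2022-05-23T14:10:00.0000000Z", "verb": "get"}
]';

-- OPENJSON with a WITH clause returns one row per array element
-- and maps each JSON path to a typed column.
SELECT [time], verb
FROM OPENJSON(@json)
    WITH (
        [time] nvarchar(50) '$.time',
        verb   nvarchar(20) '$.verb'
    );
```

Each array element becomes one row, and each path in the WITH clause becomes one column.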

The JSON sample you provided is not valid JSON. Make sure the actual data you are querying is valid JSON; otherwise you will get a "JSON text is not properly formatted" error.
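One way to check this before calling OPENJSON is to look for lines that fail ISJSON. This is only a sketch: it assumes the file holds one JSON document per line, the alias names are arbitrary, and the elided BULK path is a placeholder that has to be replaced with the real URL.

```sql
-- List a few lines that are not valid JSON, if there are any.
SELECT TOP 10 rows.jsonContent
FROM OPENROWSET(
    BULK 'https://....core.windows.net/.../PT1H.json',
    FORMAT = 'CSV',
    FIELDQUOTE = '0x0b',
    FIELDTERMINATOR = '0x0b'
)
WITH (
    jsonContent varchar(MAX)
) AS rows
WHERE ISJSON(rows.jsonContent) = 0;  -- ISJSON returns 0 for malformed text
```

If this returns no rows, every line is well-formed and can be passed to OPENJSON.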

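Your data looks like it comes from the AKS diagnostic/audit logs, but the JSON is not in the original log format. Did you deliberately convert it to another structure? For the original Azure diagnostic log structure of the AKS audit logs, the following example SQL query produces a relation with the columns username and time, which you can then filter on: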

SELECT logs.[time], logitem.username
FROM OPENROWSET(
    BULK 'https://....core.windows.net/.../PT1H.json',
    FORMAT = 'CSV',
    FIELDQUOTE = '0x0b',
    FIELDTERMINATOR = '0x0b'
)
WITH (
    jsonContent varchar(MAX)
) AS [result]
-- First pass: read the timestamp and the embedded log string from each line.
CROSS APPLY OPENJSON(jsonContent, '$')
    WITH (
        [time]  nvarchar(max) '$.time',
        logjson nvarchar(max) '$.properties.log'
    ) AS logs
-- Second pass: properties.log is itself JSON stored as a string, so parse it again.
CROSS APPLY OPENJSON(logs.logjson, '$')
    WITH (
        username nvarchar(max) '$.user.username'
    ) AS logitem

With this query you can add simple time and username filters in the WHERE clause, just as in ordinary SQL.
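A sketch of such a filter, using the time range and the two strings from the question, might look like the following. Reading time as datetimeoffset and matching the strings with NOT LIKE against the whole log text are assumptions about what "entries without those strings" should mean; the elided BULK path is left as a placeholder.

```sql
SELECT logs.[time], logitem.username
FROM OPENROWSET(
    BULK 'https://....core.windows.net/.../PT1H.json',
    FORMAT = 'CSV',
    FIELDQUOTE = '0x0b',
    FIELDTERMINATOR = '0x0b'
)
WITH (
    jsonContent varchar(MAX)
) AS [result]
CROSS APPLY OPENJSON(jsonContent, '$')
    WITH (
        [time]  datetimeoffset '$.time',          -- assumption: typed so it can be range-filtered
        logjson nvarchar(max)  '$.properties.log'
    ) AS logs
CROSS APPLY OPENJSON(logs.logjson, '$')
    WITH (
        username nvarchar(max) '$.user.username'
    ) AS logitem
WHERE logs.[time] >= CAST('2022-05-23T13:45:13Z' AS datetimeoffset)
  AND logs.[time] <= CAST('2022-05-23T17:45:13Z' AS datetimeoffset)
  -- Assumption: exclude any entry whose log text mentions either string.
  AND logs.logjson NOT LIKE '%system:serviceaccount:internal-services:spinnaker%'
  AND logs.logjson NOT LIKE '%system:serviceaccounts:internal-services%';
```

To exclude only entries made by that user, rather than every entry that mentions it, compare logitem.username instead of searching logs.logjson. Note also that a single PT1H.json file holds one hour of logs, so a range spanning several hours would need to read several files.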

You can find more information about the OPENJSON syntax here: https://docs.microsoft.com/en-us/sql/t-sql/functions/openjson-transact-sql
