繁体   English   中英

转换和读取 Azure Synapse notebook 中的 json 文件

[英]Transforming and Reading json files in Azure Synapse notebook

我的一个存储帐户中有以下 Json ,我可以按照以下代码阅读它。 我需要帮助来阅读“pod”的值为“kube-apiserver-78”或“kube-apiserver-79”且用户名的值为“system:serviceaccount:xyz”或“system:serviceaccount:poq”的列:有人可以帮忙吗我如何在代码下面翻译它。

df = spark.read.json('abfss://insights-logs-kube-audit@azogs.dfs.core.windows.net/resourceId=/SUBSCRIPTIONS/5IS/RESOURCEGROUPS/AZURE-DEV/PROVIDERS/MICROSOFT.CONTAINERSERVICE/MANAGEDCLUSTERS/AZURE-DEV/y=2022/m=08/d=09/h=11/m=00/')

df.show()

我读到的存储容器中的示例 Json 文件:

{ "operationName": "Microsoft.ContainerService/managedClusters/diagnosticLogs/Read", "category": "kube-audit", "ccpNamespace": "5f", "resourceId": "/SUBSCRIPTIONS/SID/RESOURCEGROUPS/AZURE-DEV/PROVIDERS/MICROSOFT.CONTAINERSERVICE/MANAGEDCLUSTERS/AZURE-DEV", "properties": {"log":"{\"kind\":\"Event\",\"apiVersion\":\"audit.k8s.io/v1\",\"level\":\"Metadata\",\"auditID\":\"b7b1ca3\",\"stage\":\"ResponseComplete\",\"requestURI\":\"/apis/chaos-mesh.org/v1alpha1/namespaces/ve/httpchaos?limit=500\",\"verb\":\"list\",\"user\":{\"username\":\"system:serviceaccount:xyz\",\"uid\":\"3eb35e\",\"groups\":[\"system:serviceaccounts\",\"system:serviceaccounts:internal-services\",\"system:authenticated\"]},\"sourceIPs\":[\"100.100.100.100\"],\"userAgent\":\"ktl/v1.18.10 (linux/amd64) kubernetes/62c\",\"objectRef\":{\"resource\":\"httpchaos\",\"namespace\":\"vo\",\"apiGroup\":\"chaos-mesh.org\",\"apiVersion\":\"v1alpha1\"},\"responseStatus\":{\"metadata\":{},\"code\":200},\"requestReceivedTimestamp\":\"2022-05-23T13:45:13.140759Z\",\"stageTimestamp\":\"2022-05-23T13:45:13.146101Z\",\"annotations\":{\"authentication.k8s.io/legacy-token\":\"system:serviceaccount:ixyzr\",\"authorization.k8s.io/decision\":\"allow\",\"authorization.k8s.io/reason\":\"RBAC: allowed by ClusterRoleBinding \\\"admin\\\" of ClusterRole \\\"cluster-admin\\\" to ServiceAccount \\\"abc/xyz\\\"\"}}\n","stream":"stdout","pod":"kube-apiserver-78"}, "time": "2022-05-23T13:45:13.0000000Z", "Cloud": "AzureCloud", "Environment": "prod", "UnderlayClass": "hcp-underlay", "UnderlayName": "h-24"}
{ "operationName": "Microsoft.ContainerService/managedClusters/diagnosticLogs/Read", "category": "kube-audit", "ccpNamespace": "5f", "resourceId": "/SUBSCRIPTIONS/SID/RESOURCEGROUPS/AZURE-DEV/PROVIDERS/MICROSOFT.CONTAINERSERVICE/MANAGEDCLUSTERS/AZURE-DEV", "properties": {"log":"{\"kind\":\"Event\",\"apiVersion\":\"audit.k8s.io/v1\",\"level\":\"Metadata\",\"auditID\":\"b7b1cax3\",\"stage\":\"ResponseComplete\",\"requestURI\":\"/apis/chaos-mesh.org/v1alpha1/namespaces/ve/httpchaos?limit=500\",\"verb\":\"list\",\"user\":{\"username\":\"system:serviceaccount:xyz\",\"uid\":\"3eb35e\",\"groups\":[\"system:serviceaccounts\",\"system:serviceaccounts:internal-services\",\"system:authenticated\"]},\"sourceIPs\":[\"100.100.100.100\"],\"userAgent\":\"ktl/v1.18.10 (linux/amd64) kubernetes/62c\",\"objectRef\":{\"resource\":\"httpchaos\",\"namespace\":\"vo\",\"apiGroup\":\"chaos-mesh.org\",\"apiVersion\":\"v1alpha1\"},\"responseStatus\":{\"metadata\":{},\"code\":200},\"requestReceivedTimestamp\":\"2022-05-23T13:45:13.140759Z\",\"stageTimestamp\":\"2022-05-23T13:45:13.146101Z\",\"annotations\":{\"authentication.k8s.io/legacy-token\":\"system:serviceaccount:ixyzr\",\"authorization.k8s.io/decision\":\"allow\",\"authorization.k8s.io/reason\":\"RBAC: allowed by ClusterRoleBinding \\\"admin\\\" of ClusterRole \\\"cluster-admin\\\" to ServiceAccount \\\"abc/xyz\\\"\"}}\n","stream":"stdout","pod":"kube-apiserver-78"}, "time": "2022-05-23T13:45:13.0000000Z", "Cloud": "AzureCloud", "Environment": "prod", "UnderlayClass": "hcp-underlay", "UnderlayName": "h-24"}

查询Json文件读取后将其转换为 Apache Spark 中的时态表,并使用 Spark SQL 查询它们。

要将其转换为临时表,请使用命令:

df.createOrReplaceTempView("Name for temporal table")

然后使用 Spark SQL 查询这个时态表。

SELECT * FROM "Name for temporal table"
WHERE (pod = 'kube-apiserver-78' or pod = 'kube-apiserver-79') 
and (username = 'system:serviceaccount:xyz' or username = 'system:serviceaccount:poq')

参考: 使用 Azure Synapse Analytics Notebooks 查询 JSON 文件

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM