Transforming and Reading json files in Azure Synapse notebook

I have the JSON below in one of my storage accounts, and I am able to read it with the following code. I need help selecting only the records where "pod" has the value "kube-apiserver-78" or "kube-apiserver-79" and the username is "system:serviceaccount:xyz" or "system:serviceaccount:poq". Can someone help me work out how to add that to the code below?

df = spark.read.json('abfss://insights-logs-kube-audit@azogs.dfs.core.windows.net/resourceId=/SUBSCRIPTIONS/5IS/RESOURCEGROUPS/AZURE-DEV/PROVIDERS/MICROSOFT.CONTAINERSERVICE/MANAGEDCLUSTERS/AZURE-DEV/y=2022/m=08/d=09/h=11/m=00/')

df.show()

Sample JSON file in the storage container, which I read:

{ "operationName": "Microsoft.ContainerService/managedClusters/diagnosticLogs/Read", "category": "kube-audit", "ccpNamespace": "5f", "resourceId": "/SUBSCRIPTIONS/SID/RESOURCEGROUPS/AZURE-DEV/PROVIDERS/MICROSOFT.CONTAINERSERVICE/MANAGEDCLUSTERS/AZURE-DEV", "properties": {"log":"{\"kind\":\"Event\",\"apiVersion\":\"audit.k8s.io/v1\",\"level\":\"Metadata\",\"auditID\":\"b7b1ca3\",\"stage\":\"ResponseComplete\",\"requestURI\":\"/apis/chaos-mesh.org/v1alpha1/namespaces/ve/httpchaos?limit=500\",\"verb\":\"list\",\"user\":{\"username\":\"system:serviceaccount:xyz\",\"uid\":\"3eb35e\",\"groups\":[\"system:serviceaccounts\",\"system:serviceaccounts:internal-services\",\"system:authenticated\"]},\"sourceIPs\":[\"100.100.100.100\"],\"userAgent\":\"ktl/v1.18.10 (linux/amd64) kubernetes/62c\",\"objectRef\":{\"resource\":\"httpchaos\",\"namespace\":\"vo\",\"apiGroup\":\"chaos-mesh.org\",\"apiVersion\":\"v1alpha1\"},\"responseStatus\":{\"metadata\":{},\"code\":200},\"requestReceivedTimestamp\":\"2022-05-23T13:45:13.140759Z\",\"stageTimestamp\":\"2022-05-23T13:45:13.146101Z\",\"annotations\":{\"authentication.k8s.io/legacy-token\":\"system:serviceaccount:ixyzr\",\"authorization.k8s.io/decision\":\"allow\",\"authorization.k8s.io/reason\":\"RBAC: allowed by ClusterRoleBinding \\\"admin\\\" of ClusterRole \\\"cluster-admin\\\" to ServiceAccount \\\"abc/xyz\\\"\"}}\n","stream":"stdout","pod":"kube-apiserver-78"}, "time": "2022-05-23T13:45:13.0000000Z", "Cloud": "AzureCloud", "Environment": "prod", "UnderlayClass": "hcp-underlay", "UnderlayName": "h-24"}
{ "operationName": "Microsoft.ContainerService/managedClusters/diagnosticLogs/Read", "category": "kube-audit", "ccpNamespace": "5f", "resourceId": "/SUBSCRIPTIONS/SID/RESOURCEGROUPS/AZURE-DEV/PROVIDERS/MICROSOFT.CONTAINERSERVICE/MANAGEDCLUSTERS/AZURE-DEV", "properties": {"log":"{\"kind\":\"Event\",\"apiVersion\":\"audit.k8s.io/v1\",\"level\":\"Metadata\",\"auditID\":\"b7b1cax3\",\"stage\":\"ResponseComplete\",\"requestURI\":\"/apis/chaos-mesh.org/v1alpha1/namespaces/ve/httpchaos?limit=500\",\"verb\":\"list\",\"user\":{\"username\":\"system:serviceaccount:xyz\",\"uid\":\"3eb35e\",\"groups\":[\"system:serviceaccounts\",\"system:serviceaccounts:internal-services\",\"system:authenticated\"]},\"sourceIPs\":[\"100.100.100.100\"],\"userAgent\":\"ktl/v1.18.10 (linux/amd64) kubernetes/62c\",\"objectRef\":{\"resource\":\"httpchaos\",\"namespace\":\"vo\",\"apiGroup\":\"chaos-mesh.org\",\"apiVersion\":\"v1alpha1\"},\"responseStatus\":{\"metadata\":{},\"code\":200},\"requestReceivedTimestamp\":\"2022-05-23T13:45:13.140759Z\",\"stageTimestamp\":\"2022-05-23T13:45:13.146101Z\",\"annotations\":{\"authentication.k8s.io/legacy-token\":\"system:serviceaccount:ixyzr\",\"authorization.k8s.io/decision\":\"allow\",\"authorization.k8s.io/reason\":\"RBAC: allowed by ClusterRoleBinding \\\"admin\\\" of ClusterRole \\\"cluster-admin\\\" to ServiceAccount \\\"abc/xyz\\\"\"}}\n","stream":"stdout","pod":"kube-apiserver-78"}, "time": "2022-05-23T13:45:13.0000000Z", "Cloud": "AzureCloud", "Environment": "prod", "UnderlayClass": "hcp-underlay", "UnderlayName": "h-24"}
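Note on the sample's structure: "pod" is nested under "properties", and the username is one level deeper, inside "properties.log", which is itself a JSON document stored as a string. The following is a minimal plain-Python sketch of that double decoding, using a trimmed record built from the field names in the sample; the helper name extract_pod_and_username is hypothetical and only for illustration.

```python
import json

# Trimmed stand-in for one sample record: "log" holds a JSON document
# encoded as a string (with a trailing newline, as in the sample).
record_line = json.dumps({
    "category": "kube-audit",
    "properties": {
        "log": json.dumps({
            "kind": "Event",
            "verb": "list",
            "user": {"username": "system:serviceaccount:xyz"},
        }) + "\n",
        "stream": "stdout",
        "pod": "kube-apiserver-78",
    },
})


def extract_pod_and_username(line):
    """Return (pod, username) from one kube-audit log line (hypothetical helper)."""
    record = json.loads(line)               # decode the outer record
    props = record["properties"]
    audit_event = json.loads(props["log"])  # second decode: log is a JSON string
    return props["pod"], audit_event["user"]["username"]


pod, username = extract_pod_and_username(record_line)
print(pod, username)
```

Any filter on the username therefore has to parse "properties.log" first; a top-level column called username does not exist in the data as read.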

After reading the JSON files, register the DataFrame as a temporary view in Apache Spark and query it using Spark SQL.

To register it as a temporary view, use the command:

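df.createOrReplaceTempView("kube_audit")  # "kube_audit" is only an example view name

Then query this temporary view using Spark SQL. In the sample, pod is a field of the properties struct, and the username sits inside the JSON string held in properties.log, so it is extracted with get_json_object:

SELECT * FROM kube_audit
WHERE properties.pod IN ('kube-apiserver-78', 'kube-apiserver-79')
AND get_json_object(properties.log, '$.user.username') IN ('system:serviceaccount:xyz', 'system:serviceaccount:poq')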

Reference: Query JSON Files with Azure Synapse Analytics Notebooks
