Is there a way to calculate the number of rows by table, schema and catalog in Databricks SQL (Spark SQL)?

I need to create a dashboard inside Databricks that summarizes the number of rows currently held in the workspace.

Is there a way to write a SQL query that calculates the number of rows by table, schema, and catalog? The expected result would be:

Catalog            Schema       Table            Rows
example_catalog_1  Finance      table_example_1  1567000
example_catalog_1  Finance      table_example_2  67000
example_catalog_2  Procurement  table_example_1  45324888
example_catalog_2  Procurement  table_example_2  89765987
example_catalog_2  Procurement  table_example_3  145000

Currently, I am working in a pure SQL workflow, so I would like to know whether this can be done with SQL alone, because as far as I know, dashboards in Databricks do not accept PySpark code.

I have been looking for a way to do that. I know that the tables in the workspace can be listed through system.information_schema.tables, but how can I use it to count the total rows of each table listed there?

I checked that in SQL Server this is possible via the sys schema, a dynamic query, or a BEGIN...END block, but I couldn't find a way to do the same in Databricks.

I strongly doubt that you can run that kind of query in a Databricks dashboard. The link shared by @Sharma is more about how to get the record count using a DataFrame, not about how to connect that with a Databricks dashboard.
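For what it's worth, one SQL-only workaround might be to let system.information_schema.tables generate the counting statement rather than execute it. The sketch below is not a confirmed dashboard solution. It assumes a Unity Catalog workspace where system.information_schema.tables exposes table_catalog, table_schema, table_name and table_type. The output aliases (catalog_name, schema_name, row_count, count_query) and the table_type filter are illustrative choices, not something taken from the question.

```sql
-- Step 1: build the text of one big UNION ALL statement,
-- with one COUNT(*) branch per table visible to the current user.
-- Double-quoted literals are used so the single quotes inside them
-- need no escaping.
SELECT
  concat_ws(
    ' UNION ALL ',
    collect_list(
      concat(
        "SELECT '", table_catalog, "' AS catalog_name, '",
        table_schema, "' AS schema_name, '",
        table_name, "' AS table_name, COUNT(*) AS row_count FROM `",
        table_catalog, "`.`", table_schema, "`.`", table_name, "`"
      )
    )
  ) AS count_query
FROM system.information_schema.tables
WHERE table_schema <> 'information_schema'      -- skip the metadata views themselves
  AND table_type IN ('MANAGED', 'EXTERNAL');    -- assumed filter: plain tables only

-- Step 2 (manual): copy the value of count_query and run it as its own query.
-- Each branch has the shape:
--   SELECT '<catalog>' AS catalog_name, '<schema>' AS schema_name,
--          '<table>' AS table_name, COUNT(*) AS row_count
--   FROM `<catalog>`.`<schema>`.`<table>`
```

The generated statement is static text, so it has to be regenerated whenever tables are added or dropped, and names containing quotes or backticks would break it. Running COUNT(*) over every table can also be expensive. A scheduled query that writes the result into a small summary table, which the dashboard then reads, may be a more practical arrangement. Whether the generated text can be executed dynamically in the same session depends on the runtime and is not confirmed here.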


 