SQL: count the number of distinct rows in a GBQ table

Question

I'd like to count the number of distinct rows in a table. I know that I can do that using GROUP BY or by naming all the columns one by one, but I would like to just do:

select count(distinct *) from my_table

Is that possible?

Answer 1

Do a SELECT DISTINCT in a derived table (the subquery), then count the number of rows returned.

select count(*) from
(select distinct * from my_table) dt

(Doesn't your table have a primary key?)
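
To see how the derived-table approach behaves when duplicates are present, here is a minimal sketch that compares the total row count with the distinct row count. The inline rows and the column names id and name are made up for illustration; they stand in for the real my_table.

```sql
-- Hypothetical sample data standing in for my_table (one fully duplicated row).
with my_table as (
  select 1 as id, 'a' as name union all
  select 1, 'a' union all
  select 2, 'b'
)
select
  -- every row, duplicates included: 3
  (select count(*) from my_table) as total_rows,
  -- rows left after SELECT DISTINCT in the derived table: 2
  (select count(*) from (select distinct * from my_table) dt) as distinct_rows;
```

If the two numbers differ, the table contains fully duplicated rows.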

Answer 2

You can use to_json_string():

select count(distinct to_json_string(t))
from t;
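
As a rough illustration of why this works: referring to the table alias t as a value yields the whole row, to_json_string() turns that row into a single string, and count(distinct ...) then counts the distinct strings. The sample rows below are assumptions, not data from the question.

```sql
-- Hypothetical sample data; the CTE is named t to match the query above.
with t as (
  select 1 as id, 'a' as name union all
  select 1, 'a' union all
  select 2, 'b'
)
select
  -- each row is serialized to one JSON string, e.g. {"id":1,"name":"a"},
  -- so identical rows collapse to the same string: the result is 2
  count(distinct to_json_string(t)) as distinct_rows
from t;
```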

Answer 3

Below are more options for BigQuery Standard SQL:

select count(distinct format('%t', t))
from `project.dataset.table` t

Depending on your use case, an approximate count can be an even better option:

select approx_count_distinct(format('%t', t))
from `project.dataset.table` t

APPROX_COUNT_DISTINCT returns the approximate result for COUNT(DISTINCT expression). The value returned is a statistical estimate, not necessarily the actual value. This function is less accurate than COUNT(DISTINCT expression), but performs better on huge input.
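
The two variants can be run side by side to compare the exact and the approximate result. This is only a sketch: the CTE named sample and its columns are hypothetical and stand in for `project.dataset.table`.

```sql
-- Hypothetical inline data; in practice this would be `project.dataset.table`.
with sample as (
  select 1 as id, 'a' as name union all
  select 1, 'a' union all
  select 2, 'b'
)
select
  -- exact count of distinct rows (2 for this data)
  count(distinct format('%t', t)) as exact_distinct_rows,
  -- statistical estimate of the same quantity
  approx_count_distinct(format('%t', t)) as approx_distinct_rows
from sample t;
```

On a table this small the difference does not matter; the approximate version is meant for very large inputs where an exact distinct count is expensive.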

Answer 4

The use of count(distinct *) is not permitted.

Alternatively, you could explicitly name the columns that define uniqueness.
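
One way to sketch this, assuming that the hypothetical columns id and name are what define uniqueness and that note should be ignored, is to select only those columns with DISTINCT and count the result:

```sql
-- Hypothetical sample data: id and name define uniqueness, note does not.
with my_table as (
  select 1 as id, 'a' as name, 'x' as note union all
  select 1, 'a', 'y' union all
  select 2, 'b', 'z'
)
select count(*) as distinct_keys   -- 2: (1, 'a') and (2, 'b')
from (select distinct id, name from my_table) dt;
```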

4 answers

Answer 1
3 2020-11-17 20:16:50

Answer 2
2 accepted 2020-11-17 20:35:01

Answer 3
1 2020-11-17 20:58:14

Answer 4
0 2020-11-17 20:20:25