
SQL Count Matches

I have a list of 191 values that I want to compare against a column. Ultimately I want the percentage of rows whose value appears in my master list ( matches/(non-matches + NULL) ).

I know I could do something like the query below, but I am wondering whether it is the most efficient way. Is it possible to create an array that stores the values and check against that? I'm not sure what the best practice is, given that I have 191 values to check against.

I am hoping to avoid dropping 191 comma-separated values into the argument, since that hurts formatting and readability. Is there a way to store these values in an array or temp table so I can just drop a shorthand variable into the actual query? Or is the method below (or an average-based method) still the best way to go, regardless of how many values there are to check against?

SELECT
    SUM(CASE WHEN COALESCE(field, '') IN (COMMA SEPARATED VALUES) THEN 1 ELSE 0 END) as matches,
    COUNT(COALESCE(field)) as total_rows
FROM table

Also, I believe that COUNT(*) and COUNT(1) are blind to NULL fields, so can anyone confirm whether using COUNT(COALESCE(FIELD)) ensures the count includes NULL values from the field?

Presto doesn't support temporary tables, but you can improve the readability of your query by using an inline table (a WITH clause combined with VALUES), which avoids a long list of values inside the aggregation expression.

Then you can count the number of matches as follows. Note the use of FILTER for improved readability.

count(*) FILTER (WHERE field IN (SELECT value FROM data))

Here's a complete example:

WITH data(value) as (VALUES
   'value1',
   'value2',
   ...
)
SELECT
    count(*) AS total_rows,
    count(*) FILTER (WHERE field IN (SELECT value FROM data)) AS matches
FROM t
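
To sanity-check the shape of that query without a Presto cluster, here is a minimal sketch that runs it against an in-memory SQLite database from Python. SQLite is only a stand-in, and the sample rows and the two-value list are made up for illustration. FILTER is swapped for an equivalent SUM(CASE ...) so the sketch also runs on older SQLite builds. It also shows that count(*) counts every row, including rows where field is NULL, whereas count(field) skips them.

```python
import sqlite3

# Assumption: SQLite stands in for Presto; table t and its rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (field TEXT)")
conn.executemany(
    "INSERT INTO t (field) VALUES (?)",
    [("value1",), ("value2",), ("other",), (None,)],  # one NULL row on purpose
)

query = """
WITH data(value) AS (VALUES ('value1'), ('value2'))
SELECT
    count(*)     AS total_rows,     -- counts all rows, NULL or not
    count(field) AS non_null_rows,  -- skips rows where field IS NULL
    -- same result as count(*) FILTER (WHERE field IN (...)) in Presto
    sum(CASE WHEN field IN (SELECT value FROM data) THEN 1 ELSE 0 END) AS matches
FROM t
"""
total_rows, non_null_rows, matches = conn.execute(query).fetchone()
print(total_rows, non_null_rows, matches)  # 4 3 2
```

The NULL row is included in total_rows but never counted as a match, because NULL IN (...) does not evaluate to true.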

I think you just want:

SELECT AVG(CASE WHEN field IN (COMMA SEPARATED VALUES) THEN 1.0 ELSE 0 END) as match_ratio
FROM table
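
A rough sketch of the AVG approach, again using an in-memory SQLite database from Python as a stand-in for Presto. The table contents and the short `wanted` list are hypothetical. It assumes the query is issued from client code, so the list of values can live in a variable and be bound as parameters instead of being pasted into the SQL text.

```python
import sqlite3

# Assumption: SQLite stands in for Presto; mytable and its rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (field TEXT)")
conn.executemany(
    "INSERT INTO mytable (field) VALUES (?)",
    [("a",), ("b",), ("x",), (None,)],
)

# Hypothetical master list; in the question this would hold the 191 values.
wanted = ["a", "b"]

# One "?" placeholder per value, so the SQL text stays short and readable.
placeholders = ", ".join("?" for _ in wanted)
query = (
    "SELECT avg(CASE WHEN field IN (" + placeholders + ") THEN 1.0 ELSE 0 END) "
    "AS match_ratio FROM mytable"
)

(match_ratio,) = conn.execute(query, wanted).fetchone()
print(match_ratio)  # 0.5
```

Rows where field is NULL fall into the ELSE branch, so they count toward the denominator but not toward the matches.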

One option uses avg():

select avg(case when field in (<<csv list>>) then 1.0 else 0 end) rows_ratio
from mytable

It might be simpler to use an array to pass the values:

select avg(case when contains(<< array of values>>, field) then 1.0 else 0 end) rows_ratio
from mytable

