Getting counts of row data

There's a table variable (I'll write it as a regular table here):

CREATE TABLE TEST (memberid int, producttype varchar(7))

This table has hundreds of thousands of rows, but for this example I've added far fewer:

Insert into test values(1,'book')
Insert into test values(1,'clothes')
Insert into test values(2,'book')
Insert into test values(3,'book')
Insert into test values(4,'clothes')
Insert into test values(5,'book')
Insert into test values(5,'clothes')
Insert into test values(6,'book')
Insert into test values(7,'book')

I need to get:

  • the memberids that have 'book' only
  • the memberids that have 'clothes' only
  • the memberids that have both 'book' and 'clothes'

For example:

Member     Book      Clothes      Both
  1          0          0           1
  2          1          0           0
  3          1          0           0
  4          0          1           0
  5          0          0           1
  6          1          0           0 
  7          1          0           0

I managed to get it working with sub-queries, but because of the size of the table it can take over 2 minutes to run.

Does anyone know a better way to achieve this? I would appreciate any suggestions.

One method uses conditional aggregation:

select 
    memberid,
    case when max(producttype) = 'book' then 1 else 0 end book,
    case when min(producttype) = 'clothes' then 1 else 0 end clothes,
    case when min(producttype) <> max(producttype) then 1 else 0 end both
from test
group by memberid

This works because there are only two possible producttype values, so min(producttype) and max(producttype) differ exactly when a member has both. If you actually have more, then you need some expressions that are more complicated (and possibly more efficient), such as:

case when count(*) = sum(case when producttype = 'book' then 1 end)
    then 1
    else 0
end book
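As a rough sketch of how that fragment could extend to all three flags, the query below assumes that "book only" means every row for the member is 'book' (likewise for 'clothes'), and that "both" means at least one row of each type, regardless of any other types present. The alias is bracketed as [both] purely as a precaution; none of this is confirmed by the original answer.

```sql
-- Sketch only: flags that do not rely on there being exactly two producttype values.
select
    memberid,
    -- every row for this member is 'book'
    case when count(*) = sum(case when producttype = 'book' then 1 else 0 end)
         then 1 else 0 end as book,
    -- every row for this member is 'clothes'
    case when count(*) = sum(case when producttype = 'clothes' then 1 else 0 end)
         then 1 else 0 end as clothes,
    -- at least one 'book' row and at least one 'clothes' row
    case when max(case when producttype = 'book' then 1 else 0 end) = 1
          and max(case when producttype = 'clothes' then 1 else 0 end) = 1
         then 1 else 0 end as [both]
from test
group by memberid;
```

With only 'book' and 'clothes' in the table, this should produce the same flags as the min/max version.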

Use a CTE to work out whether each member has book and/or clothes:

with cte as (
  select memberid,
    count(distinct case when producttype = 'book' then 1 end) book_flag,
    count(distinct case when producttype = 'clothes' then 1 end) clothes_flag
  from test 
  group by memberid
)
select memberid,
  case when book_flag > clothes_flag then 1 else 0 end book,
  case when clothes_flag > book_flag then 1 else 0 end clothes,
  book_flag * clothes_flag both
from cte

See the demo.
Results:

> memberid | book | clothes | both
> -------: | ---: | ------: | ---:
>        1 |    0 |       0 |    1
>        2 |    1 |       0 |    0
>        3 |    1 |       0 |    0
>        4 |    0 |       1 |    0
>        5 |    0 |       0 |    1
>        6 |    1 |       0 |    0
>        7 |    1 |       0 |    0

A table variable with hundreds of thousands of rows is going to be problematic for you.

If you check your query plan, you'll likely see that the optimizer expects that table variable to contain only one row.
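If switching away from the table variable is not an option, one commonly used workaround (not part of the original answer) is to add OPTION (RECOMPILE) to the statement, so that it is compiled after the table variable has been populated. The sketch below uses a hypothetical @test variable with a few sample rows; whether it helps enough on your data is something to confirm in the actual plan.

```sql
-- Hypothetical table variable standing in for the one in the question.
declare @test table (memberid int, producttype varchar(7));

insert into @test (memberid, producttype) values
    (1, 'book'), (1, 'clothes'), (2, 'book');

-- OPTION (RECOMPILE) makes SQL Server compile this statement when it runs,
-- after @test is populated, so the row estimate can reflect the real row count.
select
    memberid,
    case when max(producttype) = 'book' then 1 else 0 end as book,
    case when min(producttype) = 'clothes' then 1 else 0 end as clothes,
    case when min(producttype) <> max(producttype) then 1 else 0 end as [both]
from @test
group by memberid
option (recompile);
```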

Changing the structure to a local temp table, and perhaps adding an index on producttype, should significantly improve the performance of the query even before you optimize your code.

CREATE TABLE #TEST (memberid int, producttype varchar(7));

CREATE NONCLUSTERED INDEX tempTest ON #TEST(producttype);
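
Putting the pieces together, a minimal end-to-end sketch might look like the following. The @test table variable and its sample rows are stand-ins for the real data, and the final query is the conditional-aggregation approach from the earlier answer; treat it as an illustration rather than a tuned solution.

```sql
-- Hypothetical table variable standing in for the one in the question.
declare @test table (memberid int, producttype varchar(7));

insert into @test (memberid, producttype) values
    (1, 'book'), (1, 'clothes'), (2, 'book'), (3, 'book'), (4, 'clothes'),
    (5, 'book'), (5, 'clothes'), (6, 'book'), (7, 'book');

-- Copy the rows into a local temp table, which has statistics the optimizer can use.
create table #TEST (memberid int, producttype varchar(7));

insert into #TEST (memberid, producttype)
select memberid, producttype
from @test;

create nonclustered index tempTest on #TEST (producttype);

-- Same conditional aggregation as before, now reading from the temp table.
select
    memberid,
    case when max(producttype) = 'book' then 1 else 0 end as book,
    case when min(producttype) = 'clothes' then 1 else 0 end as clothes,
    case when min(producttype) <> max(producttype) then 1 else 0 end as [both]
from #TEST
group by memberid
order by memberid;

drop table #TEST;
```

Because the query groups by memberid, an index that leads with memberid, for example on (memberid, producttype), may be worth comparing against the producttype-only index; that is an assumption to test against the real plan.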
