How do I count unique combinations from a detail table?

I need to write a SQL query that counts the number of unique combinations of records. I have a table of items with a child table listing the options for each item. Each item may have anywhere from 0 to x options. I want to count how many times each combination occurs. I thought I could take the child table and transpose it using PIVOT and UNPIVOT, but I haven't figured it out. I then tried creating a list of the combinations, but I don't know how to count the occurrences. Can someone show me how to do this or point me in the right direction?

Here is the table I want to use:

Item   |  Option 
----------------
1      |  A
1      |  B
2      |  B
3      |  B
4      |  B
4      |  C
5      |  A
6      |  A
6      |  B
6      |  C
7      |  A
7      |  B
7      |  C
8      |  A
8      |  B
9      |  A
10     |  A
10     |  B

The results I want are these:

Option 1  | Option 2  |  Option 3  |  Count
--------------------------------------------
A         | B         |            |  3       * 1, 8, 10
B         |           |            |  2       * 2, 3
B         | C         |            |  1       * 4
A         |           |            |  2       * 5, 9
A         | B         | C          |  2       * 6, 7

This says that the combination A and B occurred three times, B was the only option picked twice, and B and C were picked together once. (The numbers after the asterisk aren't part of the result; they're just there to show which items are being counted.)

The closest I've come is the query below. It gives me the unique combinations, but doesn't tell me how many times each combination occurred:

SELECT ItemCombo, Count(*) AS ItemComboCount
FROM
(
    SELECT
        Item       
          ,STUFF((SELECT ',' + CAST(Option AS varchar(MAX))
                  FROM itemDetail a 
                  WHERE a.Item = b.Item
                  FOR XML PATH(''), TYPE).value('.', 'VARCHAR(MAX)'),1,1,''
                  ) AS ItemCombo
    FROM itemDetail b
) AS Combos
GROUP BY ItemCombo
ORDER BY Count(*) DESC

You should GROUP BY Item in the inner query, so that each item contributes exactly one row, and also ORDER BY Option inside the concatenating subquery, so that the same set of options always produces the same string and the concatenated values group correctly.

SELECT ItemCombo, Count(*) AS ItemComboCount
FROM
(
    SELECT
        Item       
          ,STUFF((SELECT ',' + CAST([Option] AS varchar(MAX)) -- OPTION is a reserved word in T-SQL, so bracket it
                  FROM itemDetail a 
                  WHERE a.Item = b.Item
                  ORDER BY [Option]
                  FOR XML PATH(''), TYPE).value('.', 'VARCHAR(MAX)'),1,1,''
                  ) AS ItemCombo
    FROM itemDetail b
    GROUP BY item
) AS Combos
GROUP BY ItemCombo
ORDER BY Count(*) DESC
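
On SQL Server 2017 or later, the same idea can probably be expressed with STRING_AGG instead of the FOR XML PATH trick. The following is only a sketch of that alternative, not part of the original answer. It assumes the table is itemDetail(Item, [Option]) as in the question and that each (Item, [Option]) pair appears at most once.

```sql
-- Sketch, assuming SQL Server 2017+ (STRING_AGG) and no duplicate
-- (Item, [Option]) rows in itemDetail.
SELECT ItemCombo, COUNT(*) AS ItemComboCount
FROM
(
    SELECT
        Item
        -- WITHIN GROUP fixes the concatenation order, so the same set of
        -- options always yields the same string (never both 'A,B' and 'B,A').
        ,STRING_AGG(CAST([Option] AS varchar(MAX)), ',')
            WITHIN GROUP (ORDER BY [Option]) AS ItemCombo
    FROM itemDetail
    GROUP BY Item
) AS Combos
GROUP BY ItemCombo
ORDER BY COUNT(*) DESC;
```

With the sample data from the question, this should report A,B three times, A, B and A,B,C twice each, and B,C once.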

To address the additional requirement you mentioned in the comments, I would add a CTE, some more XML processing, and dynamic T-SQL to Vamsi Prabhala's excellent answer (+1 from my side):

--create test table
create table  tmp (Item int, [Option] char(1))

--populate test table
insert into tmp values ( 1, 'A') ,( 1, 'B') ,( 2, 'B') ,( 3, 'B') ,( 4, 'B') ,( 4, 'C') ,( 5, 'A') ,( 6, 'A') ,( 6, 'B') ,( 6, 'C') ,( 7, 'A') ,( 7, 'B') ,( 7, 'C') ,( 8, 'A') ,( 8, 'B') ,( 9, 'A') ,(10, 'A') ,(10, 'B')

declare @count         int
declare @loop          int = 1
declare @dynamicColums nvarchar(max) = ''
declare @sql           nvarchar(max) = ''

--count possible values 
select @count = max(c.options_count) from (
    select count(*) as options_count from tmp group by item
) c

--build dynamic headers for all combinations
while @loop <= @count
    begin
        set @dynamicColums = @dynamicColums + ' Parts.value(N''/x['+ cast(@loop as nvarchar(max)) +']'', ''char(1)'') AS [Option ' + cast(@loop as nvarchar(max)) + '],'
        set @loop = @loop + 1
    end

--build dynamic TSQL statement
set @sql = @sql + ';WITH Splitted'
set @sql = @sql + ' AS ('
set @sql = @sql + ' SELECT ItemComboCount'
set @sql = @sql + '     ,ItemCombo'
set @sql = @sql + '     ,CAST(''<x>'' + REPLACE(ItemCombo, '','', ''</x><x>'') + ''</x>'' AS XML) AS Parts'
set @sql = @sql + ' FROM '
set @sql = @sql + '     ('
set @sql = @sql + '         SELECT ItemCombo, Count(*) AS ItemComboCount'
set @sql = @sql + '         FROM'
set @sql = @sql + '         ('
set @sql = @sql + '             SELECT'
set @sql = @sql + '                 Item       '
set @sql = @sql + '                   ,STUFF((SELECT '','' + CAST([Option] AS varchar(MAX))'
set @sql = @sql + '                           FROM tmp a '
set @sql = @sql + '                           WHERE a.Item = b.Item'
set @sql = @sql + '                           ORDER BY [Option]'
set @sql = @sql + '                           FOR XML PATH(''''), TYPE).value(''.'', ''VARCHAR(MAX)''),1,1,'''''
set @sql = @sql + '                           ) AS ItemCombo'
set @sql = @sql + '             FROM tmp b'
set @sql = @sql + '             GROUP BY item'
set @sql = @sql + '         ) AS Combos'
set @sql = @sql + '         GROUP BY ItemCombo'
set @sql = @sql + '     ) t'
set @sql = @sql + ' )'
set @sql = @sql + ' SELECT  '
set @sql = @sql + @dynamicColums
set @sql = @sql + ' ItemComboCount as [Count]'
set @sql = @sql + ' FROM Splitted' 

--execute dynamic TSQL statement
exec(@sql)

Results:

[image: screenshot of the result set]

Now if you add another value (for example 'D') with a couple of insert statements:

insert into tmp values ( 1, 'D')
insert into tmp values ( 7, 'D')

you'll see that new columns are dynamically generated:

[image: screenshot of the result set with the additional column]
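
If the maximum number of options per item is known in advance, the dynamic SQL can likely be avoided altogether. The sketch below is a swapped-in alternative, not the method used above: it numbers each item's options with ROW_NUMBER and spreads them into columns with conditional aggregation. It assumes the tmp table created earlier and at most three options per item; the CTE names Numbered and PerItem are made up for illustration.

```sql
-- Sketch, assuming the tmp table above and at most three options per item.
-- Any fourth or later option of an item is silently ignored by this query.
;WITH Numbered AS
(
    -- Number each item's options alphabetically: 1, 2, 3, ...
    SELECT Item, [Option],
           ROW_NUMBER() OVER (PARTITION BY Item ORDER BY [Option]) AS rn
    FROM tmp
),
PerItem AS
(
    -- One row per item, with its options spread across fixed columns
    SELECT Item,
           MAX(CASE WHEN rn = 1 THEN [Option] END) AS [Option 1],
           MAX(CASE WHEN rn = 2 THEN [Option] END) AS [Option 2],
           MAX(CASE WHEN rn = 3 THEN [Option] END) AS [Option 3]
    FROM Numbered
    GROUP BY Item
)
-- Count how many items share each combination (NULLs group together)
SELECT [Option 1], [Option 2], [Option 3], COUNT(*) AS [Count]
FROM PerItem
GROUP BY [Option 1], [Option 2], [Option 3]
ORDER BY [Count] DESC;
```

The trade-off is that the column list is fixed: once an item can have a fourth option, as after the 'D' inserts above, another CASE column has to be added by hand, which is exactly what the dynamic version automates.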
