
How to count distinct values in a list

I am fairly new to writing queries in Snowflake and have run into a hiccup. I am trying to count how many times each item appears across the lists stored in a single column.

I was able to use the FLATTEN function, but when I tried to add the COUNT function I had no luck.

Here is a dummy version of my data:

Ticket#              Tasks 
1               ["cut apple","peel orange","slice cheese"]
2               ["slice cheese","peel orange"]
3               ["cut apple"]
4               ["cut apple","slice cheese"]
5               ["cut apple", "chop kiwi"]

Here is what I want the output to look like (ideally auto-populating the distinct list of tasks, in descending order of quantity):

Tasks               Quantity
Cut Apple               4
Slice Cheese            3
Peel Orange             2
Chop Kiwi               1
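
For reference, the flatten-then-count idea described above can be sketched in runnable form. This is a minimal sketch, not Snowflake code: it uses Python's built-in sqlite3 module, with SQLite's json_each standing in for Snowflake's FLATTEN. The table name tickets and the column names ticket_id and tasks are assumptions, since the question does not give them, and the sketch assumes a SQLite build with the JSON functions enabled.

```python
import sqlite3

# Hypothetical table and column names; the question does not state them.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (ticket_id INTEGER PRIMARY KEY, tasks TEXT)")
conn.executemany(
    "INSERT INTO tickets VALUES (?, ?)",
    [
        (1, '["cut apple","peel orange","slice cheese"]'),
        (2, '["slice cheese","peel orange"]'),
        (3, '["cut apple"]'),
        (4, '["cut apple","slice cheese"]'),
        (5, '["cut apple", "chop kiwi"]'),
    ],
)

# json_each acts as the flatten step: it yields one row per array element,
# which can then be grouped and counted like any ordinary column.
flatten_counts = conn.execute(
    """
    SELECT je.value AS task, COUNT(*) AS quantity
    FROM tickets AS t, json_each(t.tasks) AS je
    GROUP BY je.value
    ORDER BY quantity DESC, task
    """
).fetchall()

for task, quantity in flatten_counts:
    print(f"{task:<15}{quantity}")
# cut apple      4
# slice cheese   3
# peel orange    2
# chop kiwi      1
```

The part that should carry over to Snowflake's LATERAL FLATTEN is the shape of the query: flatten the array, GROUP BY the flattened value, then ORDER BY the count in descending order.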

This is too long for a comment, but here is some guidance to look into before you try writing a sample query. Since you have the opportunity while you are still learning, I would look into data normalization and restructure your "Tasks" column.

You should have a secondary lookup table with a primary key ID and a description of each unique task (you will see this in material on data normalization). So that you can follow along from your own data to that material, I will provide layout examples; see how that helps you.

Starting with your lookup task table...

Tasks Table
TaskID   TaskDescription
1        cut apple
2        peel orange
3        slice cheese
4        chop kiwi

Then, you would have another table that has TicketID, and a third table that holds multiple records for each TicketID.

Ticket Table
TicketID  ExPurchaseDate
1         someDate
2         sameDate
3         etc...

Now, a detail table per ticket.

TicketTasks Table
TicketTaskID  TicketID   TaskID
1             1          1
2             1          2
3             1          3
4             2          3
5             2          2
6             3          1
7             4          1
8             4          3
9             5          1
10            5          4

Try to digest this along with the normalization material, and then look into writing a SQL query with COUNT(*) and GROUP BY. More than happy to help further afterwards, but I hope this helps guide you.
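
As a starting point, the layout above can be loaded into a scratch database and queried with COUNT(*) and GROUP BY. The following is a minimal sketch using Python's built-in sqlite3 module rather than Snowflake, so it is only an approximation of what you would run there. The table and column names follow the layouts above; the ExPurchaseDate column is left out because the layout gives only placeholder values for it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Lookup table: one row per unique task.
    CREATE TABLE Tasks (TaskID INTEGER PRIMARY KEY, TaskDescription TEXT);
    -- One row per ticket (ExPurchaseDate omitted: placeholder values only).
    CREATE TABLE Ticket (TicketID INTEGER PRIMARY KEY);
    -- Detail table: one row per (ticket, task) pair.
    CREATE TABLE TicketTasks (
        TicketTaskID INTEGER PRIMARY KEY,
        TicketID INTEGER REFERENCES Ticket(TicketID),
        TaskID INTEGER REFERENCES Tasks(TaskID)
    );
    INSERT INTO Tasks VALUES
        (1, 'cut apple'), (2, 'peel orange'), (3, 'slice cheese'), (4, 'chop kiwi');
    INSERT INTO Ticket VALUES (1), (2), (3), (4), (5);
    INSERT INTO TicketTasks (TicketID, TaskID) VALUES
        (1, 1), (1, 2), (1, 3), (2, 3), (2, 2),
        (3, 1), (4, 1), (4, 3), (5, 1), (5, 4);
    """
)

# Join the detail rows to the lookup table, then count rows per task.
task_counts = conn.execute(
    """
    SELECT t.TaskDescription, COUNT(*) AS Quantity
    FROM TicketTasks AS tt
    JOIN Tasks AS t ON t.TaskID = tt.TaskID
    GROUP BY t.TaskID, t.TaskDescription
    ORDER BY Quantity DESC, t.TaskDescription
    """
).fetchall()

for description, quantity in task_counts:
    print(f"{description:<15}{quantity}")
# cut apple      4
# slice cheese   3
# peel orange    2
# chop kiwi      1
```

Because each task lives in exactly one row of the lookup table, the distinct list of tasks populates itself from the data and no flattening step is needed.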

Step 1: Define a normalized data schema and put the schema into a database.

Your normalized data schema might look something like this:

[image]

Step 2: Add your data.

Step 3: You will then be able to use SQL COUNT with DISTINCT to find the unique rows in your data table(s).

SQL COUNT with DISTINCT
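
To illustrate what COUNT with DISTINCT returns, here is a minimal sketch using Python's built-in sqlite3 module (not Snowflake). The TicketTasks table and its TicketID and TaskID columns are assumptions chosen for illustration, filled with the ticket and task pairs from the question's sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical detail table: one row per (ticket, task) pair.
conn.execute("CREATE TABLE TicketTasks (TicketID INTEGER, TaskID INTEGER)")
conn.executemany(
    "INSERT INTO TicketTasks VALUES (?, ?)",
    [(1, 1), (1, 2), (1, 3), (2, 3), (2, 2),
     (3, 1), (4, 1), (4, 3), (5, 1), (5, 4)],
)

# COUNT(*) counts rows; COUNT(DISTINCT col) counts the different values in col.
total_rows, distinct_tasks, distinct_tickets = conn.execute(
    """
    SELECT COUNT(*), COUNT(DISTINCT TaskID), COUNT(DISTINCT TicketID)
    FROM TicketTasks
    """
).fetchone()

print(total_rows, distinct_tasks, distinct_tickets)  # 10 4 5
```

Note that COUNT(DISTINCT ...) reports how many different values exist. A per-task quantity, as in the desired output, comes from COUNT(*) combined with GROUP BY on the task.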
