How to count distinct values in a list
I am fairly new to writing queries in Snowflake and have run into a hiccup. I am trying to count how many times each item appears across the lists stored in a single column.
I was able to use the flatten function, but had no luck when I tried to add the count function.
Here is a dummy version of my data:
Ticket# Tasks
1 ["cut apple","peel orange","slice cheese"]
2 ["slice cheese","peel orange"]
3 ["cut apple"]
4 ["cut apple","slice cheese"]
5 ["cut apple", "chop kiwi"]
Here is what I want the output to look like (ideally with the distinct list of tasks populated automatically and sorted by quantity in descending order):
Tasks Quantity
Cut Apple 4
Slice Cheese 3
Peel Orange 2
Chop Kiwi 1
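For reference, the flatten-then-count shape described above can be sketched outside Snowflake. The snippet below is only an illustration in plain Python, not a Snowflake query: the `tickets` dictionary is a hypothetical stand-in for the table, and the list comprehension plays the role of the flatten step. In Snowflake itself the same idea is usually expressed by flattening the array column and then grouping on the flattened value.

```python
from collections import Counter

# Dummy data from the question: ticket number -> list of tasks.
# (Hypothetical in-memory stand-in for the Snowflake table.)
tickets = {
    1: ["cut apple", "peel orange", "slice cheese"],
    2: ["slice cheese", "peel orange"],
    3: ["cut apple"],
    4: ["cut apple", "slice cheese"],
    5: ["cut apple", "chop kiwi"],
}

# "Flatten": produce one element per (ticket, task) pair.
flattened = [task for tasks in tickets.values() for task in tasks]

# Count each task; most_common() returns pairs sorted by count, descending.
task_counts = Counter(flattened).most_common()

for task, quantity in task_counts:
    print(f"{task.title():<14}{quantity}")
```

The key point is that the count has to be taken per flattened value, which in SQL terms means grouping by the task after flattening rather than counting rows of the original table.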
This is too long for a comment, but here is some guidance to look into before you try writing a sample query. Since you have the opportunity while you are still learning, I would look into data normalization and restructure your "Tasks" column.
You should have a secondary lookup table with a primary key ID and a description of each unique task (you will see this pattern when you read about data normalization). So that you can map your own data onto that reading, I will provide layout examples below.
Starting with your lookup task table...
Tasks Table
TaskID TaskDescription
1 cut apple
2 peel orange
3 slice cheese
4 chop kiwi
Then you would have another table keyed by TicketID, and a third table that holds multiple records for each TicketID.
Ticket Table
TicketID ExPurchaseDate
1 someDate
2 someDate
3 etc...
Now, a detail table per ticket.
TicketTasks Table
TicketTaskID TicketID TaskID
1 1 1
2 1 2
3 1 3
4 2 3
5 2 2
6 3 1
7 4 1
8 4 3
9 5 1
10 5 4
Try to digest this along with the normalization reading, and then look into writing a SQL query with COUNT(*) and GROUP BY. I am more than happy to help further afterwards, but I hope this helps guide you.
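As a rough sketch of where COUNT(*) and GROUP BY lead with this layout, the snippet below builds the three tables in an in-memory SQLite database from Python and runs the aggregate query. SQLite is an assumption made only so the example is runnable; it is not Snowflake, although the SELECT itself is plain SQL. The table name `Tickets` is also an assumption, the `ExPurchaseDate` column is omitted because the sample only shows placeholder values, and TicketTaskID is left for the database to assign.

```python
import sqlite3

# In-memory SQLite database as a runnable stand-in (assumption: not Snowflake).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Tasks (
    TaskID          INTEGER PRIMARY KEY,
    TaskDescription TEXT NOT NULL
);
CREATE TABLE Tickets (
    TicketID INTEGER PRIMARY KEY
);
CREATE TABLE TicketTasks (
    TicketTaskID INTEGER PRIMARY KEY,
    TicketID     INTEGER NOT NULL REFERENCES Tickets(TicketID),
    TaskID       INTEGER NOT NULL REFERENCES Tasks(TaskID)
);
""")

# Lookup table of unique tasks.
conn.executemany(
    "INSERT INTO Tasks VALUES (?, ?)",
    [(1, "cut apple"), (2, "peel orange"), (3, "slice cheese"), (4, "chop kiwi")],
)
conn.executemany("INSERT INTO Tickets VALUES (?)", [(i,) for i in range(1, 6)])

# One row per (TicketID, TaskID) pair, taken from the detail table above.
pairs = [(1, 1), (1, 2), (1, 3), (2, 3), (2, 2),
         (3, 1), (4, 1), (4, 3), (5, 1), (5, 4)]
conn.executemany("INSERT INTO TicketTasks (TicketID, TaskID) VALUES (?, ?)", pairs)

# Count detail rows per task, most frequent first.
query = """
SELECT t.TaskDescription AS Tasks, COUNT(*) AS Quantity
FROM TicketTasks AS tt
JOIN Tasks AS t ON t.TaskID = tt.TaskID
GROUP BY t.TaskDescription
ORDER BY Quantity DESC
"""
rows = conn.execute(query).fetchall()
for task, quantity in rows:
    print(task, quantity)
```

Because each (ticket, task) pair is its own row, no flattening is needed: the join brings in the description and the GROUP BY does the counting.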
Step 1: Define a normalized data schema and put the schema into a database.
Your normalized data schema might look something like this:
Step 2: Add your data.
Step 3: You will then be able to use SQL COUNT with DISTINCT to find the unique rows in your data table(s).
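To illustrate Step 3, here is a minimal sketch of COUNT with DISTINCT, again using an in-memory SQLite database from Python purely as a runnable stand-in (an assumption, not Snowflake). The two-column `TicketTasks` table is a hypothetical, trimmed-down version of a normalized detail table holding the question's data.

```python
import sqlite3

# In-memory SQLite database as a runnable stand-in (assumption: not Snowflake).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TicketTasks (TicketID INTEGER, TaskID INTEGER)")

# One row per (TicketID, TaskID) pair from the question's dummy data,
# with tasks numbered 1 = cut apple, 2 = peel orange, 3 = slice cheese, 4 = chop kiwi.
pairs = [(1, 1), (1, 2), (1, 3), (2, 3), (2, 2),
         (3, 1), (4, 1), (4, 3), (5, 1), (5, 4)]
conn.executemany("INSERT INTO TicketTasks VALUES (?, ?)", pairs)

# COUNT(*) counts every row; COUNT(DISTINCT ...) counts unique values only.
total_rows = conn.execute("SELECT COUNT(*) FROM TicketTasks").fetchone()[0]
distinct_tasks = conn.execute(
    "SELECT COUNT(DISTINCT TaskID) FROM TicketTasks"
).fetchone()[0]
distinct_tickets = conn.execute(
    "SELECT COUNT(DISTINCT TicketID) FROM TicketTasks"
).fetchone()[0]

print(total_rows, distinct_tasks, distinct_tickets)
```

Note that COUNT(DISTINCT ...) answers "how many different tasks are there", whereas the per-task quantities in the desired output come from grouping by the task and counting the rows in each group.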