How to count distinct values in a list

I am fairly new to writing queries in Snowflake and have run into a hiccup. I am trying to count how many times each item appears across lists that are all stored in the same column.

I was able to use the FLATTEN function, but when I tried to add the COUNT function I had no luck.

Here is a dummy version of my data:

Ticket#              Tasks 
1               ["cut apple","peel orange","slice cheese"]
2               ["slice cheese","peel orange"]
3               ["cut apple"]
4               ["cut apple","slice cheese"]
5               ["cut apple", "chop kiwi"]

Here is what I want the output to look like (ideally populating the distinct list of tasks automatically, in descending order of quantity):

Tasks               Quantity
Cut Apple               4
Slice Cheese            3
Peel Orange             2
Chop Kiwi               1
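
For reference, the flatten-then-count approach described above can be sketched roughly as follows. This is only an illustration run against SQLite from Python, not Snowflake itself: SQLite's json_each stands in for Snowflake's FLATTEN, and the table name tickets and the column names ticket_id and tasks are assumptions, as is storing each list as a JSON array string.

```python
import sqlite3

# Dummy data from the question; each Tasks value is kept as a JSON array string.
ROWS = [
    (1, '["cut apple","peel orange","slice cheese"]'),
    (2, '["slice cheese","peel orange"]'),
    (3, '["cut apple"]'),
    (4, '["cut apple","slice cheese"]'),
    (5, '["cut apple", "chop kiwi"]'),
]


def count_tasks():
    """Return (task, quantity) pairs, most frequent task first."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tickets (ticket_id INTEGER, tasks TEXT)")
    conn.executemany("INSERT INTO tickets VALUES (?, ?)", ROWS)
    # json_each expands every array into one row per element, so the
    # elements can then be grouped and counted like ordinary rows.
    # Untested Snowflake analogue (assumes TASKS is an ARRAY/VARIANT column):
    #   SELECT f.value::string AS task, COUNT(*) AS quantity
    #   FROM tickets, LATERAL FLATTEN(input => tasks) f
    #   GROUP BY 1 ORDER BY quantity DESC;
    query = """
        SELECT j.value AS task, COUNT(*) AS quantity
        FROM tickets AS t, json_each(t.tasks) AS j
        GROUP BY j.value
        ORDER BY quantity DESC, task
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for task, quantity in count_tasks():
        print(task, quantity)
```

The key point is that the count is applied after the lists have been expanded to one row per task, with a GROUP BY on the expanded value.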

This is too long for a comment, so here is some guidance to look into before you try writing a sample query. While you are still learning, I would look into data normalization and restructure your "Tasks" column.

You should have a secondary lookup table with a primary key ID and a description of each unique task (you'll see this in the data normalization material). So that you can follow along from your own data to that material, I will provide layout examples and we'll see how that helps you.

Starting with your lookup task table...

Tasks Table
TaskID   TaskDescription
1        cut apple
2        peel orange
3        slice cheese
4        chop kiwi

Then you would have another table keyed by TicketID, and a third table that holds multiple records for each TicketID.

Ticket Table
TicketID  ExPurchaseDate
1         someDate
2         someDate
3         etc...

Now, a detail table per ticket.

TicketTasks Table
TicketTaskID  TicketID   TaskID
1             1          1
2             1          2
3             1          3
4             2          3
5             2          2
6             3          1
7             4          1
8             4          3
9             5          1
10            5          4

Try to digest this along with the normalization material, and then look into writing a SQL query with COUNT(*) and GROUP BY. I'm more than happy to help further afterwards, but I hope this helps guide you.
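
As a rough sketch of where this leads, the three tables above can be loaded into an in-memory SQLite database from Python and queried with COUNT(*) and GROUP BY. The dates are left out because they are placeholders, the TicketTaskID values are generated automatically, and SQLite is only a stand-in here, so treat this as an illustration rather than Snowflake-specific code.

```python
import sqlite3

TASKS = [(1, "cut apple"), (2, "peel orange"), (3, "slice cheese"), (4, "chop kiwi")]
# (TicketID, TaskID) pairs from the TicketTasks table above.
TICKET_TASKS = [
    (1, 1), (1, 2), (1, 3),
    (2, 3), (2, 2),
    (3, 1),
    (4, 1), (4, 3),
    (5, 1), (5, 4),
]


def task_quantities():
    """Return (TaskDescription, Quantity) pairs, highest quantity first."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Tasks (TaskID INTEGER PRIMARY KEY, TaskDescription TEXT);
        CREATE TABLE Ticket (TicketID INTEGER PRIMARY KEY);
        CREATE TABLE TicketTasks (
            TicketTaskID INTEGER PRIMARY KEY,  -- assigned automatically
            TicketID INTEGER REFERENCES Ticket(TicketID),
            TaskID INTEGER REFERENCES Tasks(TaskID)
        );
    """)
    conn.executemany("INSERT INTO Tasks VALUES (?, ?)", TASKS)
    conn.executemany("INSERT INTO Ticket VALUES (?)", [(i,) for i in range(1, 6)])
    conn.executemany(
        "INSERT INTO TicketTasks (TicketID, TaskID) VALUES (?, ?)", TICKET_TASKS
    )
    # One detail row per task performed, so COUNT(*) per TaskID is the quantity.
    query = """
        SELECT t.TaskDescription, COUNT(*) AS Quantity
        FROM TicketTasks AS tt
        JOIN Tasks AS t ON t.TaskID = tt.TaskID
        GROUP BY t.TaskID, t.TaskDescription
        ORDER BY Quantity DESC, t.TaskDescription
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for description, quantity in task_quantities():
        print(description, quantity)
```

With the sample rows this gives 4 for cut apple, 3 for slice cheese, 2 for peel orange and 1 for chop kiwi, which matches the output asked for in the question.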

Step 1: Define a normalized data schema and put the schema into a database.

Your normalized data schema might look something like this:

[image: schema diagram, not reproduced here]

Step 2: Add your data

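Step 3: Then you will be able to use SQL COUNT with DISTINCT to count the unique values in a column of your data table(s), for example the number of different tasks. To get a quantity for each task, combine COUNT with GROUP BY instead.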

SQL COUNT with DISTINCT
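
To make the behaviour of COUNT with DISTINCT concrete, here is a small sketch using SQLite from Python. The table name ticket_tasks and its columns are assumptions made up for the illustration, holding one row per ticket and task from the question's sample data.

```python
import sqlite3

# One (ticket_id, task) row per list element in the question's sample data.
ROWS = [
    (1, "cut apple"), (1, "peel orange"), (1, "slice cheese"),
    (2, "slice cheese"), (2, "peel orange"),
    (3, "cut apple"),
    (4, "cut apple"), (4, "slice cheese"),
    (5, "cut apple"), (5, "chop kiwi"),
]


def distinct_vs_total():
    """Return (number of distinct tasks, total number of rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ticket_tasks (ticket_id INTEGER, task TEXT)")
    conn.executemany("INSERT INTO ticket_tasks VALUES (?, ?)", ROWS)
    # COUNT(DISTINCT task) collapses repeats; COUNT(*) counts every row.
    row = conn.execute(
        "SELECT COUNT(DISTINCT task), COUNT(*) FROM ticket_tasks"
    ).fetchone()
    conn.close()
    return row


if __name__ == "__main__":
    distinct_tasks, total_rows = distinct_vs_total()
    print(distinct_tasks, total_rows)
```

COUNT(DISTINCT task) answers "how many different tasks exist" with a single number, so it does not by itself produce the per-task quantities shown in the question.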
