Removing all rows where a column entry is only listed once, but display all other columns too (subquery)

I would like to clean my data by removing every row whose value in a certain column appears only once or twice in the table. It currently looks like this:

Fruit Year Units
apples 2018 20000
oranges 2018 600
apples/oranges 2018 3000
oranges 2017 6000
apples 2016 2000
oranges 2016 2000
apples 2017 50000
potato 2017 9000
apples/oranges 2016 5000

I would like it to look like this:

Fruit Year Units
apples 2018 20000
oranges 2018 600
apples 2017 50000
oranges 2017 6000
apples 2016 2000
oranges 2016 2000

In reality the table contains many more of these rare Fruit entries, so I cannot just exclude them with a long WHERE clause.

Attempted solution

I've tried to simplify the data with a subquery that counts how many times each "Fruit" entry appears and then displays only the rows where that count is more than two. It works as a standalone query, but not inside the larger query that also includes the other columns.

SELECT "Fruit"
    ,count("Fruit") as cnt
    ,"Year"
    ,"Units"
FROM example_table
WHERE(SELECT count("Fruit") as cnt
    FROM example_table
    HAVING cnt > 2)
GROUP BY "Fruit"
    ,"Year"
    ,"Units"

This is the error message I get:

Invalid data type [NUMBER(18,0)] for predicate [(SELECT COUNT(EXAMPLE_TABLE."Fruit") AS "CNT" FROM EXAMPLE_TABLE AS EXAMPLE_TABLE HAVING CNT > 2)]

One way to do it is to collect, in a subquery, the fruit names that appear more than twice, and then select only the rows whose Fruit is in that list.

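-- Keep only rows whose "Fruit" value appears more than twice in the table.
-- "Fruit" is double-quoted to match the case-sensitive column name in the question.
SELECT *
FROM example_table
WHERE "Fruit" IN (
    SELECT "Fruit"
    FROM example_table
    GROUP BY "Fruit"
    HAVING COUNT("Fruit") > 2
);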
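
As a quick sanity check, the IN-subquery approach can be run against the sample rows from the question. The sketch below is only an illustration: it assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for the real database, and the ORDER BY is added just to make the output match the desired layout.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    ("apples", 2018, 20000),
    ("oranges", 2018, 600),
    ("apples/oranges", 2018, 3000),
    ("oranges", 2017, 6000),
    ("apples", 2016, 2000),
    ("oranges", 2016, 2000),
    ("apples", 2017, 50000),
    ("potato", 2017, 9000),
    ("apples/oranges", 2016, 5000),
]

# Assumption: an in-memory SQLite database stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE example_table ("Fruit" TEXT, "Year" INTEGER, "Units" INTEGER)'
)
conn.executemany("INSERT INTO example_table VALUES (?, ?, ?)", ROWS)

# The inner query lists fruits with more than two rows;
# the outer query returns every column for those fruits.
kept = conn.execute(
    """
    SELECT "Fruit", "Year", "Units"
    FROM example_table
    WHERE "Fruit" IN (
        SELECT "Fruit"
        FROM example_table
        GROUP BY "Fruit"
        HAVING COUNT("Fruit") > 2
    )
    ORDER BY "Year" DESC, "Fruit"
    """
).fetchall()

for row in kept:
    print(row)
```

On this sample, apples and oranges each have three rows and are kept, while apples/oranges (two rows) and potato (one row) are dropped.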


Functions used:

QUALIFY (a clause rather than a function; it filters rows on the result of a window function)

WITH CTE AS  
(SELECT 'apples'    FRUITS, 2018 YEAR,  20000 UNITS
UNION ALL SELECT 'oranges', 2018 YEAR,  600 UNITS
UNION ALL SELECT 'oranges', 2017 YEAR,  6000 UNITS
UNION ALL SELECT 'apples',  2016 YEAR,  2000 UNITS
UNION ALL SELECT 'oranges', 2016 YEAR,  2000 UNITS
UNION ALL SELECT 'apples',  2017 YEAR,  50000 UNITS
UNION ALL SELECT 'potato',  2017 YEAR,  9000 UNITS
UNION ALL SELECT 'apples/oranges' , 2016,   5000
UNION ALL SELECT 'apples/oranges',  2018,   3000 )

SELECT * FROM CTE
    QUALIFY COUNT(DISTINCT YEAR) OVER (PARTITION BY FRUITS) > 2;
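
Not every database has a QUALIFY clause. Where it is missing, the same idea can usually be written by computing the window count in a derived table and filtering on it in an outer WHERE. The sketch below is a rough illustration under a few assumptions: it uses SQLite via Python's sqlite3 module (window functions need SQLite 3.25 or later), it uses the table and column names from the question rather than the CTE above, and it counts rows per fruit with COUNT(*) instead of distinct years, which gives the same result only when each fruit has at most one row per year. The alias fruit_rows is made up for this example.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    ("apples", 2018, 20000),
    ("oranges", 2018, 600),
    ("apples/oranges", 2018, 3000),
    ("oranges", 2017, 6000),
    ("apples", 2016, 2000),
    ("oranges", 2016, 2000),
    ("apples", 2017, 50000),
    ("potato", 2017, 9000),
    ("apples/oranges", 2016, 5000),
]

# Assumption: an in-memory SQLite database stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE example_table ("Fruit" TEXT, "Year" INTEGER, "Units" INTEGER)'
)
conn.executemany("INSERT INTO example_table VALUES (?, ?, ?)", ROWS)

# The derived table attaches a per-fruit row count to every row;
# the outer WHERE plays the role that QUALIFY plays in the query above.
kept = conn.execute(
    """
    SELECT "Fruit", "Year", "Units"
    FROM (
        SELECT "Fruit", "Year", "Units",
               COUNT(*) OVER (PARTITION BY "Fruit") AS fruit_rows
        FROM example_table
    ) AS counted
    WHERE fruit_rows > 2
    ORDER BY "Year" DESC, "Fruit"
    """
).fetchall()

for row in kept:
    print(row)
```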
