I'm trying to select only rows where the trends.insights_taxonomy column value occurs more than X times. I've been avoiding COUNT() because I do not want to do any grouping; I want all the correlating rows to remain unique.
I'm trying to weed out outliers. For example, if I had a database of 100k people's favorite colors, I would want to ignore colors that occur fewer than 50 times.
Is this where a subquery would come in?
SELECT insights.industry,insights.city,insights.country,metrics.engagements,metrics.number_of_people_at_company, trends.insights_taxonomy,
FROM production.scores.api_company,
UNNEST(insights) AS insights,
UNNEST(metrics) AS metrics,
UNNEST(trends) AS trends
WHERE insights.industry <> ""
AND insights.city <> ""
AND insights.country <> ""
AND metrics.number_of_people_at_company > 0
AND metrics.engagements > 10
I'm not sure of the best way to format this, but the top row is the column labels and the second row is the values. In this case I only want rows where Cisco Systems occurs more than X times.
industry | city | country | engagements | people_at_company | taxonomy
Legal Counsel and Prosecution | Madison | United States | 11 | 5 | Cisco Systems
If you don't want to group your resulting data, then you need to determine the qualifying values before you fetch the resulting rows. Write a grouping query (GROUP BY with HAVING COUNT(*) > X) that returns only the insights_taxonomy values that occur often enough. Then either JOIN your query above against that grouped result to get every matching row without grouping, or filter with WHERE x IN (your grouping subquery returning the values you want to see the complete data for).
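As a rough sketch of the JOIN variant described above, something along these lines might work. The table and column names come from the question; the CTE name frequent_taxonomies, the alias f, and the threshold of 100 are assumptions made for illustration, and the query has not been run against the actual data.

```sql
-- Step 1 (hypothetical CTE name): taxonomy values that occur more than 100 times.
-- The threshold is a placeholder for X.
WITH frequent_taxonomies AS (
  SELECT trends.insights_taxonomy AS insights_taxonomy
  FROM production.scores.api_company,
  UNNEST(trends) AS trends
  GROUP BY trends.insights_taxonomy
  HAVING COUNT(*) > 100
)
-- Step 2: the original ungrouped query, restricted to those values by an inner join.
SELECT insights.industry, insights.city, insights.country,
       metrics.engagements, metrics.number_of_people_at_company,
       trends.insights_taxonomy
FROM production.scores.api_company,
UNNEST(insights) AS insights,
UNNEST(metrics) AS metrics,
UNNEST(trends) AS trends
JOIN frequent_taxonomies AS f
  ON trends.insights_taxonomy = f.insights_taxonomy
WHERE insights.industry <> ""
  AND insights.city <> ""
  AND insights.country <> ""
  AND metrics.number_of_people_at_company > 0
  AND metrics.engagements > 10
```

Because the grouped result contains each taxonomy value only once, the join should not multiply rows; it only drops the rows whose taxonomy is too rare.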
I figured it out using a subquery. Hopefully this is helpful for someone else.
SELECT insights.industry,insights.city,insights.country,metrics.engagements,metrics.number_of_people_at_company, trends.insights_taxonomy, trends.total_interactions
FROM production.scores.api_company,
UNNEST(insights) AS insights,
UNNEST(metrics) AS metrics,
UNNEST(trends) AS trends
WHERE trends.insights_taxonomy IN
(SELECT trends.insights_taxonomy
FROM production.scores.api_company,
UNNEST(trends) AS trends
GROUP BY insights_taxonomy
HAVING count(*) > 100)
AND insights.city <> ""
AND insights.country <> ""
AND metrics.number_of_people_at_company > 0
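For comparison, a window function is another way to count without grouping. The sketch below assumes BigQuery's QUALIFY clause and reuses the names and the threshold of 100 from the query above. Note that it is not equivalent to the subquery: here the count is taken over the rows that remain after the UNNEST cross joins and the WHERE filters, so the same threshold may keep a different set of taxonomy values.

```sql
-- Alternative sketch: keep each row, but only if its taxonomy value appears
-- in more than 100 of the filtered, unnested rows.
SELECT insights.industry, insights.city, insights.country,
       metrics.engagements, metrics.number_of_people_at_company,
       trends.insights_taxonomy, trends.total_interactions
FROM production.scores.api_company,
UNNEST(insights) AS insights,
UNNEST(metrics) AS metrics,
UNNEST(trends) AS trends
WHERE insights.city <> ""
  AND insights.country <> ""
  AND metrics.number_of_people_at_company > 0
-- The window count is computed per taxonomy value over the rows that pass WHERE.
QUALIFY COUNT(*) OVER (PARTITION BY trends.insights_taxonomy) > 100
```

Which of the two fits better depends on whether "occurs more than X times" should be measured on the raw trends data or on the rows that actually appear in the result.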