How to subtract count values in Hive in the same table, same column
Hi, the screenshot I uploaded shows the table, with the columns post_id, score, AnswerCount, CommentCount. I am stuck on a Hive problem; I am very new to SQL and Hive. I am working on the Stack Overflow dataset and I am trying to find the percentage of questions answered. What I did is count all the questions and count all the questions that have been answered, but I am stuck on how to subtract them.
select AnswerCount
from posts
LEFT JOIN posts
ON AnswerCount = AnswerCount
WHERE AnswerCount IS NULL;
I want the result to be (count of all questions) - (count of answered questions). Some of the AnswerCount values are NULL. I did this to find the questions that have answers:
select AnswerCount
from posts
where AnswerCount > 0;
Here is the schema, with some sample rows:
post_id score AnswerCount CommentCount
385106 2 NULL 0
385107 2 0 2
385108 14 NULL 4
385109 -2 NULL 3
385110 8 NULL 5
385113 -8 NULL 2
385114 16 NULL 0
385116 30 2 6
385118 -2 NULL 0
Updated my answer to clean it up.
This checked out:
-- DECIMAL(18,2) leaves room for large row counts (DECIMAL(3,2) tops out at 9.99).
SELECT
    CAST((SELECT COUNT(ua.post_id) FROM posts ua
          WHERE ua.AnswerCount IS NOT NULL) AS DECIMAL(18,2)) /
    CAST(COUNT(t.post_id) AS DECIMAL(18,2))
FROM posts t
The query contains a subquery that selects the COUNT() of posts where AnswerCount IS NOT NULL, and divides that by the total number of posts. The rest is to CAST the integers to DECIMAL, since in engines that use integer division a fractional result will be reported as 0 if the operands are left as int.
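To sanity-check that ratio against the nine sample rows from the question, here is a minimal sketch that loads them into an in-memory SQLite database from Python. SQLite is only a stand-in for Hive (an assumption, chosen so the example runs anywhere), and `answered_fraction` is a hypothetical helper name. It relies on the standard SQL behaviour that COUNT(column) skips NULLs, so a single pass over the table is enough and no subquery is needed.

```python
import sqlite3

# The nine sample rows from the question:
# (post_id, score, AnswerCount, CommentCount)
ROWS = [
    (385106, 2, None, 0),
    (385107, 2, 0, 2),
    (385108, 14, None, 4),
    (385109, -2, None, 3),
    (385110, 8, None, 5),
    (385113, -8, None, 2),
    (385114, 16, None, 0),
    (385116, 30, 2, 6),
    (385118, -2, None, 0),
]


def answered_fraction(rows):
    """Return the fraction of posts whose AnswerCount is not NULL."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE posts (post_id INTEGER, score INTEGER, "
        "AnswerCount INTEGER, CommentCount INTEGER)"
    )
    conn.executemany("INSERT INTO posts VALUES (?, ?, ?, ?)", rows)
    # COUNT(AnswerCount) ignores NULLs; COUNT(*) counts every row.
    # Multiplying by 1.0 forces floating-point division in SQLite.
    (fraction,) = conn.execute(
        "SELECT COUNT(AnswerCount) * 1.0 / COUNT(*) FROM posts"
    ).fetchone()
    conn.close()
    return fraction


print(round(answered_fraction(ROWS), 2))  # 2 non-NULL rows out of 9 -> 0.22
```

Note that post 385107 has AnswerCount = 0 and is still counted here, because 0 is not NULL. Whether such a post should count as answered is a modelling choice.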
SELECT SUM(if(AnswerCount IS NULL OR AnswerCount = 0, 1, 0))/COUNT(*) * 100 as Percent_unanswered
FROM posts;
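The question asked for "count of all minus count of answered". As a rough sketch, conditional aggregation can produce the total, the difference and the percentage in one pass. The example below checks this on the nine sample rows using SQLite from Python as a stand-in for Hive (an assumption). It uses CASE WHEN rather than Hive's if() so that it runs in SQLite, and `unanswered_stats` is a hypothetical helper name.

```python
import sqlite3

# The nine sample rows from the question:
# (post_id, score, AnswerCount, CommentCount)
ROWS = [
    (385106, 2, None, 0),
    (385107, 2, 0, 2),
    (385108, 14, None, 4),
    (385109, -2, None, 3),
    (385110, 8, None, 5),
    (385113, -8, None, 2),
    (385114, 16, None, 0),
    (385116, 30, 2, 6),
    (385118, -2, None, 0),
]


def unanswered_stats(rows):
    """Return (total posts, unanswered posts, percent unanswered)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE posts (post_id INTEGER, score INTEGER, "
        "AnswerCount INTEGER, CommentCount INTEGER)"
    )
    conn.executemany("INSERT INTO posts VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT
            COUNT(*) AS total,
            -- The CASE yields NULL unless AnswerCount > 0, and COUNT skips
            -- NULLs, so the second COUNT is the number of answered posts.
            COUNT(*) - COUNT(CASE WHEN AnswerCount > 0 THEN 1 END) AS unanswered,
            -- Same rule as the SUM(if(...)) query, written with CASE;
            -- 100.0 forces floating-point division in SQLite.
            SUM(CASE WHEN AnswerCount IS NULL OR AnswerCount = 0
                     THEN 1 ELSE 0 END) * 100.0 / COUNT(*) AS percent_unanswered
        FROM posts
        """
    ).fetchone()
    conn.close()
    return result


total, unanswered, percent = unanswered_stats(ROWS)
print(total, unanswered, round(percent, 2))  # 9 8 88.89
```

Here a post with AnswerCount = 0 is treated as unanswered, the same as NULL, which is why 8 of the 9 sample posts count as unanswered.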