简体   繁体   中英

how to subtract count values in hive in same table same column

hi the screenshot I uploaded is table with first column post_id, score, answerCount, CommentCount hi I am stuck on a hive problem I am very noob in sql and hive I am working on stack overflow dataset, I am trying to find percentage of questions answered. what I did is I counted all the questions and counted all the questions which has been answered but I am stuck on how to subtract them

select AnswerCount
> from posts
> LEFT JOIN posts
> ON AnswerCount = AnswerCount
> WHERE AnswerCount IS NULL; 

I want the result to be count of all - count of question answered some of the answerCounts are null I did this to count answers with

`select AnswerCount
>from posts
>where AnswerCount > 0;`

here is the schema

post_id score AnswerCount CommentCount
385106    2       NULL        0
385107    2        0          2
385108   14       NULL        4
385109   -2       NULL        3
385110    8       NULL        5
385113   -8       NULL        2
385114   16       NULL        0
385116   30        2          6
385118   -2       NULL        0

Updated my answer to clean it up.

This checked out:

    SELECT 
    CAST(( SELECT COUNT(ua.post_id) FROM posts ua 
    WHERE ua.AnswerCount IS NOT NULL) AS DECIMAL(3,2)) / 
    CAST(COUNT(t.post_id) AS DECIMAL(3,2))
      FROM posts t 

The query contains a sub query which selects the COUNT() of posts where AnswerCount IS NULL , it divides that by the total number of posts. The rest is to CAST the integers to DECIMAL since a factional result will be reported as 0 if left as an int .

SELECT SUM(if(AnswerCount IS NULL OR AnswerCount = 0, 1, 0))/COUNT(*) * 100 as Percent_unanswered
      FROM posts;

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM