简体   繁体   中英

How can I reuse the result from other column in hive query?

I have this query that's calculating a p-value but then I want to reuse that to get the significance. I'm not sure if I can reuse that data or do I have to copy that portion.

Here's my query

  CASE WHEN value1 > 1
  AND 1.0 - IF(
    p_value1 IS NULL,
    1.0,
    p_value1
  ) >= 0.8 THEN p_value1 WHEN value1 <= 1
  AND 1.0 - IF(
    p_value2 [1.03] is NULL,
    1.0,
    p_value2 [1.03]
  ) >= 0.8 THEN p_value2 [1.03] END AS p_value, 
  ...
  FROM table1
  WHERE dt = '2018-01-01'

What's doing is it's checking if value1 is more than 1 and the the p_value1 is more than 0.8 then print p_value2 otherwise if value1 is less than 1 then get the value from p_value2[1.03] and print that as p_value.

Now the next column I want to show is 1 - p_value so I can get the significance. But I don't want to duplicate the entire CASE for the next column. It's going to look like this.

p_value, significance (1-p_value)
0.03, 0.97
0.02, 0.98

Your code is simpler as:

SELECT (CASE WHEN value1 > 1 AND 1.0 - COALESCE(p_value1, 1.0) >= 0.8
             THEN p_value1
             WHEN value1 <= 1 AND 1.0 - COALESCE(p_value2, 1.0) > 0.8
             THEN p_value2 
        END) AS p_value, 
  ...
FROM table1
WHERE dt = '2018-01-01'

You cannot use p_value in any subsequent expressions in the same SELECT (or most other clauses). You can use a CTE or subquery:

SELECT t.*, 1 - p_value
FROM (SELECT (CASE WHEN value1 > 1 AND 1.0 - COALESCE(p_value1, 1.0) >= 0.8
                   THEN p_value1
                   WHEN value1 <= 1 AND 1.0 - COALESCE(p_value2, 1.0) > 0.8
                   THEN p_value2 
              END) AS p_value, 
        ...
      FROM table1
      WHERE dt = '2018-01-01'
     ) t

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM