Hive access previous row value

I have the same issue as the one mentioned here.

However, my data is in a Hive database. When I try that solution on my table, which looks like this:

Id   Date             Column1    Column2
1    01/01/2011       5          5 => Same as Column1
2    02/01/2011       2          18 => (1 + (value of Column2 from the previous row)) * (1 + (Value of Column1 from the current row)) i.e. (1+5)*(1+2)
3    03/01/2011       3          76 => (1+18)*(1+3) = 19*4

I get this error:

FAILED: SemanticException Recursive cte cteCalculation detected (cycle: ctecalculation -> cteCalculation).

What workaround is possible in this case?

You will have to write a UDF for this.
Below you can see a very (!!) simplified UDF for what you need.
The idea is to keep the value returned for the previous row in a variable inside the UDF. For each row, the UDF returns (stored_value+1)*(current_value+1) and then stores that result for the next row.
The first row needs a special case, because there is no stored value yet: the UDF returns the input unchanged.
You also have to pass the rows to the function already ordered, because it simply processes them one by one in the order they arrive and does not sort anything itself.
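To see why the row order matters, here is a minimal plain-Java sketch of the same recurrence, with no Hive dependency. The class name RowOrderDemo and the helper cumulate are made up for this illustration, and the reversed input is a hypothetical "wrong order" case, not something from the question.

```java
import java.util.Arrays;

public class RowOrderDemo {

    // Applies the recurrence from the question to the values in the order given:
    // the first output equals the first input, and every later output is
    // (1 + previous output) * (1 + current input).
    static long[] cumulate(int[] column1) {
        long[] column2 = new long[column1.length];
        for (int i = 0; i < column1.length; i++) {
            column2[i] = (i == 0)
                    ? column1[i]
                    : (column2[i - 1] + 1) * (column1[i] + 1);
        }
        return column2;
    }

    public static void main(String[] args) {
        // Rows in id order, as in the question's table: prints [5, 18, 76]
        System.out.println(Arrays.toString(cumulate(new int[] {5, 2, 3})));
        // The same rows arriving in reverse order (hypothetical): prints [3, 12, 78]
        System.out.println(Arrays.toString(cumulate(new int[] {3, 2, 5})));
    }
}
```

The same three values give a different final result when they arrive in a different order, which is why the query has to hand the rows to the function already sorted.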

You have to add your jar and create a function (in Hive, typically with ADD JAR and CREATE TEMPORARY FUNCTION); let's call it cum_mul.

The SQL will be:

select id,date,column1,cum_mul(column1) as column2
from
(select id,date,column1 from myTable order by id) a  

The code for the UDF:

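import org.apache.hadoop.hive.ql.exec.UDF;

// Stateful UDF: remembers the value it returned for the previous row,
// so rows must reach it in the intended order.
public class cum_mul extends UDF {

    // Result returned for the previous row.
    private int prevValue;
    // True until the first row has been processed.
    private boolean first = true;

    public int evaluate(int value) {
        if (first) {
            // First row: Column2 is the same as Column1.
            prevValue = value;
            first = false;
        } else {
            // Later rows: (1 + previous Column2) * (1 + current Column1).
            prevValue = (prevValue + 1) * (value + 1);
        }
        return prevValue;
    }
}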
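The running product grows quickly, so an int result can overflow silently on longer tables. As a rough sketch only, here is a Hive-free copy of the same state machine that can be tried locally. The class name CumMulState is hypothetical, and the long accumulator with Math.addExact / Math.multiplyExact is a substitution that is not part of the UDF above; it makes an overflow raise an ArithmeticException instead of wrapping around.

```java
public class CumMulState {

    // Result returned for the previous call (long instead of the UDF's int).
    private long prevValue;
    // True until the first value has been processed.
    private boolean first = true;

    // Same branching as the UDF's evaluate(), but with overflow checks.
    public long evaluate(int value) {
        if (first) {
            prevValue = value;
            first = false;
        } else {
            // (1 + previous result) * (1 + current value), failing loudly on overflow.
            prevValue = Math.multiplyExact(Math.addExact(prevValue, 1L), (long) value + 1L);
        }
        return prevValue;
    }

    public static void main(String[] args) {
        CumMulState state = new CumMulState();
        // Sample rows from the question, in id order: prints 5, 18, 76
        for (int v : new int[] {5, 2, 3}) {
            System.out.println(state.evaluate(v));
        }
    }
}
```

Whether to widen the return type in the real UDF depends on the column type the table expects, so treat this as a way to check the logic rather than a drop-in replacement.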

A Hive common table expression (CTE) works as a query-level temporary table (syntactic sugar) that is accessible within the whole SQL statement.

Recursive queries are not supported because they introduce multiple stages with massive I/O, which the underlying execution and storage engines are not good at. In fact, Hive strictly prohibits recursive references for CTEs and views, hence the error you got.

 