What are the relevant rules for Flink Window TVF and CEP SQL?

I am trying to parse column-level lineage for Flink windowing TVF SQL. I initialize a custom FlinkChainedProgram and set some optimization rules.

This mostly works fine, except for Window TVF SQL and CEP SQL.

For example, for the following statement I get the logical plan shown below it:

insert into sink_table(f1, f2, f3, f4) 
       SELECT cast(window_start as String),
              cast(window_start as String), 
              user_id, 
              cast(SUM(price) as Bigint) 
       FROM TABLE(TUMBLE(TABLE source_table, DESCRIPTOR(event_time), INTERVAL '10' MINUTES))
       GROUP BY window_start, window_end, GROUPING SETS ((user_id), ());

rel#1032:FlinkLogicalCalc.LOGICAL.any.None: 0.[NONE].[NONE](input=FlinkLogicalAggregate#1030,select=CAST(window_start) AS EXPR$0, CAST(window_start) AS EXPR$1, null:BIGINT AS EXPR$2, user_id, null:VARCHAR(2147483647) CHARACTER SET "UTF-16LE" AS EXPR$4, CAST($f4) AS EXPR$5)

As we can see, the optimized RelNode tree contains null columns, so RelMetadataQuery can't get the origin column info.

Which rules should I set in the logical optimization phase to parse Window TVF SQL and CEP SQL? Thanks.

I solved column-level lineage for Flink CEP SQL by adding a getColumnOrigins(Match rel, RelMetadataQuery mq, int iOutputColumn) method to org.apache.calcite.rel.metadata.RelMdColumnOrigins:

/**
 * Supports column lineage for CEP (MATCH_RECOGNIZE).
 * The first output column is the PARTITION BY field; the remaining columns come from the measures in Match.
 */
public Set<RelColumnOrigin> getColumnOrigins(Match rel, RelMetadataQuery mq, int iOutputColumn) {
    if (iOutputColumn == 0) {
        return mq.getColumnOrigins(rel.getInput(), iOutputColumn);
    }
    final RelNode input = rel.getInput();
    RexNode rexNode = rel.getMeasures().values().asList().get(iOutputColumn - 1);

    RexPatternFieldRef rexPatternFieldRef = searchRexPatternFieldRef(rexNode);
    if (rexPatternFieldRef != null) {
        return mq.getColumnOrigins(input, rexPatternFieldRef.getIndex());
    }
    return null;
}

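private RexPatternFieldRef searchRexPatternFieldRef(RexNode rexNode) {
    // A measure may already be a bare pattern field reference.
    if (rexNode instanceof RexPatternFieldRef) {
        return (RexPatternFieldRef) rexNode;
    }
    if (rexNode instanceof RexCall) {
        // Search every operand depth-first. Looking only at operand 0 fails on
        // zero-argument calls and misses references in later operands.
        for (RexNode operand : ((RexCall) rexNode).getOperands()) {
            RexPatternFieldRef found = searchRexPatternFieldRef(operand);
            if (found != null) {
                return found;
            }
        }
    }
    // No pattern field reference found, e.g. the measure is a literal.
    return null;
}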
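
To see the mapping without a Calcite dependency, here is a minimal, self-contained sketch of the same idea. The types Expr, FieldRef, Call and Literal are hypothetical stand-ins for RexNode, RexPatternFieldRef, RexCall and RexLiteral, and inputColumnFor is an illustrative name, not part of Calcite or Flink. Like the method above, it assumes the partition key is output column 0 and maps to input column 0, and it reports only the first field reference found in a measure.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

final class CepLineageSketch {
    /** Hypothetical stand-in for Calcite's RexNode. */
    interface Expr {}

    /** Stand-in for RexPatternFieldRef: refers to an input column by index. */
    static final class FieldRef implements Expr {
        final int index;
        FieldRef(int index) { this.index = index; }
    }

    /** Stand-in for RexCall: an operator applied to operands. */
    static final class Call implements Expr {
        final String op;
        final List<Expr> operands;
        Call(String op, Expr... operands) {
            this.op = op;
            this.operands = Arrays.asList(operands);
        }
    }

    /** Stand-in for RexLiteral: a constant with no column origin. */
    static final class Literal implements Expr {
        final Object value;
        Literal(Object value) { this.value = value; }
    }

    /** Depth-first search for the first field reference in an expression. */
    static Optional<FieldRef> findFieldRef(Expr expr) {
        if (expr instanceof FieldRef) {
            return Optional.of((FieldRef) expr);
        }
        if (expr instanceof Call) {
            for (Expr operand : ((Call) expr).operands) {
                Optional<FieldRef> found = findFieldRef(operand);
                if (found.isPresent()) {
                    return found;
                }
            }
        }
        return Optional.empty();
    }

    /** Maps an output column to the input column it derives from, if any. */
    static Optional<Integer> inputColumnFor(List<Expr> measures, int outputColumn) {
        if (outputColumn == 0) {
            return Optional.of(0); // assumption: single partition key at input column 0
        }
        return findFieldRef(measures.get(outputColumn - 1)).map(ref -> ref.index);
    }

    /** Three sample measures: a direct call, a nested call, and a literal. */
    static List<Expr> sampleMeasures() {
        return Arrays.asList(
            new Call("LAST", new FieldRef(2)),
            new Call("CAST", new Call("SUM", new FieldRef(3))),
            new Literal(1));
    }

    public static void main(String[] args) {
        List<Expr> measures = sampleMeasures();
        for (int col = 0; col <= measures.size(); col++) {
            System.out.println("output " + col + " -> input " + inputColumnFor(measures, col));
        }
    }
}
```

With the sample measures, output columns 1 and 2 resolve to input columns 2 and 3, while the literal measure in output column 3 has no origin. A measure that combines several input columns would need to collect all references rather than stopping at the first one.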

Source: https://github.com/HamaWhiteGG/flink-sql-lineage/blob/main/src/main/java/org/apache/calcite/rel/metadata/RelMdColumnOrigins.java

Detailed test cases are available at: https://github.com/HamaWhiteGG/flink-sql-lineage/blob/main/src/test/java/com/dtwave/flink/lineage/cep/CepTest.java
