
Generating relational algebra for unnested repeated records

How can I generate relational algebra for a SQL statement that looks like this?

SELECT
  id,
  (
  SELECT
    h.is_active
  FROM
    UNNEST(history.all_of_history) h
  WHERE
    start_date <= "2021-06-01" AND (end_date >= "2021-06-01" OR end_date IS NULL))
FROM
  `table`

table has two columns: id and history. history is a record, and history.all_of_history is a repeated record with three fields (is_active, start_date, end_date).
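To make the intended semantics concrete before translating it to relational algebra, here is a minimal plain-Java sketch of what the scalar subquery computes for one row. The class and method names (ScalarUnnestSketch, HistoryEntry, isActiveOn) are hypothetical and use no Calcite API. It assumes the usual scalar-subquery behaviour: NULL when no element matches, an error when more than one does.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical in-memory model of the query; not Calcite or BigQuery API. */
final class ScalarUnnestSketch {

  /** One element of the repeated record history.all_of_history. */
  static final class HistoryEntry {
    final Boolean isActive;
    final LocalDate startDate;
    final LocalDate endDate; // null models SQL NULL (period still open)

    HistoryEntry(Boolean isActive, LocalDate startDate, LocalDate endDate) {
      this.isActive = isActive;
      this.startDate = startDate;
      this.endDate = endDate;
    }
  }

  /**
   * Evaluates the scalar subquery for one outer row: UNNEST, filter, project is_active.
   * Returns null when no entry matches and throws when more than one matches.
   */
  static Boolean isActiveOn(List<HistoryEntry> allOfHistory, LocalDate day) {
    List<Boolean> matches = new ArrayList<>();
    for (HistoryEntry h : allOfHistory) { // UNNEST(history.all_of_history) h
      // start_date <= day (a NULL start_date never satisfies the comparison)
      boolean started = h.startDate != null && !h.startDate.isAfter(day);
      // end_date >= day OR end_date IS NULL
      boolean notEnded = h.endDate == null || !h.endDate.isBefore(day);
      if (started && notEnded) {
        matches.add(h.isActive); // SELECT h.is_active
      }
    }
    if (matches.size() > 1) {
      throw new IllegalStateException("scalar subquery produced more than one row");
    }
    return matches.isEmpty() ? null : matches.get(0);
  }

  public static void main(String[] args) {
    List<HistoryEntry> history = Arrays.asList(
        new HistoryEntry(false, LocalDate.of(2020, 1, 1), LocalDate.of(2020, 12, 31)),
        new HistoryEntry(true, LocalDate.of(2021, 1, 1), null));
    // Only the second entry covers 2021-06-01, so this prints "true".
    System.out.println(isActiveOn(history, LocalDate.of(2021, 6, 1)));
  }
}
```

The loop, the two boolean conditions, and the final projection correspond to the unnest, filter, and project steps that a logical plan for this query has to express.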

Assuming you are only interested in the logical plan, the relational algebra for the query you mentioned is as follows:

LogicalProject(id=[$0], $f1=[$SCALAR_QUERY({
LogicalProject(is_active=[$0])
  LogicalFilter(condition=[AND(<=($1, 2021-06-01), OR(>=($2, 2021-06-01), IS NULL($2)))])
    LogicalTableFunctionScan(invocation=[UNNEST($1)], rowType=[RecordType(BOOLEAN is_active, DATE start_date, DATE end_date)])
})])
  LogicalTableScan(table=[[scott, my_table]])
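When reading the plan, it helps to remember that $n in a Project or Filter expression is a positional reference to the n-th field (zero-based) of that operator's input row. The small helper below is a hypothetical reading aid only (InputRefSketch and resolve are made-up names, not part of Calcite). It substitutes field names for such references, assuming the input row type is known.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for reading plan text; not part of Calcite. */
final class InputRefSketch {
  // Matches positional references such as $0 or $2, but not names like $f1.
  private static final Pattern INPUT_REF = Pattern.compile("\\$(\\d+)");

  /** Replaces each $n in an expression with the n-th field name of the input row type. */
  static String resolve(String expression, List<String> inputFieldNames) {
    Matcher m = INPUT_REF.matcher(expression);
    StringBuffer out = new StringBuffer();
    while (m.find()) {
      int index = Integer.parseInt(m.group(1));
      m.appendReplacement(out, Matcher.quoteReplacement(inputFieldNames.get(index)));
    }
    m.appendTail(out);
    return out.toString();
  }

  public static void main(String[] args) {
    // Row type produced by the UNNEST in the plan above.
    List<String> unnestFields = Arrays.asList("is_active", "start_date", "end_date");
    System.out.println(resolve("<=($1, 2021-06-01)", unnestFields)); // <=(start_date, 2021-06-01)
    System.out.println(resolve(">=($2, 2021-06-01)", unnestFields)); // >=(end_date, 2021-06-01)
  }
}
```

Read this way, the filter in the plan compares start_date and end_date of the unnested rows, and the inner project keeps only is_active.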

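Here is a high-level overview of how to create the RelNode:

  1. You first need to define the row type of the table, i.e. my_table.

    1.1 Your row type should look like this: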

    RecordType(INTEGER id, RecordType:peek(BOOLEAN is_active, DATE start_date, DATE end_date) MULTISET history)

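    1.2 Pay attention to history: its type needs to comply with the definition of the UNNEST SQL function. For the purposes of this post, I have used the MULTISET data type.

    1.3 Assuming you are only interested in the logical plan, you can create a simple table as follows: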

    Table tableImpl = new AbstractTable() {
      @Override
      public RelDataType getRowType(RelDataTypeFactory typeFactory) {
        RelDataType id = typeFactory.createSqlType(SqlTypeName.INTEGER);
        List<RelDataType> historyRecordDataTypes = Arrays.asList(
            typeFactory.createSqlType(SqlTypeName.BOOLEAN),
            typeFactory.createSqlType(SqlTypeName.DATE),
            typeFactory.createSqlType(SqlTypeName.DATE)
            );
        List<String> historyRecordFieldNames = Arrays.asList("is_active", "start_date", "end_date");
        RelDataType history = typeFactory.createStructType(StructKind.PEEK_FIELDS, historyRecordDataTypes, historyRecordFieldNames);
        return typeFactory.createStructType(Arrays.asList(id, typeFactory.createMultisetType(history, -1L)), Arrays.asList("id", "history"));
      }
    };
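  2. Once the table is created and its rowType is defined, you need to register it in a SchemaPlus. Let's add the table created above to the scott schema that exists in the Calcite test suite.
     rootSchema.getParentSchema().getSubSchema("scott").add("my_table", tableImpl);
  3. Once the custom table is created and registered in the SchemaPlus, you first need to create a table scan of my_table:
    builder.scan("my_table");
  4. Then you need to create the rel for the unnest subquery, using the RelBuilder#functionScan(...) API. Based on the query mentioned in the post, unnestRel will look like this: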
        builder.functionScan(SqlStdOperatorTable.UNNEST, 0, builder.field("history"));
        List<RexNode> innerQueryFilterPredicates = Arrays.asList(
            // start_date <= "2021-06-01"
            builder.call(
                SqlStdOperatorTable.LESS_THAN_OR_EQUAL,
                builder.field("start_date"), builder.getRexBuilder().makeDateLiteral(new DateString("2021-06-01"))),
            // end_date >= "2021-06-01" OR end_date IS NULL
            builder.call(
                SqlStdOperatorTable.OR,
                // end_date >= "2021-06-01"
                builder.call(
                    SqlStdOperatorTable.GREATER_THAN_OR_EQUAL,
                    builder.field("end_date"), builder.getRexBuilder().makeDateLiteral(new DateString("2021-06-01"))
                ),
                // end_date IS NULL
                builder.call(
                    SqlStdOperatorTable.IS_NULL, builder.field("end_date")
                )
            )
    
        );
        RelNode unnestRel = builder.filter(builder.call(SqlStdOperatorTable.AND, innerQueryFilterPredicates))
               .project(builder.field("is_active"))
               .build();
  5. Finally, to create the final RelNode, you need to turn unnestRel into a scalar query:
    RelNode finalRel = builder.project(builder.field("id"), builder.scalarQuery(b -> unnestRel))
           .build();

If you find this useful, please upvote :)
