Problem Statement:
I have an input PCollection with the following fields:
{
firstname_1,
lastname_1,
dob,
firstname_2,
lastname_2,
firstname_3,
lastname_3,
}
I then run a Beam SQL query, and the resulting PCollection should look like this:
----------------------------------------------
name.firstname | name.lastname | dob
----------------------------------------------
firstname_1 | lastname_1 | 202009
firstname_2 | lastname_2 |
firstname_3 | lastname_3 |
-----------------------------------------------
To be precise:
array[
(firstname_1,lastname_1,dob),
(firstname_2,lastname_2,dob),
(firstname_3,lastname_3,dob)
]
Here is the code snippet where I execute Beam SQL:
PCollectionTuple tuple =
    PCollectionTuple.of(new TupleTag<>("testPcollection"), testPcollection);
PCollection<Row> result = tuple
    .apply(SqlTransform.query(
        "SELECT array[(firstname_1,lastname_1,dob), (firstname_2,lastname_2,dob), (firstname_3,lastname_3,dob)]"));
I am not getting proper results.
Can someone guide me on how to query an array of repeated fields in Beam SQL?
You can take a look at this example on how to access arrays in Beam SQL - https://github.com/apache/beam/blob/d110f6b7610b26edc1eb9a4b698840b21c151847/sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/BeamSqlDslNestedRowsTest.java#L234
Your SQL query has a few errors.
1. Your SQL query does not select FROM testPcollection. Let us assume you meant it to be FROM testPcollection.
2. You use (firstname_1, lastname_1, dob) in both your expected output and your query. This is not a valid SQL expression.
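As a rough illustration of the two points above, the sketch below builds a query string that adds the missing FROM clause and wraps each group in an explicit ROW(...) constructor. This is an assumption, not something confirmed in this thread: it presumes the Calcite dialect behind Beam SQL accepts ROW(...) inside ARRAY[...] in your Beam version. The class name ArrayOfRowsQuery, the helper buildQuery, and the alias names are hypothetical and exist only for this example.

```java
public class ArrayOfRowsQuery {

    // Hypothetical helper: builds a query that selects FROM the tagged
    // PCollection and wraps each (firstname, lastname, dob) group in ROW(...).
    // Assumption: ROW(...) inside ARRAY[...] is accepted by your Beam SQL version.
    static String buildQuery(String table) {
        StringBuilder sql = new StringBuilder("SELECT ARRAY[");
        for (int i = 1; i <= 3; i++) {
            if (i > 1) {
                sql.append(", ");
            }
            sql.append("ROW(firstname_").append(i)
               .append(", lastname_").append(i)
               .append(", dob)");
        }
        // "names" is an illustrative alias for the resulting array column.
        return sql.append("] AS names FROM ").append(table).toString();
    }

    public static void main(String[] args) {
        // The table name must match the TupleTag id used in PCollectionTuple.of(...).
        System.out.println(buildQuery("testPcollection"));
    }
}
```

The resulting string would be passed to SqlTransform.query(...) in place of the original query. If your Beam version rejects the ROW constructor, the linked BeamSqlDslNestedRowsTest is the better guide to what is supported.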