Make array of struct in 2 dataframes identical (Java Spark)

I have two dataframes ( Dataset<Row> ) with the same columns, but the fields of the struct inside the array column are in a different order.

df1:

root
|-- root: string (nullable = false)
|-- array_nested: array (nullable = false)
|    |-- element: struct (containsNull = true)
|    |    |-- array_id: integer (nullable = false)
|    |    |-- array_value: string (nullable = false)

+----+------------+
|root|array_nested|
+----+------------+
|One |[[1, 1-One]]|
+----+------------+

df2:

root
|-- root: string (nullable = false)
|-- array_nested: array (nullable = false)
|    |-- element: struct (containsNull = true)
|    |    |-- array_value: string (nullable = false)
|    |    |-- array_id: integer (nullable = false)


+----+------------+
|root|array_nested|
+----+------------+
|Two |[[2-Two, 2]]|
+----+------------+

I want to make the schemas identical, but when I try my approach it generates an extra level of arrays:

List<Column> updatedStructNames = new ArrayList<>();
updatedStructNames.add(col("array_nested.array_id"));
updatedStructNames.add(col("array_nested.array_value"));

Column[] updatedStructNameArray = updatedStructNames.toArray(new Column[0]);
Dataset<Row> df3 = df2.withColumn("array_nested", array(struct(updatedStructNameArray)));

It generates a schema like this:

root
 |-- root: string (nullable = false)
 |-- array_nested: array (nullable = false)
 |    |-- element: struct (containsNull = false)
 |    |    |-- array_id: array (nullable = false)
 |    |    |    |-- element: integer (containsNull = true)
 |    |    |-- array_value: array (nullable = false)
 |    |    |    |-- element: string (containsNull = true)

+----+----------------+
|root|array_nested    |
+----+----------------+
|Two |[[[2], [2-Two]]]|
+----+----------------+

How can I achieve an identical schema?

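Selecting array_nested.array_id on an array of structs returns an array of that field's values, which is why wrapping those columns in array(struct(...)) adds the extra arrays. You can instead use the transform function to rebuild each struct element of the array_nested column with its fields in the desired order: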

Dataset<Row> df3 = df2.withColumn(
    "array_nested",
    expr("transform(array_nested, x -> struct(x.array_id as array_id, x.array_value as array_value))")
);
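
For completeness, here is a minimal end-to-end sketch of the same approach that can be run locally. The class name AlignNestedStructs, the helpers alignNested and sampleDf2, and the local[1] master are assumptions made for illustration; only the transform expression comes from the answer above.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

import static org.apache.spark.sql.functions.expr;

// Hypothetical class name, used only for this illustration.
public class AlignNestedStructs {

    // Rebuilds every struct in array_nested with the field order
    // (array_id, array_value), looking the fields up by name.
    static Dataset<Row> alignNested(Dataset<Row> df) {
        return df.withColumn(
            "array_nested",
            expr("transform(array_nested, x -> "
                + "struct(x.array_id as array_id, x.array_value as array_value))"));
    }

    // Builds a one-row dataframe shaped like df2 from the question:
    // struct fields in (array_value, array_id) order.
    static Dataset<Row> sampleDf2(SparkSession spark) {
        StructType element = new StructType()
            .add("array_value", DataTypes.StringType, false)
            .add("array_id", DataTypes.IntegerType, false);
        StructType schema = new StructType()
            .add("root", DataTypes.StringType, false)
            .add("array_nested", DataTypes.createArrayType(element, true), false);

        List<Row> rows = Arrays.asList(
            RowFactory.create("Two", Arrays.asList(RowFactory.create("2-Two", 2))));
        return spark.createDataFrame(rows, schema);
    }

    public static void main(String[] args) {
        // Assumption: a local single-threaded session is enough for the demo.
        SparkSession spark = SparkSession.builder()
            .master("local[1]")
            .appName("align-nested-structs")
            .getOrCreate();

        Dataset<Row> df3 = alignNested(sampleDf2(spark));
        df3.printSchema();   // struct fields now listed as array_id, array_value
        df3.show(false);

        spark.stop();
    }
}
```

The transform SQL function should be available from Spark 2.4 onward. After the rewrite, the field names, order, and types match df1, although the nullable / containsNull flags in the printed schema may still differ; that normally does not prevent a union of the two dataframes.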
