
Does pyspark have an equivalent of org.apache.spark.functions.transform?

org.apache.spark.functions.transform applies a function to each element of an array (new in Spark 3.0). However, the pyspark docs don't mention an equivalent function.

(There is pyspark.sql.DataFrame.transform - but it is for transforming DataFrames, not array elements.)
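For context, DataFrame.transform chains functions that map a whole DataFrame to a DataFrame, which is a different thing. A minimal sketch (add_length is a hypothetical helper, not part of the original question):

import pyspark.sql.functions as F

def add_length(df):
    # operates on the whole DataFrame, not on individual array elements
    return df.withColumn('n', F.size('col'))

df.transform(add_length)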

Edit:

To avoid a UDF, you can use F.expr('transform ...'):

import pyspark.sql.functions as F

df = spark.createDataFrame([[[1,2]],[[3,4]]]).toDF('col')
df.show()
+------+
|   col|
+------+
|[1, 2]|
|[3, 4]|
+------+

df.select(F.expr('transform(col, x -> x+1)').alias('transform')).show()
+---------+
|transform|
+---------+
|   [2, 3]|
|   [4, 5]|
+---------+
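Note: since Spark 3.1, pyspark.sql.functions also exposes transform directly as a higher-order function that accepts a Python lambda over Column expressions, so on a Spark >= 3.1 runtime the same result can be written without expr:

df.select(F.transform('col', lambda x: x + 1).alias('transform')).show()
+---------+
|transform|
+---------+
|   [2, 3]|
|   [4, 5]|
+---------+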

Old answer:

IIUC, I think the equivalent is a UDF, where x+1 is the function to apply.

import pyspark.sql.functions as F
from pyspark.sql.types import ArrayType, IntegerType

add = F.udf(lambda arr: [x + 1 for x in arr], ArrayType(IntegerType()))
df.select(add('col')).show()
+-------------+
|<lambda>(col)|
+-------------+
|       [2, 3]|
|       [4, 5]|
+-------------+
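As a side note, the auto-generated <lambda>(col) column name can be cleaned up with alias:

df.select(add('col').alias('transform')).show()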
