Pyspark replace strings in Spark dataframe column by using values in another column
I'd like to replace a value in one column, using a search string built from another column.
Before
id address st
1 2.PA1234.la 1234
2 10.PA125.la 125
3 2.PA156.ln 156
After
id address st
1 2.PA9999.la 1234
2 10.PA9999.la 125
3 2.PA9999.ln 156
I tried
df.withColumn("address", regexp_replace("address","PA"+st,"PA9999"))
df.withColumn("address",regexp_replace("address","PA"+df.st,"PA9999"))
Both seem to fail with
TypeError: 'Column' object is not callable
This could be similar to the question "Pyspark replace strings in Spark dataframe column".
You might also use a Spark UDF.
The same approach can be applied whenever you need to modify a data frame entry with a value from another column:
import pandas as pd
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, udf
from pyspark.sql.types import StringType

sparkSession = SparkSession.builder.getOrCreate()
pd_input = pd.DataFrame({'address': ['2.PA1234.la', '10.PA125.la', '2.PA156.ln'],
                         'st': ['1234', '125', '156']})
spark_df = sparkSession.createDataFrame(pd_input)
# Replace "PA" + st with "PA9999" row by row, so only the PA code is touched
replace_udf = udf(lambda address, st: address.replace('PA' + st, 'PA9999'), StringType())
spark_df.withColumn('address_new', replace_udf(col('address'), col('st'))).show()
Output:
+-----------+----+------------+
|    address|  st| address_new|
+-----------+----+------------+
|2.PA1234.la|1234| 2.PA9999.la|
|10.PA125.la| 125|10.PA9999.la|
| 2.PA156.ln| 156| 2.PA9999.ln|
+-----------+----+------------+
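The function wrapped by the UDF is ordinary Python, so its row-level logic can be checked without a Spark session. The sketch below is only an illustration: `replace_station` is a hypothetical name, and anchoring the search on the "PA" prefix is an assumption taken from the "PA" + st pattern in the question.

```python
def replace_station(address, st, new_code="9999"):
    """Swap "PA" + st for "PA" + new_code inside address."""
    return address.replace("PA" + st, "PA" + new_code)


# Rows copied from the question's sample data.
rows = [("2.PA1234.la", "1234"), ("10.PA125.la", "125"), ("2.PA156.ln", "156")]
results = [replace_station(address, st) for address, st in rows]
print(results)

# Hypothetical edge case: st also appears elsewhere in the address.
# Searching for "PA" + st leaves the leading "125" untouched.
edge = replace_station("125.PA125.la", "125")
print(edge)
```

Searching for the bare value of `st` instead would also rewrite the leading "125" in the edge case, which is why the prefix is included in the search string here.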