
How to loop over a DataFrame column in Databricks using PySpark

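I want to lemmatize a sentence, as shown below. It works fine for a single sentence, as shown in the image.

Referring to this image, I want to lemmatize an entire DataFrame column of strings, but doing so raises an error.

I want to apply lemmatization to a DataFrame column using PySpark running in Databricks. Refer to the images for the error.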

Import lemmatize_sentence() inside the function that you wrap as a UDF, then create the UDF; that should work. You are getting this error because the import runs only on the driver node, not on the rest of the cluster. When the import is inside the function, Spark sends a copy of that function to the nodes in the cluster, and the import then runs on each of them.
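A minimal sketch of that pattern, under some assumptions: the body of lemmatize_sentence() is not shown in the question, so the version here is a toy stand-in that only strips a trailing "s", and `re` stands in for whatever NLP library the real function imports. The helper name add_lemma_column and the default column name "lemmatized" are hypothetical.

```python
def lemmatize_sentence(sentence):
    """Toy stand-in for the question's lemmatize_sentence(); the real body is not shown."""
    # The import sits inside the function body, so it runs on whichever
    # worker executes the UDF. Replace `re` with the real NLP import; that
    # library also has to be installed on the cluster.
    import re

    if sentence is None:  # Spark passes NULL cells to a Python UDF as None
        return None
    words = re.findall(r"[a-z]+", sentence.lower())
    # Toy rule (assumption): drop a trailing "s"; a real lemmatizer goes here.
    return " ".join(w[:-1] if len(w) > 3 and w.endswith("s") else w for w in words)


def add_lemma_column(df, source_col, target_col="lemmatized"):
    """Apply lemmatize_sentence to every row of source_col through a UDF."""
    from pyspark.sql.functions import col, udf
    from pyspark.sql.types import StringType

    lemmatize_udf = udf(lemmatize_sentence, StringType())
    return df.withColumn(target_col, lemmatize_udf(col(source_col)))
```

In a Databricks notebook this could be called as `df = add_lemma_column(df, "text")`, where "text" is a hypothetical name for the string column. No explicit loop over the column is needed, because Spark applies the UDF to each row.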

