I am trying to learn PySpark and would like to solve this either with its SQL functionality or with a DataFrame groupBy.
Thanks.
df1:
Name Place Product
AA Germany pencil
AA Germany pen
AA Germany pen
BB Holland hat
BB Holland hat
BB Holland pen
CC USA laptop
CC USA laptop
CC USA charger
Expected output:
Name Place Product
AA Germany pencil, pen
BB Holland hat, pen
CC USA laptop, charger
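You can use collect_set, which gathers the distinct values of a column within each group (the order of the collected values is not guaranteed):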
from pyspark.sql.functions import collect_set, concat_ws
df1.groupBy("Name", "Place").agg(concat_ws(", ", collect_set("Product")).alias("Product"))