How to output column values from a PySpark DataFrame into a string?

I'm working with a dataset and want to create a textblob of all values of a particular column called 'text'. I tried the following methods:

xp = positive.select("text").collect().map(_(0)).toList
# positive is the DataFrame's name, 'text' is the column name
xp = " ".join(positive['text'])

Neither method has worked so far. The first one returns this error:

Traceback (most recent call last):
AttributeError: 'list' object has no attribute 'map'

You seem to be using Scala syntax. In PySpark, collect() returns a plain Python list of Row objects, and list has no map method, which is what the error is telling you. Instead, access the text attribute of each Row with a generator expression:

' '.join(row.text for row in positive.select('text').collect())
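
To show how the generator expression treats each collected row, here is a minimal sketch that runs without a Spark session. It assumes a namedtuple as a stand-in for pyspark.sql.Row (which likewise supports both attribute and positional access), and the sample values are made up for illustration. With a real DataFrame, rows would come from positive.select('text').collect().

```python
from collections import namedtuple

# Stand-in for pyspark.sql.Row so the sketch runs without Spark (assumption).
Row = namedtuple("Row", ["text"])

# Hypothetical sample of what positive.select('text').collect() might return.
rows = [Row("great product"), Row(None), Row("would buy again")]

# Attribute access, skipping nulls: str.join raises TypeError on None.
joined = " ".join(row.text for row in rows if row.text is not None)

# Positional access, the Python counterpart of Scala's _(0).
joined_by_index = " ".join(row[0] for row in rows if row[0] is not None)

print(joined)
```

The null filter is only needed if the column can contain missing values. Note also that collect() brings every row to the driver, so this approach suits columns that fit comfortably in driver memory.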
