Faster pandas apply using modin.pandas

I'm trying to use all cores for this apply call by switching to modin.pandas:

import pandas as pd  # the question swaps this for modin.pandas
from nltk.sentiment.vader import SentimentIntensityAnalyzer

sid = SentimentIntensityAnalyzer()
# Sentiment scores of each essay: polarity_scores returns a dict with the
# keys 'neg', 'neu', 'pos' and 'compound', so call it once per essay.
scores = data.essay.apply(lambda s: pd.Series(sid.polarity_scores(s)))
data = data.merge(scores, left_index=True, right_index=True)

It works with default pandas, but using modin raises this error:

ValueError: can not merge DataFrame with instance of type <class 'modin.pandas.series.Series'>

essay is a text column in the DataFrame named "data".

As the answers to this question suggest, you are likely getting this error because you are merging a pandas.DataFrame with a modin.pandas.Series. For your example, try converting data to a Modin DataFrame with modin.pandas.DataFrame(data) before the merge, so that both sides come from the same library.
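A minimal sketch of that suggestion, under some assumptions: the helper add_sentiment_columns, its pd_mod parameter, and the toy_scores stand-in are hypothetical names introduced only for illustration, and the demo runs on plain pandas so it works without Modin or the VADER lexicon installed. The idea is to build both sides of the merge from the same library, and to pass modin.pandas as pd_mod and sid.polarity_scores as score_fn in the real case.

```python
import pandas  # plain pandas, used here only so the sketch runs anywhere


def add_sentiment_columns(data, score_fn, pd_mod=pandas):
    """Return `data` with one extra column per key in score_fn's result.

    pd_mod is the DataFrame library to use (hypothetical parameter):
    pass modin.pandas to run the apply step with Modin. Both sides of
    the merge are built from pd_mod, so they are the same kind of object.
    """
    frame = pd_mod.DataFrame(data)  # the conversion suggested in the answer
    # Score each essay once; the result is a Series of dicts.
    scored = frame["essay"].apply(score_fn)
    # Expand the dicts into a DataFrame that shares frame's index.
    scores = pd_mod.DataFrame(scored.tolist(), index=frame.index)
    return frame.merge(scores, left_index=True, right_index=True)


def toy_scores(text):
    # Hypothetical stand-in for sid.polarity_scores: same keys, fake values.
    n = float(len(text.split()))
    return {"neg": 0.0, "neu": 1.0, "pos": 0.0, "compound": n}


data = pandas.DataFrame({"essay": ["good book", "a very long essay"]})
result = add_sentiment_columns(data, toy_scores)
print(list(result.columns))
```

With Modin installed, the call would presumably look like add_sentiment_columns(data, sid.polarity_scores, pd_mod=modin.pandas). Returning a plain dict from the applied function, rather than a pd.Series per row, also avoids building one Series object per essay.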

