
Trying to create a bag of words from a pandas DataFrame

I am new to pandas (and somewhat new to Python) and am trying to create a bag of words for every row of a specific column. This is where I took the code from and what follows is my attempt:

for index, row in df.iterrows():
    cell = df.Review2.iloc[index]
    df['BOW'].iloc[index] = pd.Series([y for x in cell for y in x.split()]).value_counts()

This is a single cell from my DataFrame on which I'd like to perform the above operation (that is, without the for loop that iterates over all the rows):

problem price say discount 6 bottle even show reduce check changesfive star taste goodthis get best cabinet ever great crisp get best cabinet ever great crisp originally buy three bottle wind buy whole case holidaysnice california cab cab fantastic pleasantly surprise great fullbodied flavor 1 cent ship promotion decent

Any help is greatly appreciated!

import pandas as pd
from collections import Counter

df = pd.DataFrame({'review': ['Hello World Hello', 'Hi Bye Bye Bye']})
# Build one Counter (word -> count) per row; split() splits on any run of whitespace
df['BOW'] = df.review.apply(lambda x: Counter(x.split()))
print(df)


              review                         BOW
0  Hello World Hello  {u'World': 1, u'Hello': 2}
1     Hi Bye Bye Bye       {u'Bye': 3, u'Hi': 1}

I used the pandas apply method to process every row without iterating over the rows explicitly.
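The same idea can be applied to the column from the question. What follows is only a minimal sketch: the column name Review2 comes from the question, the two sample rows are made up for illustration, and bow_matrix is a hypothetical name for an optional second step that spreads the counters into one column per word.

```python
from collections import Counter

import pandas as pd

# Made-up sample rows; only the column name Review2 comes from the question.
df = pd.DataFrame({"Review2": ["great crisp great", "buy three bottle buy"]})

# One Counter per row. Each cell is a plain string, so split it directly
# instead of looping over its characters.
df["BOW"] = df["Review2"].apply(lambda text: Counter(text.split()))

# Optional step: expand the counters into a word-count table,
# one column per word, with 0 where a word does not occur in a row.
bow_matrix = pd.DataFrame(df["BOW"].tolist(), index=df.index).fillna(0).astype(int)
print(bow_matrix)
```

Keeping the Counter objects in a column is convenient for per-row lookups, while the expanded table is usually easier to feed into further numeric processing.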

