
Trying to create a bag of words from a pandas DataFrame

I am new to pandas (and somewhat new to Python) and am trying to create a bag of words for every row of a specific column. I adapted the code from an existing example, and what follows is my attempt:

for index, row in df.iterrows():
    cell = df.Review2.iloc[index]
    df['BOW'].iloc[index] = pd.Series([y for x in cell for y in x.split()]).value_counts()

This is a single cell from my DataFrame on which I'd like to perform the above operation (that is, without the for loop that iterates over all the rows):

problem price say discount 6 bottle even show reduce check changesfive star taste goodthis get best cabinet ever great crisp get best cabinet ever great crisp originally buy three bottle wind buy whole case holidaysnice california cab cab fantastic pleasantly surprise great fullbodied flavor 1 cent ship promotion decent

Any help is greatly appreciated!

import pandas as pd
from collections import Counter
df = pd.DataFrame({'review': ['Hello World Hello', 'Hi Bye Bye Bye']})
df['BOW'] = df.review.apply(lambda x: Counter(x.split(" ")))


              review                         BOW
0  Hello World Hello  {u'World': 1, u'Hello': 2}
1     Hi Bye Bye Bye       {u'Bye': 3, u'Hi': 1}

I used the pandas apply method to process all the rows without iterating over them explicitly. Each cell of the new BOW column holds a Counter that maps every word in that row's review to the number of times it occurs.
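As a rough sketch of how this maps onto the question, the snippet below counts words for a single cell first and then for the whole column. The sample strings are made up for illustration, and the column name Review2 is taken from the question. It also shows one likely reason the original attempt misbehaves, assuming each cell is a plain string: iterating over a string yields single characters, so the nested comprehension counts letters rather than words.

```python
from collections import Counter

import pandas as pd

# Hypothetical sample data; only the column name Review2 comes from the question.
df = pd.DataFrame({"Review2": ["great crisp great", "buy three bottle buy"]})

# Single cell: split on whitespace and count the resulting tokens.
cell = df["Review2"].iloc[0]
single_bow = Counter(cell.split())

# Whole column: apply the same operation to every row, with no explicit loop.
df["BOW"] = df["Review2"].apply(lambda text: Counter(text.split()))

# For comparison, the comprehension from the question applied to a string:
# "for x in cell" walks over characters, so this counts letters, not words.
char_counts = Counter(y for x in cell for y in x.split())

print(single_bow)
print(df)
print(char_counts)
```

Calling split() with no argument splits on any run of whitespace, which avoids empty tokens when a review contains double spaces. That is a small design choice and not part of the original answer, which uses split(" ").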


 