Pandas str.count()

Question

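I have a DataFrame with two columns, and I am trying to create a third column that counts how many times the value in the first column occurs in the second column.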

sample_df =

Object  Text
Banana  Banana Banana Banana
Banana  Apple Apple Apple
Apple   Banana Apple

Right now I am trying the following code:

sample_df['Mentions'] = sample_df['Text'].count(sample_df['Object'])

which produces the following error:

AttributeError                            Traceback (most recent call last)
<ipython-input-65-c9ae4ce28088> in <module>()
----> 1 sample_df['Mentions'] = sample_df['Text'].count(sample_df['Object'])

/usr/local/lib/python2.7/dist-packages/pandas/core/series.pyc in count(self, level)
1177             level = self.index._get_level_number(level)
1178 
-> 1179         lev = self.index.levels[level]
1180         lab = np.array(self.index.labels[level], subok=False, copy=True)
1181 

AttributeError: 'RangeIndex' object has no attribute 'levels'

Answer 1

If you read the documentation for pd.Series.count, you will see that it does not do what you think it does:

Series.count(level=None)

Return number of non-NA/null observations in the Series

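You passed a pandas Series as the level argument, which is invalid, and that is why you get the error. For your use case, try the following instead (df here is the sample_df from the question):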

df['counter'] = df.apply(lambda x: x.Text.count(x.Object), axis=1)

   Object                  Text  counter
0  Banana  Banana Banana Banana        3
1  Banana     Apple Apple Apple        0
2   Apple          Banana Apple        1
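To make the distinction concrete, here is a minimal sketch contrasting Series.count, Series.str.count, and the row-wise approach. It rebuilds the question's sample data; the variable names (non_null, apple_hits, per_row) are illustrative only and do not come from the original posts.

```python
import pandas as pd

# Rebuild the sample data from the question.
df = pd.DataFrame({
    "Object": ["Banana", "Banana", "Apple"],
    "Text": ["Banana Banana Banana", "Apple Apple Apple", "Banana Apple"],
})

# Series.count() takes no pattern: it only counts non-null entries.
non_null = df["Text"].count()

# Series.str.count(pat) counts regex matches of ONE pattern in every row.
apple_hits = df["Text"].str.count("Apple").tolist()

# A different pattern per row needs the two columns paired row by row.
per_row = [text.count(obj) for text, obj in zip(df["Text"], df["Object"])]

print(non_null)    # 3
print(apple_hits)  # [0, 3, 1]
print(per_row)     # [3, 0, 1]
```

Because Series.str.count applies a single pattern to the whole column, it cannot by itself use the value from another column in each row, which is why the answer pairs the columns with apply.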

If you care about performance, you can also use a simple list comprehension here:

df['counter'] = [i.count(j) for i, j in zip(df.Text, df.Object)]

Timings (use the list comprehension :D)

df = pd.concat([df]*10000)

%timeit df.apply(lambda x: x.Text.count(x.Object), axis=1)
1.14 s ± 14.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit [i.count(j) for i, j in zip(df.Text, df.Object)]
6.71 ms ± 25 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Answer 2

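from collections import Counter

def count(row):
    # Split the text on whitespace and tally each whole word.
    words = Counter(row['Text'].split())
    # A Counter returns 0 for a word that never appears.
    return words[row['Object']]

# Apply row-wise; label-based access avoids relying on column order.
df['Mentions'] = df.apply(count, axis=1)
print(df)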

   Object                  Text  Mentions
0  Banana  Banana Banana Banana         3
1  Banana     Apple Apple Apple         0
2   Apple          Banana Apple         1
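The two answers agree on the sample data, but they are not equivalent in general: str.count counts substring occurrences, while a Counter over the split text counts whole words only. A small sketch of the difference, using a made-up string that is not part of the question's data:

```python
from collections import Counter

# Hypothetical text, chosen so the two approaches disagree.
text = "Banana Bananas Banana-split"
target = "Banana"

# str.count finds every non-overlapping substring match.
substring_hits = text.count(target)

# Counter over whitespace-separated tokens matches whole words only.
token_hits = Counter(text.split())[target]

print(substring_hits)  # 3
print(token_hits)      # 1
```

Which behaviour is right depends on whether "Bananas" should count as a mention of "Banana".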
