Iterating over rows and columns to add counts in Pandas

Question

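I am trying to iterate over the columns and rows of a Pandas DataFrame, cross-reference them against a list I have, and count co-occurrences.

My DataFrame looks like this: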

+-------+-----+-----+----+----+-------+-------+------+
| Lemma | Dog | Cat | Sg | Pl |  Good |  Okay |  Bad |
+-------+-----+-----+----+----+-------+-------+------+
| Dog   |   0 |   0 |  0 |  0 |   0   |   0   |  0   |
| Cat   |   0 |   0 |  0 |  0 |   0   |   0   |  0   |
+-------+-----+-----+----+----+-------+-------+------+

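I have a list like this:

c = [[dog, Sg, Good], [cat, Pl, Okay], [dog, Pl, Bad]]

I want to iterate over each item in Lemma, find it in c, and then look in that list item for any of the column names. Whenever I see one of those column names, I add 1 to it. I would also like to add a count when lemma items occur within a 3-word window of each other.

I have tried something like the following (ignoring the word-window problem):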

for idx, row in df.iterrows():
    for columns in df:
        for i in c:
            if i[0]==row:
                if columns in c[1]:
                    df.ix['columns','row'] +=1

But I get an error: "ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()."

My ideal result looks like this:

+-------+-----+-----+----+----+-------+-------+------+
| Lemma | Dog | Cat | Sg | Pl |  Good |  Okay |  Bad |
+-------+-----+-----+----+----+-------+-------+------+
| Dog   |   1 |   1 |  1 |  1 |   1   |   0   |  1   |
| Cat   |   2 |   0 |  0 |  1 |   0   |   1   |  0   |
+-------+-----+-----+----+----+-------+-------+------+

Thanks!

Answer 1

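There are several things you need to change.

1) Your list probably needs Dog instead of dog, and Cat instead of cat, so that it matches the values in the Lemma column.

2) You probably want: for column in df.columns instead of for columns in df

3) You probably want: if i[0] == row['Lemma'] instead of if i[0]==row: (this is where it breaks, because row is a whole Series, not a single value)

4) You probably want: if column in i instead of if columns in c[1]

5) You probably want: df.ix[idx, column] += 1 instead of df.ix['columns','row'] +=1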
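Putting the five changes together gives roughly the loop below. This is only a sketch under a few assumptions: the DataFrame is rebuilt here from the table in the question, the lemmas in c are capitalised as point 1 suggests, and `.loc` is used in place of `.ix`, because `.ix` was removed in pandas 1.0.

```python
import pandas as pd

# Capitalised lemmas so they match the Lemma column (point 1).
c = [["Dog", "Sg", "Good"], ["Cat", "Pl", "Okay"], ["Dog", "Pl", "Bad"]]

# Rebuild the all-zero frame from the question (assumed layout).
count_columns = ["Dog", "Cat", "Sg", "Pl", "Good", "Okay", "Bad"]
df = pd.DataFrame(0, index=range(2), columns=count_columns)
df.insert(0, "Lemma", ["Dog", "Cat"])

for idx, row in df.iterrows():
    for column in count_columns:            # skip the Lemma label column
        for i in c:
            # Compare against the single Lemma value, not the whole row.
            if i[0] == row["Lemma"] and column in i:
                df.loc[idx, column] += 1    # .loc replaces the removed .ix

print(df)
```

With this input the Dog row ends up with Dog=2, Sg=1, Pl=1, Good=1, Bad=1, and the Cat row with Cat=1, Pl=1, Okay=1. The 3-word window count from the question is not handled here.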

Answer 2

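The ideal result shown in the question is not accurate. There should never be a cat count in the dog row, and vice versa.
I would not iterate over the DataFrame at all. Instead, I would unpack the list of lists into a dict and then load the dict into a DataFrame, as shown below.

Code: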

import pandas as pd

c=[['dog', 'Sg', 'Good'], ['cat', 'Pl', 'Okay'], ['dog', 'Pl', 'Bad'],
   ['dog', 'Sg', 'Good'], ['cat', 'Pl', 'Okay'], ['dog', 'Pl', 'Okay'],
   ['dog', 'Sg', 'Good'], ['cat', 'Sg', 'Good'], ['dog', 'Pl', 'Bad'],
   ['dog', 'Sg', 'Good'],['cat', 'Pl', 'Okay'], ['dog', 'Pl', 'Bad']]

Lemma = {'dog': {'dog': 0, 'Sg': 0, 'Pl': 0, 'Good': 0, 'Okay': 0, 'Bad': 0},
         'cat': {'cat': 0, 'Sg': 0, 'Pl': 0, 'Good': 0, 'Okay': 0, 'Bad': 0}}

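Note: every value in the sublists of c is a key in Lemma. See Python dictionaries for reference. For example, when x = ['dog', 'Sg', 'Good'], Lemma[x[0]][x[2]] is the same as Lemma['dog']['Good']. The initial value of Lemma['dog']['Good'] is 0, so the first update gives Lemma['dog']['Good'] = 0 + 1, the next gives 1 + 1, and so on.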

for x in c:
    Lemma[x[0]][x[0]] = Lemma[x[0]][x[0]] + 1
    Lemma[x[0]][x[1]] = Lemma[x[0]][x[1]] + 1
    Lemma[x[0]][x[2]] = Lemma[x[0]][x[2]] + 1

df = pd.DataFrame.from_dict(Lemma, orient='index')

Output:

Plot:

df.plot(kind='bar', figsize=(6, 6))
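One detail about the frame built above: each inner dict holds only its own lemma key, so the dog/cat cross cells come out as NaN rather than 0. A minimal sketch of one way to get integer zeros instead is shown below. It swaps the hand-written zero dict for `collections.Counter`, which is my substitution and not part of the original answer, and it uses the three-item list from the question to keep the counts easy to check.

```python
from collections import Counter

import pandas as pd

c = [['dog', 'Sg', 'Good'], ['cat', 'Pl', 'Okay'], ['dog', 'Pl', 'Bad']]

counts = {}
for lemma, *features in c:
    counter = counts.setdefault(lemma, Counter())
    counter[lemma] += 1         # count the lemma itself
    counter.update(features)    # count every other word in the sublist

# Plain dicts for from_dict; missing cells become NaN, then 0.
df = pd.DataFrame.from_dict(
    {lemma: dict(counter) for lemma, counter in counts.items()},
    orient='index',
).fillna(0).astype(int)

print(df)
```

A Counter treats missing keys as 0, so no zero-filled template is needed up front.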

Creating the `dict` programmatically:

Create `sets` of the words to use as `dict` `keys` from the `list` of `lists`:

outer_keys = set()
inner_keys = set()
for x in c:
    outer_keys.add(x[0])  # first word is outer key
    inner_keys |= set(x[1:])  # all other words

Create the `dict` of `dicts`:

Lemma = {j: dict.fromkeys(inner_keys | {j}, 0) for j in outer_keys}

The final `dict`:

{'dog': {'Okay': 0, 'Pl': 0, 'Good': 0, 'Bad': 0, 'Sg': 0, 'dog': 0},
 'cat': {'Okay': 0, 'Pl': 0, 'Good': 0, 'Bad': 0, 'Sg': 0, 'cat': 0}}
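For completeness, the programmatic `dict` can be combined with the counting loop from earlier in this answer. The sketch below assumes the three-item list from the question and that the words within each sublist are distinct. The inner loop over `x` is just a compact way of writing the three separate update lines.

```python
import pandas as pd

c = [['dog', 'Sg', 'Good'], ['cat', 'Pl', 'Okay'], ['dog', 'Pl', 'Bad']]

outer_keys = set()
inner_keys = set()
for x in c:
    outer_keys.add(x[0])        # first word is the outer key
    inner_keys |= set(x[1:])    # all other words are inner keys

# One zero-filled inner dict per lemma, including the lemma itself.
Lemma = {j: dict.fromkeys(inner_keys | {j}, 0) for j in outer_keys}

for x in c:
    for word in x:              # x[0], x[1], x[2] in turn
        Lemma[x[0]][word] += 1

df = pd.DataFrame.from_dict(Lemma, orient='index')
print(df)
```

`dict.fromkeys` is safe here because the shared default value 0 is immutable, and each lemma gets its own inner dict.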
