TypeError: 'Series' objects are mutable, thus they cannot be hashed

Question

I know this error is common. I tried some solutions I looked up and still can't understand what is wrong. I guess it is due to the mutable form of row and row1, but I can't figure it out.

What am I trying to do? I have 2 dataframes. I need to iterate over the rows of the first one, and for each row of the first one iterate through the second and check the value of the cell for some columns. My code and different attempts:

a=0
b=0
for row in Correction.iterrows():
    b+=1
    for row1 in dataframe.iterrows():
        c+=1
        a=0
        print('Handling correction '+str(b)+' and deal '+str(c))
        if (Correction.loc[row,['BO Branch Code']]==dataframe.loc[row1,['wings Branch']]
                and Correction.loc[row,['Profit Center']]==dataframe.loc[row1,['Profit Center']]
                and Correction.loc[row,['Back Office']]==dataframe.loc[row1,['Back Office']]
                and Correction.loc[row,['BO System Code']]==dataframe.loc[row1,['BO System Code']]):

I also tried

a=0
b=0
for row in Correction.iterrows():
    b+=1
    for row1 in dataframe.iterrows():
        c+=1
        a=0
        print('Handling correction '+str(b)+' and deal '+str(c))
        if (Correction[row]['BO Branch Code']==dataframe[row1]['wings Branch']
                and Correction[row]['Profit Center']==dataframe[row1]['Profit Center']
                and Correction[row]['Back Office']==dataframe[row1]['Back Office']
                and Correction[row]['BO System Code']==dataframe[row1]['BO System Code']):

And

a=0
b=0
for row in Correction.iterrows():
    b+=1
    for row1 in dataframe.iterrows():
        c+=1
        a=0
        print('Handling correction '+str(b)+' and deal '+str(c))
        if (Correction.loc[row,['BO Branch Code']]==dataframe[row1,['wings Branch']]
                and Correction[row,['Profit Center']]==dataframe[row1,['Profit Center']]
                and Correction[row,['Back Office']]==dataframe[row1,['Back Office']]
                and Correction[row,['BO System Code']]==dataframe[row1,['BO System Code']]):

Answer 1

I found a workaround by changing my for loop. My code is now:

a=0
b=0
c=0
for index in Correction.index:
    b+=1
    for index1 in dataframe.index:
        c+=1
        a=0
        print('Handling correction '+str(b)+' and deal '+str(c))
        # .loc with a single row label and a single column name returns a scalar,
        # so each comparison is a plain boolean that can be chained with `and`
        if (Correction.loc[index,'BO Branch Code']==dataframe.loc[index1,'wings Branch']
                and Correction.loc[index,'Profit Center']==dataframe.loc[index1,'Profit Center']
                and Correction.loc[index,'Back Office']==dataframe.loc[index1,'Back Office']
                and Correction.loc[index,'BO System Code']==dataframe.loc[index1,'BO System Code']):
            pass  # handle the matching correction/deal pair here
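
For anyone wondering why the original loops failed: each item yielded by iterrows() is a tuple of (index label, row Series), and passing that whole tuple as a label forces pandas to hash the Series inside it. Here is a minimal sketch of that behaviour. The two-row frame and the helper name hash_error_name are made up for illustration, and the exact error text may differ between pandas versions.

```python
import pandas as pd

# Hypothetical stand-in for the asker's Correction frame.
correction = pd.DataFrame({"BO Branch Code": [10, 20], "Profit Center": ["A", "B"]})


def hash_error_name(item):
    """Return the name of the exception raised by hash(item), or None."""
    try:
        hash(item)
    except TypeError as exc:
        return type(exc).__name__
    return None


# Each item from iterrows() is a tuple: (index label, row as a Series).
first_item = next(correction.iterrows())
item_is_tuple = isinstance(first_item, tuple)

# Hashing the tuple hashes the Series inside it, which is not allowed.
error_name = hash_error_name(first_item)

# Unpacking the tuple gives a plain label that .loc accepts.
branch_codes = [correction.loc[idx, "BO Branch Code"] for idx, _row in correction.iterrows()]
```

Unpacking with "for idx, row in df.iterrows()" or looping over df.index, as in the answer above, both avoid using a Series as a lookup key.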

Answer 2

I think you are iterating over your df incorrectly.

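# iterrows() yields (index, Series) pairs; unpack them so that
# row and row1 are Series that can be indexed by column name
for idx, row in Correction.iterrows():
    bo_branch_code = row['BO Branch Code']
    for idx1, row1 in dataframe.iterrows():
        if row1['wings Branch'] == bo_branch_code:
            pass  # do stuff here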

Reference on how to iterate over a DataFrame: https://github.com/vi3k6i5/pandas_basics/blob/master/2.A%20Iterate%20over%20a%20dataframe.ipynb
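
A side note on itertuples(): its rows are tuples, so they cannot be indexed with a column-name string the way an iterrows() Series can, and column names containing spaces are not usable as attribute names either. One possible way around that, sketched here with a made-up two-row frame, is to request plain tuples and look the column position up once.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
correction = pd.DataFrame({"BO Branch Code": [1, 2], "Profit Center": ["PC1", "PC2"]})

# Position of the column inside each row tuple (valid because index=False below).
branch_pos = correction.columns.get_loc("BO Branch Code")

# name=None makes itertuples() yield plain tuples instead of namedtuples.
codes = [row[branch_pos] for row in correction.itertuples(index=False, name=None)]
```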

I timed your index approach and the iterrows approach. Here are the results:

import pandas as pd
import numpy as np
import time

# Two small random frames to loop over
df = pd.DataFrame(np.random.randint(0,100,size=(10, 4)), columns=list('ABCD'))

df_2 = pd.DataFrame(np.random.randint(0,100,size=(10, 4)), columns=list('ABCD'))

def test_time():
    # Index approach: look up every cell with .loc
    for index in df.index:
        for index1 in df_2.index:
            if (df.loc[index, 'A'] == df_2.loc[index1, 'A']):
                continue

def test_time_2():
    # iterrows approach: read the values from the yielded row Series
    for idx, row in df.iterrows():
        a_val = row['A']
        for idy, row_1 in df_2.iterrows():
            if (a_val == row_1['A']):
                continue

# time.clock() was removed in Python 3.8; time.perf_counter() replaces it
start= time.perf_counter()
test_time()
end= time.perf_counter()
print(end-start)
# 0.038514999999999855

start= time.perf_counter()
test_time_2()
end= time.perf_counter()
print(end-start)
# 0.009272000000000169

Simply put, iterrows was about four times faster than your approach in this small test.

For good approaches to looping over a dataframe, see "What is the most efficient way to loop through dataframes with pandas?"
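
If the goal is only to find the rows where all four columns agree, the nested loops might be avoidable altogether. This is a different technique from the loops above: a rough sketch using DataFrame.merge, assuming the column names from the question. The sample values and the variable names deals and matches are invented for illustration.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
correction = pd.DataFrame({
    "BO Branch Code": [1, 2],
    "Profit Center": ["PC1", "PC2"],
    "Back Office": ["X", "Y"],
    "BO System Code": ["S1", "S2"],
})
deals = pd.DataFrame({
    "wings Branch": [1, 2, 2],
    "Profit Center": ["PC1", "PC2", "PC9"],
    "Back Office": ["X", "Y", "Y"],
    "BO System Code": ["S1", "S2", "S2"],
})

# An inner merge keeps only the correction/deal pairs where every key column matches.
matches = correction.merge(
    deals,
    left_on=["BO Branch Code", "Profit Center", "Back Office", "BO System Code"],
    right_on=["wings Branch", "Profit Center", "Back Office", "BO System Code"],
    how="inner",
)
```

Each row of matches corresponds to one pair for which the if condition in the loops would have been true.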

Answer 1
0 2017-02-28 09:43:01

Answer 2
0 2017-02-28 09:46:10
