Python for-loop to change row value based on a condition works correctly but does not change the values on pandas dataframe?

I am just getting into Python, and I am trying to write a for-loop that iterates over every row and, based on a given condition, randomly selects two columns on each iteration and changes their values. The for-loop runs without any problems; however, the results don't change on the dataframe.

A reproducible example:

import pandas as pd

df = pd.DataFrame({'A': [10,40,10,20,10],
                   'B': [10,10,50,40,50],
                   'C': [10,20,10,10,10],
                   'D': [10,30,10,10,50],
                   'E': [10,10,40,10,10],
                   'F': [2,3,2,2,3]})

df:


    A   B   C   D   E   F
0   10  10  10  10  10  2
1   40  10  20  30  10  3
2   10  50  10  10  40  2
3   20  40  10  10  10  2
4   10  50  10  50  10  3

This is my for-loop. It iterates over all rows and checks whether the value in column F equals 2; if so, it randomly selects two columns whose value is 10 and adds 100 to each of them.

for index, i in df.iterrows():
  if i['F'] == 2:
    i[i==10].sample(2, axis=0)+100
    print(i[i==10].sample(2, axis=0)+100)

This is the output of the loop:

E    110
C    110
Name: 0, dtype: int64
C    110
D    110
Name: 2, dtype: int64
C    110
D    110
Name: 3, dtype: int64

This is what the dataframe is expected to look like:

df:


    A   B   C   D   E   F
0   10  10  110 10  110 2
1   40  10  20  30  10  3
2   10  50  110 110 40  2
3   20  40  110 110 10  2
4   10  50  10  50  10  3

However, the columns on the dataframe are not changed. Any idea what's going wrong?

This line:

i[i==10].sample(2, axis=0)+100

.sample returns a new object (here a Series, since i is a single row), and the result of adding 100 is never assigned to anything, so the original dataframe (df) was not updated at all.
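A minimal sketch of that behaviour, using a hypothetical single-row Series named row and a fixed random_state (both are assumptions added only so the example is reproducible):

```python
import pandas as pd

# Hypothetical stand-in for one row yielded by df.iterrows()
row = pd.Series({'A': 10, 'B': 10, 'C': 10, 'F': 2}, name=0)

# Sampling and adding 100 builds a brand-new Series...
picked = row[row == 10].sample(2, random_state=0)
bumped = picked + 100

# ...so the source Series still holds its original values.
print(row.tolist())     # [10, 10, 10, 2]
print(bumped.tolist())  # [110, 110]
```

The new values only exist in bumped; unless they are written back somewhere, they are discarded.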

Try this:

for index, i in df.iterrows():
    if i['F'] == 2:
        cond = (i == 10)

        # You can only sample 2 columns if at least
        # 2 columns in this row meet the condition
        if cond.sum() >= 2:
            idx = i[cond].sample(2).index
            # iterrows() may yield a copy of the row, so write to df itself
            df.loc[index, idx] += 100
            print(df.loc[index, idx])
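To see why the size check matters, here is a small sketch with a hypothetical row that contains only one value equal to 10. Without replacement, .sample cannot return more items than exist and raises a ValueError:

```python
import pandas as pd

# Hypothetical row in which only one entry equals 10
row = pd.Series({'A': 10, 'B': 50, 'C': 20, 'F': 2})
matches = row[row == 10]

failed = False
try:
    matches.sample(2)   # asks for more items than are available
except ValueError as exc:
    failed = True
    print("sampling failed:", exc)
```

None of the rows in the example dataframe trigger this, but real data with fewer matching columns would stop the loop partway through.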

You should not modify the original df in place while iterating over it. Make a copy, iterate over the original, and write the changes to the copy:

df2 = df.copy()
for index, i in df.iterrows():
    if i['F'] == 2:
        s = i[i==10].sample(2, axis=0)+100
        df2.loc[index,i.index.isin(s.index)] = s


 