
Value not updated in for loop Python

I am testing the following simple example (see the comments in the code below for background). I have two questions. Thanks.

  • How come b in bottle is not updated even though the for loop did calculate the right value?
  • Is there an easier way to do this without using a for loop? I heard that loops can take a long time to run when the data is bigger than this simple example.
import pandas as pd

# Original df
test = pd.DataFrame(
    [[1, 5], [1, 8], [1, 9], [2, 1], [3, 1], [4, 1]],
    columns=['a', 'b']
)

bottle = pd.DataFrame().reindex_like(test)  # a blank df with the same shape
bottle['a'] = test['a']  # set 'a' in bottle to be the same as in test
print(bottle)

   a   b
0  1 NaN
1  1 NaN
2  1 NaN
3  2 NaN
4  3 NaN
5  4 NaN

for index, row in bottle.iterrows():
    row['b'] = test[test['a'] == row['a']]['b'].sum()
    print(row['a'], row['b'])

1.0 22.0
1.0 22.0
1.0 22.0
2.0 1.0
3.0 1.0
4.0 1.0

# I can see the for loop is doing what I need.

bottle

   a   b
0  1 NaN
1  1 NaN
2  1 NaN
3  2 NaN
4  3 NaN
5  4 NaN

# However, 'b' in bottle is not updated by the for loop. Why? And how to fix that?

test['c'] = bottle['b']
# This is the end output I want to get, but it is not working due to the above.
# Also, is there a way to achieve this without using a for loop?

When you iterate over the DataFrame's rows with iterrows, your row variable is a copy of the current row, local to that iteration of the for loop. On the next iteration the name row is rebound to a new copy, so the changes you made to the previous one are discarded and never reach bottle. If you want your for loop to work, you should assign to bottle.loc[index, "b"] instead of to row["b"].
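The copy behaviour described above can be checked directly. The snippet below is a minimal sketch that reuses the question's test and bottle frames; the value 99 and the name untouched are made up for illustration.

```python
import pandas as pd

test = pd.DataFrame(
    [[1, 5], [1, 8], [1, 9], [2, 1], [3, 1], [4, 1]],
    columns=["a", "b"],
)
bottle = pd.DataFrame().reindex_like(test)
bottle["a"] = test["a"]

for index, row in bottle.iterrows():
    # row is a separate Series built for this iteration, so writing to it
    # changes only that Series, not bottle.
    row["b"] = 99  # arbitrary illustrative value

# True if every value in bottle["b"] is still NaN after the loop
untouched = bool(bottle["b"].isna().all())
print(untouched)
```

If untouched comes out as True, none of the writes to row reached bottle, which matches what the question observed.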

You can complete your task without a for loop by using pandas.DataFrame.groupby and transform as follows:

bottle["b"] = test.groupby("a")["b"].transform("sum")

bottle:

   a   b
0  1  22
1  1  22
2  1  22
3  2   1
4  3   1
5  4   1
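Since the stated end goal is a column c on test, the intermediate bottle frame may not be needed at all. The following is a sketch under that assumption, using only the question's data and the same groupby and transform calls:

```python
import pandas as pd

test = pd.DataFrame(
    [[1, 5], [1, 8], [1, 9], [2, 1], [3, 1], [4, 1]],
    columns=["a", "b"],
)

# transform("sum") returns one value per original row (the sum of b within
# that row's a-group), aligned to test's index, so it can be assigned directly.
test["c"] = test.groupby("a")["b"].transform("sum")

print(test["c"].tolist())  # [22, 22, 22, 1, 1, 1]
```

Because the result has the same index as test, no merge or loop is needed to line the sums up with their rows.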

The value of b in bottle is not updated because the loop never assigns to bottle itself. Instead, it only updates b on row, the temporary copy of the current row that iterrows yields.

To fix this, you can modify the code as follows:

for index, row in bottle.iterrows():
    bottle.loc[index, 'b'] = test[test['a'] == row['a']]['b'].sum()

This will update the value of b in the bottle DataFrame for the current row in the loop.
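For completeness, here is a self-contained sketch of the loop-based fix carried through to the test['c'] assignment from the question. The name loop_result is introduced only for illustration.

```python
import pandas as pd

test = pd.DataFrame(
    [[1, 5], [1, 8], [1, 9], [2, 1], [3, 1], [4, 1]],
    columns=["a", "b"],
)
bottle = pd.DataFrame().reindex_like(test)
bottle["a"] = test["a"]

for index, row in bottle.iterrows():
    # Write through bottle.loc so the value lands in the DataFrame itself
    bottle.loc[index, "b"] = test[test["a"] == row["a"]]["b"].sum()

# The assignment the question was aiming for now picks up the filled column
test["c"] = bottle["b"]
loop_result = test["c"].tolist()
print(loop_result)
```

This filters test once per row, so it does more work than the groupby approach as the data grows; it is shown mainly to confirm that the loop fix produces the intended column.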
