Python for-loop to change row values based on a condition runs correctly but does not change the values in the pandas dataframe?
I am just getting into Python, and I am trying to write a for-loop that goes over every row and, when a given condition holds, randomly selects two columns in that row and changes their values. The for-loop runs without any problems; however, the results don't change in the dataframe.
A reproducible example:
import pandas as pd

df = pd.DataFrame({'A': [10, 40, 10, 20, 10],
                   'B': [10, 10, 50, 40, 50],
                   'C': [10, 20, 10, 10, 10],
                   'D': [10, 30, 10, 10, 50],
                   'E': [10, 10, 40, 10, 10],
                   'F': [2, 3, 2, 2, 3]})
df:
    A   B   C   D   E  F
0  10  10  10  10  10  2
1  40  10  20  30  10  3
2  10  50  10  10  40  2
3  20  40  10  10  10  2
4  10  50  10  50  10  3
This is my for-loop. It iterates over all rows and checks whether the value in column F equals 2; if so, it randomly selects two columns whose value is 10 and adds 100 to them (so they become 110).
for index, i in df.iterrows():
    if i['F'] == 2:
        i[i==10].sample(2, axis=0)+100
        print(i[i==10].sample(2, axis=0)+100)
This is the output of the loop:
E 110
C 110
Name: 0, dtype: int64
C 110
D 110
Name: 2, dtype: int64
C 110
D 110
Name: 3, dtype: int64
This is what the dataframe is expected to look like:
df:
    A   B    C    D    E  F
0  10  10  110   10  110  2
1  40  10   20   30   10  3
2  10  50  110  110   40  2
3  20  40  110  110   10  2
4  10  50   10   50   10  3
However, the columns of the dataframe are not changed. Any idea what's going wrong?
This line:
i[i==10].sample(2, axis=0)+100
.sample returns a new object (a Series here, because i is a single row), and adding 100 to it produces yet another new object that is never assigned to anything. The original dataframe (df) is therefore not updated at all.
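To make the point above concrete, here is a minimal sketch on a single hand-built row. The names row and result, and the fixed random_state, are illustrative assumptions and are not part of the original question.

```python
import pandas as pd

# A stand-in for one row yielded by df.iterrows() (hypothetical values).
row = pd.Series({'A': 10, 'B': 10, 'C': 10, 'F': 2})

# sample() picks two entries and "+ 100" builds a brand-new Series.
# Unless that Series is assigned somewhere, it is simply discarded.
result = row[row == 10].sample(2, random_state=0) + 100

print(result)  # two entries, both equal to 110
print(row)     # still 10, 10, 10, 2: the source row was never modified
```

The sampled values only survive if they are written back explicitly, for example through df.loc on the dataframe itself.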
Try this:
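for index, i in df.iterrows():
    if i['F'] == 2:
        cond = (i == 10)
        # You can only sample 2 columns if at least
        # 2 columns in this row meet the condition
        if cond.sum() >= 2:
            idx = i[cond].sample(2).index
            # Write through df.loc: the row i yielded by iterrows()
            # may be a copy, so changing i alone may not reach df
            df.loc[index, idx] += 100
            print(df.loc[index, idx])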
You should not modify the original df in place while iterating over it. Make a copy, iterate over the original, and write the changes to the copy:
df2 = df.copy()
for index, i in df.iterrows():
    if i['F'] == 2:
        s = i[i==10].sample(2, axis=0)+100
        df2.loc[index, i.index.isin(s.index)] = s
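Putting the two answers together, one possible end-to-end sketch looks like the following. The function name bump_random_tens, its parameters, and the seed argument are hypothetical helpers introduced here for illustration; the flag column is also excluded from the candidates, which the original code did not do explicitly.

```python
import numpy as np
import pandas as pd

def bump_random_tens(df, flag_col='F', flag=2, target=10, n=2, add=100, seed=None):
    """Return a copy of df where, in rows with df[flag_col] == flag,
    n randomly chosen columns equal to target are increased by add."""
    out = df.copy()                      # never write to the frame being iterated
    rng = np.random.RandomState(seed)    # seed only for reproducible demos
    value_cols = df.columns.drop(flag_col)
    for index, row in df.iterrows():
        if row[flag_col] != flag:
            continue
        candidates = row[value_cols]
        candidates = candidates[candidates == target]
        if len(candidates) < n:          # sample() would raise otherwise
            continue
        chosen = candidates.sample(n, random_state=rng).index
        out.loc[index, chosen] += add    # write back by label
    return out

df = pd.DataFrame({'A': [10, 40, 10, 20, 10],
                   'B': [10, 10, 50, 40, 50],
                   'C': [10, 20, 10, 10, 10],
                   'D': [10, 30, 10, 10, 50],
                   'E': [10, 10, 40, 10, 10],
                   'F': [2, 3, 2, 2, 3]})

out = bump_random_tens(df, seed=0)
print(out)
```

Which two columns get changed in each row depends on the random draw, so the result will match the expected dataframe in shape but not necessarily in the exact columns chosen.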