Looping through a pandas DataFrame to replace existing values using a for loop

Question

Problem: I am trying to loop through a DataFrame, row by row, using a for loop, but it's not working as desired. I know there are iterrows() and itertuples(), but I want to experiment with a plain for loop.

Can you tell me where this is going wrong?

Sample data:

import pandas as pd

# All values are strings, so membership checks compare strings to strings
data3 = {"one":['101', '102', '103' , '104'],
     "two":['101', '105', '106', '104'],
     "three": ['102', '5', '107', '108'],
     "other": ['101', '102', '103' , '104']
     }
df3 = pd.DataFrame(data3)

Goal: check column 'two' row by row. If a value of column 'two' exists anywhere in column 'one', create a new column 'new_col' with the value 'del' for that row. If the value doesn't exist in column 'one', set 'new_col' to 'keep'. For example, if column 'two' has 101, I want to compare it with all the values of column 'one'.

My code:

dfToList1 = df3['two'].tolist()
for x in dfToList1:
   if x in df3['one'].values:
       df3['new_col'] = 'del'
   else:
       df3['new_col'] = 'keep'

Then I can replace the values in 'two' that match a value in 'one' with a string like 'none':

df3.loc[df3['new_col'] == 'del', 'two'] = 'none'

My output:

Ideally, in the 2nd and 3rd rows, the values 105 and 106 in 'two' are not present in 'one', so new_col in those rows should have the value 'keep', but that is not what I get.

    one other   three   two new_col
0   101 101     102     101     del
1   102 102       5     105     del
2   103 103     107     106     del
3   104 104     108     104     del

Expected output:

    one other   three   two  new_col
0   101 101     102     101     del
1   102 102       5     105     keep
2   103 103     107     106     keep
3   104 104     108     104     del

Answer 1

Use np.where (with NumPy imported as np):

df3['new_col'] = np.where(df3['two'].isin(df3['one']), 'del', 'keep')

Result:

   one  two three new_col
0  101  101   102     del
1  102  105     5    keep
2  103  106   107    keep
3  104  104   108     del
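
As for why the loop in the question fails: `df3['new_col'] = 'del'` assigns the scalar to every row of the column, so each iteration overwrites the whole column and only the last value of 'two' ('104', which is in 'one') decides the result. If a plain for loop is still wanted, a minimal sketch, assuming the sample data from the question, could collect one label per row and assign the list once. The names `one_values` and `labels` are illustrative and not from the original post.

```python
import pandas as pd

df3 = pd.DataFrame({
    "one": ['101', '102', '103', '104'],
    "two": ['101', '105', '106', '104'],
    "three": ['102', '5', '107', '108'],
    "other": ['101', '102', '103', '104'],
})

# Build the set of values in 'one' once, so each membership test is cheap.
one_values = set(df3['one'])

# Collect one label per row instead of assigning a scalar to the whole column.
labels = []
for x in df3['two']:
    labels.append('del' if x in one_values else 'keep')

# Assign the list once; its length matches the number of rows.
df3['new_col'] = labels

# Follow-up step from the question: replace the matched values in 'two'.
df3.loc[df3['new_col'] == 'del', 'two'] = 'none'
print(df3)
```

The vectorized np.where version above is generally the better choice on large frames; the loop is shown only because the question asks for one.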

Answer 2

Use np.where with Series.eq and Series.isin to check.

df3['newcol']=np.where(~df3.two.isin(df3.one),'keep','del')

Or, to select on column 'one' using only the values that 'two' shares with 'one' in the same row:

df3['newcol']=np.where(~df3.one.isin(df3.loc[df3.two.eq(df3.one),'two']),'keep','del')
print(df3)

   one  two three other newcol
0  101  101   102   101    del
1  102  105     5   102   keep
2  103  106   107   103   keep
3  104  104   108   104    del

Details

two_coincident_one=df3.loc[df3.two.eq(df3.one),'two']
print(two_coincident_one)
0    101
3    104
Name: two, dtype: object


~df3.one.isin(two_coincident_one)

0    False
1     True
2     True
3    False
Name: one, dtype: bool
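
The two variants in this answer give the same result on the sample data, but they are not equivalent in general: the first tests membership anywhere in column 'one', while the second only counts values where 'two' equals 'one' on the same row. A small sketch with made-up data (the frame `df` below is hypothetical, not from the question) shows the difference:

```python
import numpy as np
import pandas as pd

# Hypothetical data: every value of 'two' occurs in 'one', but never on the same row.
df = pd.DataFrame({"one": ['101', '102'], "two": ['102', '101']})

# Variant 1: membership anywhere in column 'one'.
anywhere = np.where(~df.two.isin(df.one), 'keep', 'del')

# Variant 2: only values where 'two' equals 'one' on the same row count as matches.
same_row = df.loc[df.two.eq(df.one), 'two']
row_wise = np.where(~df.one.isin(same_row), 'keep', 'del')

print(anywhere)  # ['del' 'del']
print(row_wise)  # ['keep' 'keep']
```

For the goal stated in the question (compare each value of 'two' with all values of 'one'), the first variant is the one that matches.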

Answer 1
0 2019-11-04 19:49:15

Answer 2
0 accepted 2019-11-04 19:55:41
