Is there a way to iterate through a list of strings in a dataframe?
I wrote the following code. I want to replace the number "1" with "0" whenever it appears twice or more for a particular universal_id, and the single "1" that is left should be in the row where days is lowest. The code below does the job, but I want to iterate over more than one universal_id. Column "e" is correct for 'erfa'; I want to do the same for the other IDs and the other columns.
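import numpy as np
import pandas as pd

pdf1 = pd.DataFrame(
    [[1, 0, 1, 0, 1, 60, 'fdaf'],
     [1, 1, 0, 0, 1, 350, 'fdaf'],
     [1, 1, 0, 0, 1, 420, 'erfa'],
     [0, 1, 0, 0, 1, 410, 'erfa']],
    columns=['A', 'B', 'c', 'd', 'e', 'days', 'universal_id'])
# Column A: keep 1 only on the row with the overall lowest days
pdf1['A'] = np.where(pdf1['days'] == pdf1['days'].min(), 1, 0)
# Rows of 'erfa' where e == 1; .copy() avoids assigning into a slice of pdf1
zet = pdf1.loc[pdf1['e'].isin([1]) &
               pdf1['universal_id'].str.contains('erfa')].copy()
# Column e: keep 1 only on the 'erfa' row with the lowest days
zet['e'] = np.where(zet['days'] == zet['days'].min(), 1, 0)
# Write the modified rows back into the original frame
pdf1.loc[zet.index, :] = zet[:]
pdf1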
Output:
   A  B  c  d  e  days universal_id
0  1  0  1  0  1    60         fdaf
1  0  1  0  0  1   350         fdaf
2  0  1  0  0  0   420         erfa
3  0  1  0  0  1   410         erfa
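You can use:
# Applied to pdf1 as originally constructed (before the assignments in the question)
# Sort by days so the first row seen for each ID is the one with the lowest days
df2 = pdf1.sort_values(by='days')
# Rows where A is 1
m1 = df2['A'].eq(1)
# Rows whose (A, universal_id) pair already appeared earlier in the sorted order
m2 = df2[['A', 'universal_id']].duplicated()
# Zero out every repeated 1; the first one per ID is kept
pdf1.loc[m1 & m2, 'A'] = 0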
Output:
   A  B  c  d  e  days universal_id
0  1  0  1  0  1    60         fdaf
1  0  1  0  0  1   350         fdaf
2  1  1  0  0  1   420         erfa
3  0  1  0  0  1   410         erfa
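To see why only row 1 changes, it can help to inspect the combined mask before assigning. The following is a minimal sketch that rebuilds the sample frame from the question; the name `flagged` is introduced here for illustration only. It relies on my understanding that `.loc` aligns a boolean Series on index labels, which is why a mask built on the sorted copy can be applied to the unsorted frame.

```python
import pandas as pd

# Sample frame from the question, as originally constructed
pdf1 = pd.DataFrame(
    [[1, 0, 1, 0, 1, 60, 'fdaf'],
     [1, 1, 0, 0, 1, 350, 'fdaf'],
     [1, 1, 0, 0, 1, 420, 'erfa'],
     [0, 1, 0, 0, 1, 410, 'erfa']],
    columns=['A', 'B', 'c', 'd', 'e', 'days', 'universal_id'])

df2 = pdf1.sort_values(by='days')
m1 = df2['A'].eq(1)                            # rows where A is 1
m2 = df2[['A', 'universal_id']].duplicated()   # repeats of an (A, ID) pair, in days order

# Index labels of the rows that would be set to 0
flagged = (m1 & m2)[m1 & m2].index.tolist()
print(flagged)  # [1]
```

Row 1 is the second 'fdaf' row with A equal to 1 in days order, so it is the only repeat. The 'erfa' rows hold different values of A (0 and 1), so neither counts as a duplicate.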
For e, if you want to follow the same logic:
# Rows where e is 1
m1 = df2['e'].eq(1)
# Rows whose (e, universal_id) pair already appeared earlier in the sorted order
m3 = df2[['e', 'universal_id']].duplicated()
pdf1.loc[m1 & m3, 'e'] = 0
Output:
   A  B  c  d  e  days universal_id
0  1  0  1  0  1    60         fdaf
1  0  1  0  0  0   350         fdaf
2  1  1  0  0  0   420         erfa
3  0  1  0  0  1   410         erfa
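Since the question asks for the same treatment across several columns and all IDs, the per-column steps above can be wrapped in a loop. This is a sketch rather than a definitive implementation: the helper name `keep_lowest_day_flag` and its parameters are hypothetical, and it assumes the flag columns only ever hold 0 or 1.

```python
import pandas as pd

def keep_lowest_day_flag(df, flag_cols, id_col='universal_id', order_col='days'):
    """Hypothetical helper: for each flag column, keep at most one 1 per ID,
    on the row with the lowest value of order_col. Returns a new frame."""
    out = df.copy()
    # Sorted copy: the first row seen for each ID has the lowest order_col
    ordered = df.sort_values(by=order_col, kind='stable')
    for col in flag_cols:
        is_one = ordered[col].eq(1)
        # A row is a repeat if its (flag, ID) pair appeared earlier in sorted order
        repeat = is_one & ordered.duplicated(subset=[col, id_col])
        # Select by index label so the row order of `out` does not matter
        out.loc[repeat[repeat].index, col] = 0
    return out

# Sample frame from the question, as originally constructed
pdf1 = pd.DataFrame(
    [[1, 0, 1, 0, 1, 60, 'fdaf'],
     [1, 1, 0, 0, 1, 350, 'fdaf'],
     [1, 1, 0, 0, 1, 420, 'erfa'],
     [0, 1, 0, 0, 1, 410, 'erfa']],
    columns=['A', 'B', 'c', 'd', 'e', 'days', 'universal_id'])

result = keep_lowest_day_flag(pdf1, ['A', 'B', 'c', 'd', 'e'])
print(result)
```

On the sample data, columns A and e of `result` match the two outputs shown above. No explicit loop over IDs is needed, because `duplicated` already treats each (flag, universal_id) pair separately.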