Replace values of one column for each value of another column in pandas
I have a CSV file shown below:
ID,Number,Value
61745,three,11
61745,one,11
61745,one & two,12
61745,two,13
61743,one,41
61743,two,42
61741,one,21
61741,one & two,22
61715,one,31
61715,two,32
61715,three,33
What I am trying to achieve:
For each ID, if the Number column contains "one & two", I want every Number value for that ID that is "one" or "two" to be replaced with "one & two". For example, the ID 61745 has the "one & two" value at least once, but the ID 61743 does not. So I want to return the following:
ID,Number,Value
61745,three,11
61745,one & two,11
61745,one & two,12
61745,one & two,13
61743,one,41
61743,two,42
61741,one & two,21
61741,one & two,22
61715,one,31
61715,two,32
61715,three,33
So far, I have tried this:
import pandas as pd
import os
import csv
import time
import dateutil.parser as dparser
import datetime
df = pd.read_csv("slack.csv")
for row in df.itertuples():
    if row[2] == "one & two":
        df.ix[df.Number.isin(['one & two','one','two']), 'Number'] = 'one & two'
The result is that the script replaces every "one" and "two" value in the Number column, regardless of ID:
ID Number Value
0 61745 three 11
1 61745 one & two 11
2 61745 one & two 12
3 61745 one & two 13
4 61743 one & two 41
5 61743 one & two 42
6 61741 one & two 21
7 61741 one & two 22
8 61715 one & two 31
9 61715 one & two 32
10 61715 three 33
Use a custom function with groupby that checks whether at least one value in the group is "one & two" and, if so, replaces the values using a dict:
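def f(x):
    # x is the Number column for a single ID
    d = {'one': 'one & two', 'two': 'one & two'}
    if x.eq('one & two').any():
        return x.replace(d)
    else:
        return x

# transform keeps the original row index, so the result aligns with df
# (apply can prepend the group key to the index in recent pandas versions)
df['Number'] = df.groupby('ID')['Number'].transform(f)
print(df)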
ID Number Value
0 61745 three 11
1 61745 one & two 11
2 61745 one & two 12
3 61745 one & two 13
4 61743 one 41
5 61743 two 42
6 61741 one & two 21
7 61741 one & two 22
8 61715 one 31
9 61715 two 32
10 61715 three 33
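The same per-ID check can also be written without a custom function. The following is only a sketch of that idea, not part of the original answer: it builds a boolean mask with `transform('any')` and assigns through `.loc`. The names `CSV_TEXT`, `has_combined` and `to_replace` are made up for illustration, and the data is inlined with `io.StringIO` instead of being read from `slack.csv` so the snippet runs on its own.

```python
import io

import pandas as pd

# Sample data from the question, inlined so the example is self-contained.
CSV_TEXT = """ID,Number,Value
61745,three,11
61745,one,11
61745,one & two,12
61745,two,13
61743,one,41
61743,two,42
61741,one,21
61741,one & two,22
61715,one,31
61715,two,32
61715,three,33
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))

# True for every row whose ID group contains 'one & two' at least once.
has_combined = df['Number'].eq('one & two').groupby(df['ID']).transform('any')

# Within those groups, overwrite only the plain 'one' / 'two' rows.
to_replace = has_combined & df['Number'].isin(['one', 'two'])
df.loc[to_replace, 'Number'] = 'one & two'

print(df)
```

Because the mask is computed once for the whole column, this avoids calling a Python function per group, which may matter on larger files.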
Replace this line:
df.ix[df.Number.isin(['one & two','one','two']), 'Number'] = 'one & two'
with the following:
ids = df.ID[df.Number == 'one & two'].unique()
df.loc[df.ID.isin(ids) & df.Number.isin(['one', 'two']), 'Number'] = 'one & two'
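As a rough end-to-end check of this approach, the two lines can be wrapped in a small function and run on a reduced sample. The function name `merge_one_two` and the four-row `sample` frame are assumptions made for this sketch, not something from the original post. Note that it uses `.loc` throughout, since `.ix` was removed in pandas 1.0.

```python
import pandas as pd


def merge_one_two(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy where 'one'/'two' become 'one & two' for qualifying IDs."""
    out = df.copy()
    # IDs that have at least one 'one & two' row.
    ids = out.loc[out['Number'] == 'one & two', 'ID'].unique()
    # Rows that belong to those IDs and hold a plain 'one' or 'two'.
    mask = out['ID'].isin(ids) & out['Number'].isin(['one', 'two'])
    out.loc[mask, 'Number'] = 'one & two'
    return out


# Hypothetical reduced sample: one ID with 'one & two', one without.
sample = pd.DataFrame({
    'ID': [61745, 61745, 61743, 61743],
    'Number': ['one', 'one & two', 'one', 'two'],
    'Value': [11, 12, 41, 42],
})

result = merge_one_two(sample)
print(result['Number'].tolist())
# ['one & two', 'one & two', 'one', 'two']
```

Working on a copy is a design choice for the sketch; the original two lines modify `df` in place, which is equally valid if that is what you want.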