I have 2 CSV files:
CSV 1 - original_names.csv
Serial,Names
1,James
2,Stephen
3,Ben
4,Harry
5,Jack
6, Peter
CSV 2 - dup_names.csv
Serial,Names
1,James
2,Kate
3,Ben
4,Sara
Desired Output - new.csv
Serial,Names,flag
1,0,T
2,Kate,F
3,0,T
4,Sara,F
5,Jack,F
6,Peter,F
As you can see, where a name is the same in both CSVs, it should be replaced with 0 in new.csv (and flagged T).
This is what I've tried:
import pandas as pd
df1 = pd.read_csv('original_names.csv')
df2 = pd.read_csv('dup_names.csv')
out = df1.merge(df2['Names'], how='inner', on='Names')
# some code
out.to_csv("new.csv", index=False)
Thank you for your time:)
Do an outer join, then add some logic. If the two name columns match, put a 'T' flag in, else put 'F'. Then set 'names' to 0 where the flag is 'T', else to the name from the second CSV. If there is no name in the second CSV, fill in the name from the first CSV.
import pandas as pd
import numpy as np
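df1 = pd.DataFrame({'serial': [1, 2, 3, 4, 5, 6],
                    'names': ['James', 'Stephen', 'Ben', 'Harry', 'Jack', 'Peter']})
df2 = pd.DataFrame({'serial': [1, 2, 3, 4],
                    'names': ['James', 'Kate', 'Ben', 'Sara']})
# Outer join on serial; the name columns become names_x (df1) and names_y (df2)
out = df1.merge(df2, how='outer', on='serial')
# 'T' where both files have the same name for that serial, else 'F'
out['flag'] = np.where(out.names_x == out.names_y, 'T', 'F')
# 0 for matches, otherwise the name from the second file
out['names'] = np.where(out.flag == 'T', 0, out.names_y)
# Serials missing from the second file fall back to the first file's name
out['names'] = out['names'].fillna(out.names_x)
out = out[['serial', 'names', 'flag']]
out.to_csv("new.csv", index=False)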
Output:
print(out)
serial names flag
0 1 0 T
1 2 Kate F
2 3 0 T
3 4 Sara F
4 5 Jack F
5 6 Peter F
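To see why the fillna step is needed, it can help to look at the merged frame before any logic is applied. This is only a small sketch built on the same sample data; the variable names `merged` and `missing` are mine, not part of the answer above.

```python
import pandas as pd

df1 = pd.DataFrame({'serial': [1, 2, 3, 4, 5, 6],
                    'names': ['James', 'Stephen', 'Ben', 'Harry', 'Jack', 'Peter']})
df2 = pd.DataFrame({'serial': [1, 2, 3, 4],
                    'names': ['James', 'Kate', 'Ben', 'Sara']})

# Both frames have a 'names' column, so merge adds the default
# suffixes: names_x comes from df1, names_y from df2.
merged = df1.merge(df2, how='outer', on='serial')
print(merged)

# Serials that exist only in df1 have NaN in names_y; these are the
# rows that fillna(out.names_x) later fills in.
missing = merged.loc[merged['names_y'].isna(), 'serial'].tolist()
print(missing)
```

With this data, serials 5 and 6 are the ones with no match in the second frame, which is why Jack and Peter come from the first file in the final result.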
You could use:
import pandas as pd
import numpy as np
df1 = pd.read_csv('original_names.csv')
df2 = pd.read_csv('dup_names.csv')
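# Left join on Serial; the name columns become Names_x (df1) and Names_y (df2)
out = df1.merge(df2, how='left', on='Serial')
# 0 where the names match, otherwise the name from the second file
out['Names'] = np.where(out['Names_x'] == out['Names_y'],
                        0, out['Names_y'])
# Serials missing from the second file keep the first file's name
out['Names'] = out['Names'].fillna(out['Names_x'])
out['flag'] = np.where(out['Names'] == 0, 'T', 'F')
out = out.drop(['Names_x', 'Names_y'], axis=1)
out.to_csv('new.csv', index=False)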
Output:
Serial Names flag
0 1 0 T
1 2 Kate F
2 3 0 T
3 4 Sara F
4 5 Jack F
5 6 Peter F
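One thing to watch: the sample original_names.csv in the question has a space after the comma in the `6, Peter` row. If the real file looks like that, read_csv would keep the leading space in the name. Below is a rough, self-contained sketch of the same left-join approach that reads the sample data from in-memory strings and passes `skipinitialspace=True`. The helper name `flag_duplicates` and the inline CSV strings are my own, just for illustration.

```python
import io

import numpy as np
import pandas as pd

# Inline copies of the two files from the question, including the
# stray space in the "6, Peter" row.
original_csv = "Serial,Names\n1,James\n2,Stephen\n3,Ben\n4,Harry\n5,Jack\n6, Peter\n"
dup_csv = "Serial,Names\n1,James\n2,Kate\n3,Ben\n4,Sara\n"


def flag_duplicates(left, right):
    """Hypothetical helper: 0/'T' where names match per Serial, else the
    name from `right` (or from `left` if `right` has no such Serial)."""
    out = left.merge(right, how='left', on='Serial')
    same = out['Names_x'] == out['Names_y']
    out['Names'] = np.where(same, 0, out['Names_y'])
    out['Names'] = out['Names'].fillna(out['Names_x'])
    out['flag'] = np.where(same, 'T', 'F')
    return out[['Serial', 'Names', 'flag']]


# skipinitialspace=True drops spaces that directly follow a delimiter.
df1 = pd.read_csv(io.StringIO(original_csv), skipinitialspace=True)
df2 = pd.read_csv(io.StringIO(dup_csv), skipinitialspace=True)

result = flag_duplicates(df1, df2)
csv_text = result.to_csv(index=False)
print(csv_text)
```

With real files you would pass the file names to read_csv instead of the StringIO objects and write the result with `result.to_csv('new.csv', index=False)`.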