I have the following data frame:
df1:
Name Tis Exr Name_2 Exr_2
A1FH derm 3.4 GHJK brn:2.4
N4RT lng 0.1 PP2DS Lvr:3.4;hup:2.3
GHJK Pap 2.2 KLM3 tet:2.0
4HHR stm 1.4 LSDR NaN
PP2DS skl 3.7 PMRT van:3.7;epth:23.5
LSDR lym 2.1 exty NaN
2BC4 lym 4.4 NaN NaN
Essentially columns "Tis" and "Exr" refer to column "Name", while column "Exr_2" refers to column "Name_2".
I am trying to rearrange the dataframe so that whenever a value in column "Name" matches a value in column "Name_2", the two are placed on the same row, together with the data in their associated columns. Rows that have no match are kept, with NaN in the columns of the missing side. I would like the result in alphabetical order.
Desired output:
df2:
Name Tis Exr Name_2 Exr_2
GHJK Pap 2.2 GHJK brn:2.4
LSDR lym 2.1 LSDR NaN
PP2DS skl 3.7 PP2DS Lvr:3.4;hup:2.3
2BC4 lym 4.4 NaN NaN
4HHR stm 1.4 NaN NaN
A1FH derm 3.4 NaN NaN
NaN NaN NaN exty NaN
NaN NaN NaN KLM3 tet:2.0
N4RT lng 0.1 NaN NaN
NaN NaN NaN PMRT van:3.7;epth:23.5
I have tried a number of different things. First, sorting on both name columns:
df1 = pd.read_csv('dataset.csv', error_bad_lines=False, sep='\t')
df2 = df1.sort_values(['Name', 'Name_2'], ascending=[False, True])
I also tried comparing the two columns directly:
df1[df1.Name == df1.Name_2]
I have also tried various tools on the Linux command line, but pandas seems a better fit since I am more familiar with Python.
The dataframe has over 41,000 rows.
You can split the data into two separate dataframes and use an outer df.merge to match the names.
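# Left side: the columns that describe "Name", sorted alphabetically.
df2 = df1[['Name', 'Tis', 'Exr']].sort_values('Name')
# Right side: the columns that describe "Name_2".
df_temp = df1[['Name_2', 'Exr_2']]
# An outer merge keeps matched and unmatched rows from both sides.
df2 = df2.merge(df_temp, left_on='Name', right_on='Name_2', how='outer')
print(df2)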
Output
Name Tis Exr Name_2 Exr_2
0 2BC4 lym 4.4 NaN NaN
1 4HHR stm 1.4 NaN NaN
2 A1FH derm 3.4 NaN NaN
3 GHJK Pap 2.2 GHJK brn:2.4
4 LSDR lym 2.1 LSDR NaN
5 N4RT lng 0.1 NaN NaN
6 PP2DS skl 3.7 PP2DS Lvr:3.4;hup:2.3
7 NaN NaN NaN KLM3 tet:2.0
8 NaN NaN NaN PMRT van:3.7;epth:23.5
9 NaN NaN NaN exty NaN
10 NaN NaN NaN NaN NaN
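The all-NaN row 10 appears to come from the 2BC4 row, whose Name_2 is NaN: the outer merge carries that empty right-hand row through as an unmatched entry. Below is a minimal sketch of one way to avoid it and to get the ordering shown in the question's desired output (matched rows first, then the rest, alphabetical ignoring case). The sample frame is rebuilt from the question's data, and the helper columns `_key` and `_unmatched` are names made up for this illustration.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (df1).
df1 = pd.DataFrame({
    "Name": ["A1FH", "N4RT", "GHJK", "4HHR", "PP2DS", "LSDR", "2BC4"],
    "Tis": ["derm", "lng", "Pap", "stm", "skl", "lym", "lym"],
    "Exr": [3.4, 0.1, 2.2, 1.4, 3.7, 2.1, 4.4],
    "Name_2": ["GHJK", "PP2DS", "KLM3", "LSDR", "PMRT", "exty", np.nan],
    "Exr_2": ["brn:2.4", "Lvr:3.4;hup:2.3", "tet:2.0", np.nan,
              "van:3.7;epth:23.5", np.nan, np.nan],
})

left = df1[["Name", "Tis", "Exr"]]
# Drop rows with no Name_2 so the outer merge cannot yield an all-NaN row.
right = df1[["Name_2", "Exr_2"]].dropna(subset=["Name_2"])

# indicator=True adds a "_merge" column: "both", "left_only" or "right_only".
merged = left.merge(right, left_on="Name", right_on="Name_2",
                    how="outer", indicator=True)

# Hypothetical helper columns used only for sorting:
# _key is whichever name is present, lower-cased for a case-insensitive sort;
# _unmatched is False for matched rows, so they sort first.
merged["_key"] = merged["Name"].fillna(merged["Name_2"]).str.lower()
merged["_unmatched"] = merged["_merge"] != "both"

result = (merged.sort_values(["_unmatched", "_key"])
                .drop(columns=["_merge", "_key", "_unmatched"])
                .reset_index(drop=True))
print(result)
```

Sorting explicitly after the merge also means the result does not depend on the row order the merge itself happens to return, which may differ between pandas versions.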