
How to join or merge in Python

We have two DataFrames. We want all the columns from Dataframe1, but just one column (target_name) from Dataframe2. However, when we do this, it gives us duplicate values.

Dataframe1 values :

    user_id subject_id  x               y           w           h       g
0   858580  23224814    58.133331   57.466675   181.000000  42.000000   1
1   858580  23224814    293.133331  176.466675  80.000000   34.000000   2
2   313344  28539152    834.049316  37.493195   63.005920   36.444595   1
3   313344  28539152    104.003235  45.072937   242.956024  26.754082   2
4   313344  28539152    635.436829  80.038574   108.716065  35.240089   3
5   313344  28539152    351.910156  80.162117   201.371887  32.738373   4
6   861687  28539165    125.313393  39.836521   231.202873  43.087811   1
7   861687  28539165    623.450500  44.040207   151.332825  34.680435   2
8   1254304 28539165    128.893204  45.765110   225.686691  35.547726   1

Dataframe2 Values :

    Unnamed: 0  user_id subject_id  good    x   y   w   h   T0  T1  T2  T3  T4  T5  T6  T7  T8  target_name target_name_length  target_name3
0   0   858580  23224814    1   58.133331   57.466675   181.000000  42.000000   NaN 1801    No, there are still more names to be marked Male    1881    John Abbott NaN NaN NaN John Abbott 11  John Abbott
1   1   858580  23224814    1   293.133331  176.466675  80.000000   34.000000   NaN NaN Yes, I've marked all the names  Female  NaN NaN Edith Joynt Edith Abbot NaN Edith Joynt 11  Edith Joynt
2   2   340348  30629031    1   152.968750  26.000000   224.000000  41.000000   NaN 1852    No, there are still more names to be marked Male    1924    William Sparrow NaN NaN NaN William Sparrow 15  William Sparrow
3   3   340348  30629031    1   497.968750  325.000000  87.000000   29.000000   NaN NaN Yes, I've marked all the names  Female  NaN NaN Minnie  NaN NaN Minnie  6   Minnie
4   4   340348  28613182    1   103.968750  31.000000   162.000000  38.000000   NaN 1819    No, there are still more names to be marked Male    1876    Albert [unclear]Gles[/unclear]  NaN NaN NaN Albert Gles 30  Albert Gles
5   5   340348  28613182    1   107.968750  76.000000   72.000000   25.000000   NaN 1819    Yes, I've marked all the names  Female  1884    NaN Eliza [unclear]Gles[/unclear]   NaN NaN Eliza Gles  29  Eliza Gles
6   6   340348  30628864    1   172.968750  29.000000   192.000000  41.000000   NaN 1840    No, there are still more names to be marked Male    1918    John Slaltery   NaN NaN NaN John Slaltery   13  John Slaltery
7   7   340348  30628864    1   115.968750  214.000000  149.000000  31.000000   NaN NaN No, there are still more names to be marked Male    NaN [unclear]P.[/unclear] Slaltery  NaN NaN NaN P. Slaltery 30  unclear]P. Slaltery
8   8   340348  30628864    1   537.968750  218.000000  64.000000   26.000000   NaN NaN Yes, I've marked all the names  Female  1901    NaN Elizabeth Slaltery  NaN NaN Elizabeth Slaltery  18  Elizabeth Slaltery

Here is the code we are trying to use:

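If you simply want to copy the target_name column into dataframe1, without matching on any key:

dataframe1['target_name'] = dataframe2['target_name']

This assignment aligns rows by index label, not by user_id or any other column. It is only correct if both DataFrames have the same number of rows, share the same index, and are in the same row order, e.g. both sorted by a common column such as user_id, which is found in both DataFrames.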
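If the rows of the two DataFrames do not line up one to one, a key-based merge may be a safer route. The sketch below is one possible approach, not the asker's original code: it uses small stand-in frames built from the first rows shown in the question, and it assumes that user_id, subject_id, x and y together identify a single row (the `keys` list is that assumption, so adjust it to your data). Duplicated rows after a merge usually mean the key is repeated on the right-hand side, so the sketch keeps only the key columns plus target_name and drops repeated keys before merging.

```python
import pandas as pd

# Stand-in data copied from the first rows of the question (columns trimmed).
dataframe1 = pd.DataFrame({
    "user_id": [858580, 858580, 313344],
    "subject_id": [23224814, 23224814, 28539152],
    "x": [58.133331, 293.133331, 834.049316],
    "y": [57.466675, 176.466675, 37.493195],
    "g": [1, 2, 1],
})
dataframe2 = pd.DataFrame({
    "user_id": [858580, 858580, 340348],
    "subject_id": [23224814, 23224814, 30629031],
    "x": [58.133331, 293.133331, 152.968750],
    "y": [57.466675, 176.466675, 26.000000],
    "target_name": ["John Abbott", "Edith Joynt", "William Sparrow"],
})

# Assumption: these columns together identify exactly one row.
keys = ["user_id", "subject_id", "x", "y"]

# Keep only the keys and the wanted column, and drop repeated keys so the
# merge cannot multiply rows of dataframe1.
right = dataframe2[keys + ["target_name"]].drop_duplicates(subset=keys)

# Left merge keeps every row of dataframe1; validate raises an error if the
# right-hand side still has duplicate keys.
merged = dataframe1.merge(right, on=keys, how="left", validate="many_to_one")
print(merged)
```

Rows of dataframe1 with no match in dataframe2 get NaN in target_name. Note that merging on float columns such as x and y only matches when the values are exactly equal, so if the coordinates were computed separately in each frame, rounding them first or merging on a dedicated ID column may be more reliable.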


 