Python DataFrame: Map two columns based on a condition?

I have a df with three columns: no, name, and date. For rows that share the same no and date, I want an output column holding a list of all their name values.

Ex: rows 0 and 1 both have no 1234 and date 2, so output is [a,b] for both.

If no other row shares a row's no and date, the list should contain only that row's own name value.

      no       name   date
0   1234        a       2
1   1234        b       2
2   1234        c       3
3   456         d       1
4   456         e       2
5   789         f       5

Expected output:

      no       name   date    output
0   1234        a       2     [a,b]
1   1234        b       2     [a,b]
2   1234        c       3      [c]
3   456         d       1      [d]
4   456         e       2      [e]
5   789         f       5      [f]
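For anyone who wants to reproduce this, the sample frame can be built roughly as below. This is only a sketch: the integer dtypes for no and date are an assumption, since the question does not state them, and the `expected` list is just the desired output column written out by hand.

```python
import pandas as pd

# Sample data from the question (integer dtypes for no/date are assumed).
df = pd.DataFrame({
    "no": [1234, 1234, 1234, 456, 456, 789],
    "name": ["a", "b", "c", "d", "e", "f"],
    "date": [2, 2, 3, 1, 2, 5],
})

# Desired content of the output column, row by row.
expected = [["a", "b"], ["a", "b"], ["c"], ["d"], ["e"], ["f"]]
```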

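Another solution would be to combine groupby and transform with sum, then apply list. Summing strings concatenates them and list then splits the result into characters, so this only gives the right answer when every name is a single character. You'd have to check the performance though.

df['output'] = df.groupby(['no', 'date'])['name'].transform('sum').apply(list)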

     no name date  output
0  1234    a    2  [a, b]
1  1234    b    2  [a, b]
2  1234    c    3     [c]
3   456    d    1     [d]
4   456    e    2     [e]
5   789    f    5     [f]
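One thing to watch with this approach: it relies on string concatenation, so names longer than one character get split into letters. A small sketch with made-up names (`df2`, "ann", "bob" and "cy" are hypothetical, not from the question) shows the difference against aggregating with list:

```python
import pandas as pd

# Hypothetical data with multi-character names; object dtype is set explicitly.
df2 = pd.DataFrame({
    "no": [1234, 1234, 456],
    "name": pd.Series(["ann", "bob", "cy"], dtype=object),
    "date": [2, 2, 1],
})
grouped = df2.groupby(["no", "date"])["name"]

# Concatenate-then-split: "ann" + "bob" -> "annbob" -> individual characters.
split_chars = grouped.transform("sum").apply(list)

# Aggregating with list keeps each name whole (one row per group).
whole_names = grouped.agg(list)

print(split_chars.tolist())
print(whole_names.tolist())
```

So the sum-based one-liner is fine for the single-letter names in the question, but for real names an aggregation with list is probably the safer choice.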

You can try groupby.agg(list) to get the list for each combination of no and date, then merge to assign back:

df.merge(df.groupby(['no','date'])['name']
           .agg(list).rename('output'),
         on=['no', 'date']
        )

Output:

     no name  date  output
0  1234    a     2  [a, b]
1  1234    b     2  [a, b]
2  1234    c     3     [c]
3   456    d     1     [d]
4   456    e     2     [e]
5   789    f     5     [f]
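If the original index matters (merge returns a frame with a fresh 0..n-1 index), a possible variant is DataFrame.join with on=, which looks the key columns up in the index of the aggregated Series and keeps the caller's index. A minimal sketch, assuming the sample data from the question; the names `lists` and `result` are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "no": [1234, 1234, 1234, 456, 456, 789],
    "name": ["a", "b", "c", "d", "e", "f"],
    "date": [2, 2, 3, 1, 2, 5],
})

# One list of names per (no, date) pair, indexed by a MultiIndex.
lists = df.groupby(["no", "date"])["name"].agg(list).rename("output")

# Match df's key columns against that MultiIndex; df's own index is kept.
result = df.join(lists, on=["no", "date"])
print(result)
```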
