
How to combine multiple rows into one row

In a pandas DataFrame, I have multiple rows per MemberID, and some columns are null in some of those rows, as shown below:

   Date   MemberID    Name      Education      Occupation    Gender
0  2017/01  001         A          Nan            Student      M
1  2017/02  001         A          Graduate         Nan        M
2  2017/03  001         A          Nan            Physician    M
3  2017/01  002         B          College          Nan        F
4  2017/02  002         B          Nan            Professor    Nan
5  2017/03  002         B          PHD              Nan        F

I would like to clean the data so that each MemberID has a single row, with every null value filled by the latest non-null information for that MemberID. The desired output is:

   Date    MemberID    Name    Education      Occupation    Gender
0  2017/03   001         A      Graduate       Physician       M
1  2017/03   002         B      PHD            Professor       F

Thanks.

You can use .groupby and .last. For each group, .last returns the last non-null value in every column:

df.groupby('MemberID').last()

output:

            Date    Name    Education   Occupation  Gender
MemberID                    
     1     2017/03    A      Graduate   Physician      M
     2     2017/03    B           PHD   Professor      F
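
For reference, here is a minimal, self-contained sketch of the same approach built on the sample data from the question. It makes two assumptions that the question does not confirm: that the missing values are stored as the literal string "Nan" (so they must be converted to real NaN before .last can skip them), and that MemberID is stored as a string. The sort step is a precaution so that "last" means "latest Date" even if the rows are not already in chronological order.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "Date": ["2017/01", "2017/02", "2017/03", "2017/01", "2017/02", "2017/03"],
    "MemberID": ["001", "001", "001", "002", "002", "002"],
    "Name": ["A", "A", "A", "B", "B", "B"],
    "Education": ["Nan", "Graduate", "Nan", "College", "Nan", "PHD"],
    "Occupation": ["Student", "Nan", "Physician", "Nan", "Professor", "Nan"],
    "Gender": ["M", "M", "M", "F", "Nan", "F"],
})

# Assumption: missing values are the literal string "Nan".
# Convert them to real NaN so that .last() treats them as missing.
df = df.replace("Nan", np.nan)

# Sort so that the last row of each group is the most recent one.
# The "YYYY/MM" strings sort chronologically as plain text.
df = df.sort_values(["MemberID", "Date"])

# For each MemberID, take the last non-null value of every column.
# as_index=False keeps MemberID as a regular column instead of the index.
result = df.groupby("MemberID", as_index=False).last()
print(result)
```

With this data, each MemberID collapses to one row dated 2017/03, with Education, Occupation and Gender taken from the most recent row in which each was present. If the columns already hold real NaN values, the replace step can be dropped.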

