
How to add a new row for each distinct ID in pandas?

I have a dataframe like this:

import pandas as pd
df = pd.DataFrame({
    'Car_ID': ['B332', 'B332', 'B332', 'C315', 'C315', 'C315', 'C315', 'C315', 'F310', 'F310'],
    'Date': ['2018-03-15', '2018-03-14', '2018-03-12', '2018-03-15', '2018-03-14', '2018-03-13',
             '2018-03-12', '2018-03-11', '2018-03-10', '2018-03-09'],
    'Driver': ['Alex', 'Alex', 'Alex', 'Sara', 'Sara', 'Sara', 'Sara', 'Sara', 'Franck', 'Franck'],
    'Info': ["Group_B", "Group_B", "Group_B", "Group_C", "Group_C", "Group_C", "Group_C", "Group_C", "Group_F", "Group_F"],
})
df

    Car_ID  Date        Driver  Info
0   B332    2018-03-15  Alex    Group_B
1   B332    2018-03-14  Alex    Group_B
2   B332    2018-03-12  Alex    Group_B
3   C315    2018-03-15  Sara    Group_C
4   C315    2018-03-14  Sara    Group_C
5   C315    2018-03-13  Sara    Group_C
6   C315    2018-03-12  Sara    Group_C
7   C315    2018-03-11  Sara    Group_C
8   F310    2018-03-10  Franck  Group_F
9   F310    2018-03-09  Franck  Group_F

I want to add a new row after the last row of each distinct Car_ID, copying that last row but with Info set to "Changed", like this:

    Car_ID  Date        Driver  Info
0   B332    2018-03-15  Alex    Group_B
1   B332    2018-03-14  Alex    Group_B
2   B332    2018-03-12  Alex    Group_B
3   B332    2018-03-12  Alex    Changed
4   C315    2018-03-15  Sara    Group_C
5   C315    2018-03-14  Sara    Group_C
6   C315    2018-03-13  Sara    Group_C
7   C315    2018-03-12  Sara    Group_C
8   C315    2018-03-11  Sara    Group_C
9   C315    2018-03-11  Sara    Changed
10  F310    2018-03-10  Franck  Group_F
11  F310    2018-03-09  Franck  Group_F
12  F310    2018-03-09  Franck  Changed

How could I do this with shift()?

Thanks

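Inserting rows one at a time is expensive. Instead, you can build the extra rows with groupby + last, concatenate the two dataframes, and then sort_values. Use a stable sort so that each new row lands after the existing rows of its group (the default quicksort does not guarantee this):

df_last = df.groupby('Car_ID', as_index=False).last().assign(Info='Changed')

# kind='mergesort' is stable: rows with equal Car_ID keep their concat order,
# so each 'Changed' row ends up after the original rows of its group.
res = (
    pd.concat([df, df_last], ignore_index=True)
      .sort_values('Car_ID', kind='mergesort')
      .reset_index(drop=True)
)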

print(res)

   Car_ID        Date  Driver     Info
0    B332  2018-03-15    Alex  Group_B
1    B332  2018-03-14    Alex  Group_B
2    B332  2018-03-12    Alex  Group_B
3    B332  2018-03-12    Alex  Changed
4    C315  2018-03-15    Sara  Group_C
5    C315  2018-03-14    Sara  Group_C
6    C315  2018-03-13    Sara  Group_C
7    C315  2018-03-12    Sara  Group_C
8    C315  2018-03-11    Sara  Group_C
9    C315  2018-03-11    Sara  Changed
10   F310  2018-03-10  Franck  Group_F
11   F310  2018-03-09  Franck  Group_F
12   F310  2018-03-09  Franck  Changed
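
Since the question asks about shift(), here is a minimal sketch of a variant that uses it to find the last row of each group. It is not part of the answer above. It assumes the rows of each Car_ID are contiguous and that the dataframe has a unique index (such as the default RangeIndex). The names is_last, changed and res_shift are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'Car_ID': ['B332', 'B332', 'B332', 'C315', 'C315', 'C315', 'C315', 'C315', 'F310', 'F310'],
    'Date': ['2018-03-15', '2018-03-14', '2018-03-12', '2018-03-15', '2018-03-14', '2018-03-13',
             '2018-03-12', '2018-03-11', '2018-03-10', '2018-03-09'],
    'Driver': ['Alex', 'Alex', 'Alex', 'Sara', 'Sara', 'Sara', 'Sara', 'Sara', 'Franck', 'Franck'],
    'Info': ['Group_B', 'Group_B', 'Group_B', 'Group_C', 'Group_C', 'Group_C', 'Group_C', 'Group_C',
             'Group_F', 'Group_F'],
})

# True on the last row of each consecutive run of the same Car_ID:
# the next row's Car_ID differs, or there is no next row (shift gives NaN).
is_last = df['Car_ID'] != df['Car_ID'].shift(-1)

# Copy those rows, keeping their original index labels, and flag them.
changed = df[is_last].assign(Info='Changed')

# A stable sort on the index puts each copy directly after the row it came from,
# because the originals come first in the concatenation.
res_shift = (
    pd.concat([df, changed])
      .sort_index(kind='mergesort')
      .reset_index(drop=True)
)

print(res_shift)
```

Because this sorts on the original index rather than on Car_ID, the groups stay in the order they appear in df, and the copied rows are the actual last rows rather than the per-column last non-null values that groupby().last() returns.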


 