简体   繁体   中英

Insert Missing Months rows in the dataframe in python

I have the following dataframe:

Input:-

ID month   Name
A1 2017.01 A
A1 2017.02 B
A1 2017.04 C
A2 2017.02 A
A2 2017.03 D
A2 2017.05 C

Output:-

ID month   Name
A1 2017.01 A
A1 2017.02 B
A1 2017.03 B
A1 2017.04 C
A2 2017.02 A
A2 2017.03 D
A2 2017.04 D
A2 2017.05 C

I require to get the missing months in the sequence and the value of the month preceding it and which is present in the input list. Consider example of ID "A1". "A1" has months 1,2,4 and has missing month 3. So i need to add the row with value "A1" as ID, month as "2017.03" and Name as "B". Please note the "Name" column should get its value from the row immediately above it that is present in the input.

How do I achieve this in pandas, or by any other method in python.

Any help is appreciated! Thanks

Let's try this with @EFT's suggestion:

df['Date'] = pd.to_datetime(df.month.astype(str),format='%Y.%m')
df_out = df.set_index('Date').groupby('ID').resample('MS').asfreq().ffill().reset_index(level=0, drop=True)
df_out = df_out.reset_index()
df_out['month'] = df_out.Date.dt.strftime('%Y.%m')
df_out = df_out.drop('Date',axis=1)
print(df_out)

Output:

   ID    month Name
0  A1  2017.01    A
1  A1  2017.02    B
2  A1  2017.03    B
3  A1  2017.04    C
4  A2  2017.02    A
5  A2  2017.03    D
6  A2  2017.04    D
7  A2  2017.05    C

There was a question in the comments about how df knows which column to ffill, and I just decided to go through it and post it here, maybe someone finds it useful (or I use it as a reference for myself):

mytest = pd.DataFrame({'ID': ['A1', 'A1', 'A1', 'A2', 'A2', 'A2'], 'month': ['2017.01', '2017.02', '2017.04', '2017.02', '2017.03', '2017.05'], 'Name':['A','B','C','A','D','C']})
mytest.month = pd.to_datetime(mytest.month)
mytest=mytest.set_index('month').groupby(['ID'])
mytest = mytest.resample('MS').asfreq()['Name']
mytest = pd.DataFrame(pd.DataFrame(mytest).to_records())
mytest.Name = mytest.Name.ffill()
mytest

Obviously outputs a very similar thing, I just have not formatted months back to the original format.

    ID  month   Name
0   A1  2017-01-01  A
1   A1  2017-02-01  B
2   A1  2017-03-01  B
3   A1  2017-04-01  C
4   A2  2017-02-01  A
5   A2  2017-03-01  D
6   A2  2017-04-01  D
7   A2  2017-05-01  C

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM