简体   繁体   中英

Concatenate a blank df to a list of dataframes

I have the following data:

import pandas as pd
import numpy as np
from numpy import rec, nan
    
df1=pd.DataFrame.from_records(rec.array([(202001L, 2020L, 'apples', 'CA', 100L),(202002L, 2020L, 'apples', 'CA', 150L),(202001L, 2020L, 'apples', 'USA', 400L),(202002L, 2020L, 'apples', 'USA', 675L),        ], 
 dtype=[('yyyymm', '<i8'), ('year', '<i8'), ('prod', 'O'), ('country', 'O'), ('rev', '<i8')]))
    
df2=df1[['yyyymm','prod','country','rev']].sort_values(['yyyymm','prod','country'])
df2['yyyymm'] = df2['yyyymm'].astype(str)
        
twenty_eight_days=['02']
thirty_days=['09','04','06','11']
thirty_one_days=['01','03','05','07','08','10','12']
        
df2['mth_nbr']=df2['yyyymm'].str[-2:]
df2['year_nbr']=df2['yyyymm'].str[:4]
df2.loc[df2['mth_nbr'].isin(twenty_eight_days), 'last_day']= '28'
df2.loc[df2['mth_nbr'].isin(thirty_days), 'last_day'] = '30'
df2.loc[df2['mth_nbr'].isin(thirty_one_days), 'last_day'] = '31'
        df2['date']=pd.to_datetime(df2['year_nbr']+df2['mth_nbr']+df2['last_day'])
df2.set_index(df2['date'],inplace=True)
df2.drop(['yyyymm','mth_nbr','year_nbr','last_day','date'], axis=1, inplace=True)  
df2

在此处输入图像描述

I break out df2 by country: df2_CA and df2_USA. I want to attach a blank dataframe called df_forecast to each of df2_CA & df2_USA so I can forecast:

# 1. : Create 2 new dfs by country: df2_CA and df2_USA:
#######################################################
for x in df2['country'].unique():
    locals()['df2_' + x ] = df2[(df2['country'] == x ) ]   

    
#2. Create a forecasting dataframe:
###################################
index = pd.date_range('2020-03-30', periods=3, freq='M')
columns = ['prod','country','rev']
df_forecast = pd.DataFrame(index=index, columns=columns)


#3. Append df_forecast to each of df2_CA, df2_USA:
#################################################

df_list=[df2_CA, df2_USA]

for x in df_list:
    x=pd.concat([x, df_forecast]) 
    x['prod'].fillna(method='ffill',inplace=True)     
    x['country'].fillna(method='ffill',inplace=True)
    x.loc['2020-03':'2020-05', 'rev'] = 200

However if I look at df2_CA, df_forecast is not appending:

在此处输入图像描述

When I code df2_CA (and df2_USA) separately I get the desired result:

在此处输入图像描述

What is wrong with the for-loop in step 3 above?

Have you made sure all dataframes have the same columns?

Because your dataframe is blank (ie just header no rows) the concatenation has nothing to add to the existing dataframe.

You could add a first row with NaN values for instance and then edit these values later on.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM