
Combining columns from different dataframes into a new dataframe, plus a bonus filtering question

I'm trying to create a new dataframe from two other dataframes, and I think the indexing is tripping me up. From what I have been reading it might be a chained-operations issue, but the suggested fix is to use iloc, which I did, and I'm still seeing the problem.

I have the original dataframe, sorted by its date index:

df.head()

                           open        high         low       close   volume   returns  returns_final
Datetime
2020-07-06 09:30:00  255.337982  261.950012  253.208786  261.421997  6592145 -6.084015              1
2020-07-06 11:00:00  261.526001  268.399994  261.239990  266.275452  4955678 -4.749451              1
2020-07-06 12:30:00  266.269043  266.989990  264.200012  265.191986  2002640  1.077057             -1
2020-07-06 14:00:00  265.185455  269.558014  261.597992  268.513763  3303263 -3.328308              1
2020-07-06 15:30:00  268.528015  275.558014  268.096008  274.200012  2583149 -5.671997              1

I created some filters to build the new dataframes:

# Creating filter for time frame
df_inc = df.filter(like='09:30', axis=0)
df_inc_11 = df.filter(like='11:00', axis=0)

I am having a really hard time combining the two frames. I'm pretty sure the indexing is causing all the problems.

newer = df_inc.filter(['open','close'], axis=1)

newer.head()
                           open       close
Datetime
2020-07-06 09:30:00  255.337982  261.421997
2020-07-07 09:30:00  281.002014  277.621979
2020-07-08 09:30:00  281.000000  278.865784
2020-07-09 09:30:00  279.398010  272.015991
2020-07-10 09:30:00  278.220367  283.506012

Now I'm trying to add one column from the other dataframe:

df_inc_11.iloc[:, 3:4].head()

                          close
Datetime
2020-07-06 11:00:00  266.275452
2020-07-07 11:00:00  278.123718
2020-07-08 11:00:00  278.633118
2020-07-09 11:00:00  274.414978
2020-07-10 11:00:00  282.440613

newer['new_close'] = df_inc_11.iloc[:, 3:4]

newer.head()
                           open       close  new_close
Datetime
2020-07-06 09:30:00  255.337982  261.421997        NaN
2020-07-07 09:30:00  281.002014  277.621979        NaN
2020-07-08 09:30:00  281.000000  278.865784        NaN
2020-07-09 09:30:00  279.398010  272.015991        NaN
2020-07-10 09:30:00  278.220367  283.506012        NaN

I also tried dropping the index of the second frame before copying it over, but no luck. I keep getting NaN.

df_inc_11 = df_inc_11.reset_index(drop=True)

Any idea how I can fix the NaN?

On a side note, is there a better way to combine both of these searches into one filter? I'm thinking that might sort out the indexing issue.

# Creating filter for time frame
df_inc = df.filter(like='09:30', axis=0)
df_inc_11 = df.filter(like='11:00', axis=0) 

Try:

newer['new_close'] = df_inc_11.close.values
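
The reason this works is index alignment: when a Series or DataFrame is assigned to a column, pandas matches rows by index label, and no 11:00 label ever equals a 09:30 label, so every row comes back NaN. Resetting the index on only one frame does not help either, because 0, 1, 2, ... still does not match the datetime labels of `newer`. Taking `.values` (or `.to_numpy()`) strips the index, so the assignment becomes positional. Below is a minimal sketch of both behaviours. It is an illustration under assumptions, not the asker's exact code: the toy frame holds only `close` prices copied from the question, `at_time` stands in for the `filter(like=...)` calls, and the column name `aligned` is made up for the demo.

```python
import pandas as pd

# Toy frame: close prices copied from the question, two bars per day.
idx = pd.to_datetime([
    "2020-07-06 09:30:00", "2020-07-06 11:00:00",
    "2020-07-07 09:30:00", "2020-07-07 11:00:00",
])
df = pd.DataFrame(
    {"close": [261.421997, 266.275452, 277.621979, 278.123718]},
    index=idx,
)
df.index.name = "Datetime"

# at_time picks the rows at a given time of day (stand-in for filter(like=...)).
df_inc = df.at_time("09:30")
df_inc_11 = df.at_time("11:00")

newer = df_inc[["close"]].copy()

# Label-based assignment: the 11:00 labels never match the 09:30 labels,
# so every value in this column is NaN.
newer["aligned"] = df_inc_11["close"]

# Positional assignment: to_numpy() (like .values) drops the index.
newer["new_close"] = df_inc_11["close"].to_numpy()

print(newer)
```

Note that the positional version assumes both frames have the same number of rows in the same day order. If one day is missing its 11:00 bar, the lengths differ and the assignment raises a ValueError rather than silently misaligning.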

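On the side question about combining the two time filters, one possible approach is to give both selections a shared key, the calendar date, and join on it. That keeps label alignment working for you, and a day with a missing bar shows up as NaN instead of shifting rows. The sketch below assumes a DatetimeIndex as in the question; the helper `bars_at` and the `_0930` / `_1100` suffixes are hypothetical names chosen for illustration, and the toy frame again holds only `close` values copied from the post.

```python
import pandas as pd

# Toy frame: close prices copied from the question, two bars per day.
idx = pd.to_datetime([
    "2020-07-06 09:30:00", "2020-07-06 11:00:00",
    "2020-07-07 09:30:00", "2020-07-07 11:00:00",
])
df = pd.DataFrame(
    {"close": [261.421997, 266.275452, 277.621979, 278.123718]},
    index=idx,
)


def bars_at(frame, time_str, suffix):
    """Rows at one time of day, re-indexed by calendar date (hypothetical helper)."""
    out = frame.at_time(time_str).copy()
    # normalize() sets the time part to midnight, leaving one label per day.
    out.index = out.index.normalize()
    return out.add_suffix(suffix)


# Both pieces now share the same date labels, so join aligns them row by row.
combined = bars_at(df, "09:30", "_0930").join(bars_at(df, "11:00", "_1100"))

print(combined)
```

With the real data you would select the columns you need first, for example `df[['open', 'close']]` for the 09:30 bars and `df[['close']]` for the 11:00 bars, before passing them to the helper.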