
Time series with appending issue in pandas data frame

While working with time series, I found some very peculiar behavior in a pandas DataFrame.

The following code works when the index is not a time series:

import pandas as pd
df = pd.DataFrame({"a":[1,2,3], "b":[31,41,51],"c":[31,52,23]}, index=["z", "y", "x"])
df1 = pd.DataFrame({"a":[41,55,16]}, index=["w", "v", "u"])
df2 = pd.DataFrame({"b":[24,3,57]}, index=["w", "v", "u"])
df3 = pd.DataFrame({"c":[111,153,123]}, index=["w", "v", "u"]) 
df = df.append(df1)
df.ix[df2.index, "b"] = df2

Output for df:

    a   b   c
z   1  31  31
y   2  41  52
x   3  51  23
w  41  24 NaN
v  55   3 NaN
u  16  57 NaN

However, this does not work when the index is datetime64[ns] or when the DataFrame is too large.

In addition, the following does work when the index is datetime64[ns]:

df = df.append(df1)
df["b"][df2.index] = df2

Why is that?

You can try fillna:

df = df.append(df1)
print(df.fillna(df2))
    a   b   c
z   1  31  31
y   2  41  52
x   3  51  23
w  41  24 NaN
v  55   3 NaN
u  16  57 NaN
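As a small illustration of what fillna does when it is given another DataFrame: it returns a new frame, and only the cells that are NaN are taken from the matching index label and column of the argument. The frame below is made up for this sketch (it is not the data from the question), and it uses a current pandas release rather than the version the post was written against.

```python
import pandas as pd

# Hypothetical frame: column "b" has one real value and two gaps.
idx = pd.date_range("2016-01-04", periods=3, freq="min")
df = pd.DataFrame(
    {"a": [41, 55, 16], "b": [float("nan"), 7.0, float("nan")]},
    index=idx,
)
df2 = pd.DataFrame({"b": [24, 3, 57]}, index=idx)

# Only the NaN cells are filled from df2; the existing 7.0 is kept.
filled = df.fillna(df2)
print(filled)

# The original frame is untouched because fillna returns a new object here.
print(df["b"].isna().sum())
```

So fillna is a good fit when the goal is to fill gaps, but it will not overwrite values that are already present; for that, a label-based assignment is needed.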

I tested it with a DatetimeIndex and it works well:

import pandas as pd

df = pd.DataFrame({"a":[1,2,3], "b":[31,41,51],"c":[31,52,23]}, index=["z", "y", "x"])
df.index = pd.date_range('20160101',periods=3,freq='T')

df1 = pd.DataFrame({"a":[41,55,16]}, index=["w", "v", "u"])
df1.index = pd.date_range('20160104',periods=3,freq='T')

df2 = pd.DataFrame({"b":[24,3,57]}, index=["w", "v", "u"])
df2.index = pd.date_range('20160104',periods=3,freq='T')

df3 = pd.DataFrame({"c":[111,153,123]}, index=["w", "v", "u"])
df3.index = pd.date_range('20160104',periods=3,freq='T')
df = df.append(df1)
print(df)
                      a   b   c
2016-01-01 00:00:00   1  31  31
2016-01-01 00:01:00   2  41  52
2016-01-01 00:02:00   3  51  23
2016-01-04 00:00:00  41 NaN NaN
2016-01-04 00:01:00  55 NaN NaN
2016-01-04 00:02:00  16 NaN NaN

print(df.fillna(df2))
                      a   b   c
2016-01-01 00:00:00   1  31  31
2016-01-01 00:01:00   2  41  52
2016-01-01 00:02:00   3  51  23
2016-01-04 00:00:00  41  24 NaN
2016-01-04 00:01:00  55   3 NaN
2016-01-04 00:02:00  16  57 NaN

df.ix[df2.index, "b"] = df2
print(df)
                      a   b   c
2016-01-01 00:00:00   1  31  31
2016-01-01 00:01:00   2  41  52
2016-01-01 00:02:00   3  51  23
2016-01-04 00:00:00  41  24 NaN
2016-01-04 00:01:00  55   3 NaN
2016-01-04 00:02:00  16  57 NaN
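Note that DataFrame.append and .ix have since been removed from pandas, so the code above only runs on old versions. A minimal sketch of the same steps on a current release, assuming pd.concat as the stand-in for append and .loc as the stand-in for .ix (the variable new_index is introduced here only for brevity):

```python
import pandas as pd

# Same data as above, with a DatetimeIndex at one-minute spacing.
df = pd.DataFrame(
    {"a": [1, 2, 3], "b": [31, 41, 51], "c": [31, 52, 23]},
    index=pd.date_range("2016-01-01", periods=3, freq="min"),
)
new_index = pd.date_range("2016-01-04", periods=3, freq="min")
df1 = pd.DataFrame({"a": [41, 55, 16]}, index=new_index)
df2 = pd.DataFrame({"b": [24, 3, 57]}, index=new_index)

# pd.concat replaces the removed DataFrame.append.
df = pd.concat([df, df1])

# .loc replaces the removed .ix. Assigning the Series df2["b"]
# lets pandas align the values on the index labels.
df.loc[df2.index, "b"] = df2["b"]
print(df)
```

A single .loc call like this is generally safer than the chained form df["b"][df2.index] = df2 from the question, because chained indexing may end up assigning to a temporary copy depending on the pandas version and settings.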
