
Select rows from pandas dataframe with dates

Given a simple DataFrame:

df = pd.DataFrame(np.random.rand(5,3))

I can select the records with the labels 1 and 3 using

df.loc[[1,3]]

But if I alter the index so that it uses dates...

df.index = pd.date_range('1/1/2010', periods=5)

this no longer works:

df.loc[['2010-01-02', '2010-01-04']]

KeyError: "None of [['2010-01-02', '2010-01-04']] are in the [index]"

How can .loc be used with dates in this context?

One possible solution is to convert the date strings with pd.DatetimeIndex or pd.to_datetime; the lookup then works:

print(df.loc[pd.DatetimeIndex(['2010-01-02', '2010-01-04'])])

                   0         1         2
2010-01-02  0.827821  0.285281  0.781960
2010-01-04  0.872664  0.895636  0.368673

print(df.loc[pd.to_datetime(['2010-01-02', '2010-01-04'])])

                   0         1         2
2010-01-02  0.218419  0.806795  0.454356
2010-01-04  0.038826  0.741220  0.732816
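For reference, here is a minimal self-contained sketch of the conversion approach, including the imports and the setup from the question. The names `wanted` and `subset` are just illustrative, and the values will differ on every run because the data is random.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question: 5 rows, 3 columns, daily date index.
df = pd.DataFrame(np.random.rand(5, 3))
df.index = pd.date_range('1/1/2010', periods=5)

# Convert the date strings to real datetime values before passing them to .loc.
wanted = pd.to_datetime(['2010-01-02', '2010-01-04'])
subset = df.loc[wanted]

print(subset)
```

Because the labels handed to .loc are now the same type as the index, the list lookup behaves like the integer-label lookup in the question.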

You can use the boolean mask from isin:

In [151]:
df[df.index.isin(['2010-01-02', '2010-01-04'])]

Out[151]:
                   0         1         2
2010-01-02  0.939004  0.236200  0.495362
2010-01-04  0.254485  0.345047  0.273453

Unfortunately, partial datetime string matching does not currently work with a list, so either this mask or actual datetime values need to be passed.
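To illustrate the difference, here is a rough sketch that contrasts the mask with partial string indexing. It assumes the same frame as in the question. The mask is built from explicitly converted datetime values rather than raw strings, so that it does not rely on implicit string parsing. The variable names are only for illustration.

```python
import numpy as np
import pandas as pd

# Same shape of data as in the question, indexed by five consecutive days.
df = pd.DataFrame(np.random.rand(5, 3),
                  index=pd.date_range('2010-01-01', periods=5))

# Boolean mask built from explicit datetime values.
wanted = pd.to_datetime(['2010-01-02', '2010-01-04'])
masked = df[df.index.isin(wanted)]

# Partial string indexing works for a single string or a slice, not a list.
january = df.loc['2010-01']                  # every row in January 2010
window = df.loc['2010-01-02':'2010-01-04']   # slice includes both endpoints

print(masked)
print(len(january), len(window))
```

One design difference worth noting: the isin mask simply returns fewer rows when a requested date is absent from the index, whereas a list passed to .loc raises a KeyError for missing labels in recent pandas versions.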
