Select rows with the same date from a pandas dataframe

I have a pandas dataframe df

df
                   StartDate                    EndDate  Value  \
0 2015-03-25 12:25:43.999994 2015-03-25 13:23:43.979992      0
1 2015-03-25 13:23:43.999998 2015-03-25 13:24:43.979998      1
2 2015-03-25 13:24:43.999994 2015-03-25 13:25:43.979995      0
3 2015-03-26 13:25:44.000001 2015-03-26 13:47:43.979996      0
4 2015-03-26 13:47:43.999992 2015-03-26 13:48:43.979993      1
5 2015-03-26 13:48:43.999999 2015-03-26 14:25:43.980001      0
6 2015-03-27 14:25:43.999997 2015-03-27 15:25:43.979998      0
7 2015-03-27 15:25:43.999994 2015-03-27 15:28:43.979997      0
8 2015-03-27 15:28:43.999993 2015-03-27 15:29:43.979994      1
9 2015-03-27 15:29:44.000000 2015-03-27 15:59:43.979997      0

and I would like to perform some operation day by day. To do that, I want to extract a sub-dataframe containing only the rows belonging to the first day, then one with the rows of the second day, and so on.

I was planning to use a for loop and, at each iteration, select the rows of a particular day.

First I compute the unique days

unique_days = df['StartDate'].map(lambda t: t.date()).unique()

and then start the loop...

# for each day compute operation 
for i in unique_days:
    print(i)
    df_day = df[df['StartDate'].map(lambda t: t.date()) == i]

    df2 = func(df_day,parameters)

I think the best approach is to group by the date with groupby and then either call an aggregation such as mean or sum, or use apply with a custom function:

df1 = df.groupby(df['StartDate'].dt.date).mean()

df2 = df.groupby(df['StartDate'].dt.date).apply(func)
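If you really do want the per-day sub-dataframes themselves, as in the loop from the question, note that a groupby object can be iterated directly. A minimal sketch, assuming a shortened stand-in for the question's dataframe (the values and the sub_frames dict are made up for illustration):

```python
import pandas as pd

# Shortened, hypothetical stand-in for the dataframe in the question.
df = pd.DataFrame({
    "StartDate": pd.to_datetime([
        "2015-03-25 12:25:43", "2015-03-25 13:23:43",
        "2015-03-26 13:25:44",
        "2015-03-27 14:25:43", "2015-03-27 15:25:43",
    ]),
    "Value": [0, 1, 0, 0, 1],
})

# Iterating a groupby yields (key, sub-dataframe) pairs, one pair per day.
sub_frames = {}
for day, df_day in df.groupby(df["StartDate"].dt.date):
    sub_frames[day] = df_day      # df_day holds only the rows of that day
    print(day, len(df_day))
```

This replaces both the unique() call and the boolean mask inside the loop, since groupby splits the rows once instead of re-scanning the column for every day.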

Sample:

import pandas as pd

# sample function: shift each group's StartDate back by `parameters` days
def func(df_day, parameters):
    # print each group
    print(df_day)

    return df_day['StartDate'] - pd.Timedelta(parameters, unit='d')

df2 = df.groupby(df['StartDate'].dt.date).apply(lambda x: func(x, 1))
#less readable
#df2 = df.groupby(df['StartDate'].dt.date).apply(func, 1)
print (df2)
StartDate    
2015-03-25  0   2015-03-24 12:25:43.999994
            1   2015-03-24 13:23:43.999998
            2   2015-03-24 13:24:43.999994
2015-03-26  3   2015-03-25 13:25:44.000001
            4   2015-03-25 13:47:43.999992
            5   2015-03-25 13:48:43.999999
2015-03-27  6   2015-03-26 14:25:43.999997
            7   2015-03-26 15:25:43.999994
            8   2015-03-26 15:28:43.999993
            9   2015-03-26 15:29:44.000000
Name: StartDate, dtype: datetime64[ns]
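For the plain aggregation case (mean, sum) no custom function is needed. A small sketch under the same assumptions as the question, with a shortened made-up dataframe; here I group on dt.normalize() instead of dt.date, which is my own substitution so that the resulting index stays datetime64 rather than Python date objects:

```python
import pandas as pd

# Shortened, hypothetical stand-in for the dataframe in the question.
df = pd.DataFrame({
    "StartDate": pd.to_datetime([
        "2015-03-25 12:25:43", "2015-03-25 13:23:43",
        "2015-03-26 13:25:44",
        "2015-03-27 14:25:43", "2015-03-27 15:25:43",
    ]),
    "Value": [0, 1, 0, 0, 1],
})

# normalize() sets the time part to midnight, so all rows of one day share a key.
day_key = df["StartDate"].dt.normalize()

# Sum of the Value column per day.
daily_sum = df.groupby(day_key)["Value"].sum()
print(daily_sum)
```

Grouping on dt.date works just as well here; the only difference is the type of the index in the result.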
