
Grouped data analysis in Python with pandas

I have a large DataFrame. One of the columns is time (just integers representing seconds). I would like to do a groupby where each group represents, say, 2 seconds of data. Doing this would allow me to use the std or mean functions on all of the groups with one line of code. The goal is to be able to throw out time increments of data that don't meet a certain criterion. The following pseudocode hopefully represents what I want to do. Please excuse the crudeness, as I'm pretty new to pandas.

 grouped = df.groupby(df['time'])  # grouped into, say, 2-second increments
 groupStd = grouped.std()
 # drop the rows belonging to groups where groupStd > val
 # then convert back to a DataFrame after the rows have been removed

If someone could help me fill in the blanks that would be extremely helpful. Thank you!

You can try:

import pandas as pd

df = pd.DataFrame([[22, 18], [21, 23], [20, 17], [23, 45]], columns=['time', 'value'])

def sub_group_hash(x):
    # Map each time to the start of its 2-second bucket, e.g. 21 -> 20, 23 -> 22.
    return (x // 2) * 2

# Group the non-time columns by the 2-second bucket of the time column.
grouped = df.drop('time', axis=1).groupby(sub_group_hash(df['time']))
groupStd = grouped.std()
print(groupStd)
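To complete the original goal of discarding the 2-second increments whose spread is too large, you could follow the same bucketing idea with `GroupBy.filter`, which keeps only the rows of groups satisfying a predicate. This is a minimal sketch; the threshold `val` and the column names are assumptions, not from the original post.

```python
import pandas as pd

# Sample data: 'time' in seconds plus one measurement column (names assumed).
df = pd.DataFrame([[22, 18], [21, 23], [20, 17], [23, 45]],
                  columns=['time', 'value'])

val = 10  # hypothetical threshold on the per-group standard deviation

# Bin each time into its 2-second bucket, then keep only rows whose
# bucket's standard deviation of 'value' is at or below the threshold.
bins = (df['time'] // 2) * 2
filtered = df.groupby(bins).filter(lambda g: g['value'].std() <= val)
print(filtered)
```

Here the bucket starting at 22 (values 18 and 45) has a standard deviation above 10 and is dropped, while the bucket starting at 20 (values 23 and 17) is kept. `filter` returns an ordinary DataFrame, so no conversion step is needed afterwards.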
