I have a DataFrame with segments, timestamps, and several value columns:
Segment  Timestamp   Value1  Value2  Value2_mean
0        2018-11...  180     156     135
0                    170     140     135
0                                    135
1
1
...
I want to group this DataFrame by 'Segment' and, for each segment, find the first Timestamp at which the interval condition below is met, then get the time interval in seconds for that segment. Because the function needs values from more than one column, I don't think aggregate works here.
Value2_mean - std(Value2) <= Value1 <= Value2_mean + std(Value2)
It should look like this:
Segment  Interval[s]
0        10
1        19
2        6
3        ...
I tried something like this:
import numpy as np

grouped = dataSeg.groupby(['Segment'])

def grouping(df):
    a = np.array(df['Value1'])
    b = np.array(df['Value2'])
    c = np.array(df['Value2_mean'])
    d = np.array(df['Timestamp'])
    for x in a:
        # Compare one Value1 entry against the band of every row
        categories = np.logical_and(
            (c - np.std(b) <= x),
            (c + np.std(b) >= x))
        if np.any(categories):
            return d[categories] - d[0]

grouped.apply(grouping)
This does not work the way I want it to. Any suggestions would be appreciated!
Something like this? I didn't test it thoroughly.
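def calc(grp):
    # True if any row in the segment has |Value1 - Value2_mean| < std(Value2)
    if grp.Value1.sub(grp.Value2_mean).abs().lt(grp.Value2.std()).any():
        # Duration of the whole segment: last timestamp minus first
        return grp["Timestamp"].iloc[-1] - grp["Timestamp"].iloc[0]
    return np.nan

df.groupby("Segment").apply(calc)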
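If what you actually want is the time from the start of each segment until the first row that satisfies the condition (that is how I read d[categories] - d[0] in your attempt, so this is an assumption), a variant could look roughly like this. The sample rows and the helper name seconds_until_in_band are made up for illustration. Note that pandas .std() uses ddof=1 while np.std defaults to ddof=0, so the band width differs slightly between the two.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the values are invented for illustration only.
df = pd.DataFrame({
    "Segment": [0, 0, 0, 1, 1, 1],
    "Timestamp": pd.to_datetime([
        "2018-11-01 00:00:00", "2018-11-01 00:00:04", "2018-11-01 00:00:10",
        "2018-11-01 00:01:00", "2018-11-01 00:01:07", "2018-11-01 00:01:19",
    ]),
    "Value1": [180, 170, 140, 50, 60, 95],
    "Value2": [156, 140, 109, 90, 100, 110],
    "Value2_mean": [135, 135, 135, 100, 100, 100],
})

def seconds_until_in_band(grp):
    # Band half-width: sample standard deviation (ddof=1) of Value2 in this segment
    std = grp["Value2"].std()
    # Row-wise check of Value2_mean - std <= Value1 <= Value2_mean + std
    in_band = grp["Value1"].sub(grp["Value2_mean"]).abs().le(std)
    if not in_band.any():
        return np.nan
    # Assumption: the interval runs from the segment's first timestamp
    # to the first timestamp at which the condition holds
    first_hit = grp.loc[in_band, "Timestamp"].iloc[0]
    return (first_hit - grp["Timestamp"].iloc[0]).total_seconds()

result = (
    df.groupby("Segment")[["Timestamp", "Value1", "Value2", "Value2_mean"]]
      .apply(seconds_until_in_band)
      .rename("Interval[s]")
)
print(result)
```

With these sample rows the result is 10.0 for segment 0 and 19.0 for segment 1. If you want the time from the first hit to the end of the segment instead, swap the subtraction for grp["Timestamp"].iloc[-1] - first_hit.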