I have a dataframe. I would like to extract features based on a time window.
import pandas as pd

df = pd.DataFrame({'time':[1,2,3,4,5,6,7,8,9,10,2,3,5,6,8,10,12],
                   'id':[793,793,793,793,793,793,793,793,793,793,942,942,942,942,942,942,942],
                   'B1':[10,20,30,40,50,60,70,80,90,100,23,24,25,27,30,44,55],
                   'B2':[10,20,30,40,50,60,70,80,90,100,23,24,25,27,30,44,55],
                   'B3':[10,20,30,40,50,60,70,80,90,100,23,24,25,27,30,44,55]})
time_window = pd.DataFrame({'time':[2,4,6,8,5,8], 'id':[793,793,793,793,942,942]})
Here, my time windows are:
[2,4] --> for participant 793
[6,8] --> for participant 793
[5,8] --> for participant 942
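To make the pairing explicit, here is a minimal sketch that reshapes 'time_window' into one row per window. It assumes the rows always come in consecutive (start, end) pairs for the same participant; the names 'starts', 'ends' and 'windows' are only illustrative and not part of the original code.

```python
import pandas as pd

time_window = pd.DataFrame({'time': [2, 4, 6, 8, 5, 8],
                            'id': [793, 793, 793, 793, 942, 942]})

# Assumption: even positions hold window starts, odd positions hold the matching ends.
starts = time_window.iloc[0::2].reset_index(drop=True)
ends = time_window.iloc[1::2].reset_index(drop=True)

# Sanity check: both rows of a pair must belong to the same participant.
assert (starts['id'] == ends['id']).all()

# One row per window: participant id, start time, end time.
windows = pd.DataFrame({'id': starts['id'],
                        'start': starts['time'],
                        'end': ends['time']})
print(windows)
```

Each row of 'windows' then corresponds to one of the three windows listed above.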
My goal is to extract the features within the specified time windows for each participant, so I wrote a function:
from tsfresh import extract_features
def apply_tsfresh(col):
    for i in range(len(time)):
        col.loc[time_window[i]:time_window[i+1]] = extract_features(col.loc[time_window[i]:time_window[i+1]], column_id="id")
    return col
extracted_freatures = df.set_index('time').apply(apply_tsfresh)
It is supposed to extract the features based on the specified time window for each participant. However, I am not getting any results; it raises an error.
Could you please help me here? I am totally out of ideas.
My desired output should look like this: [image: desired result]
*The extracted features may be more than just two, and their values may differ. This is only an example.
First, an empty dataframe 'extracted_freatures_' is created. A loop then runs over 'time_window' in steps of two, taking each pair of values from the 'time' column as the start and end of a window. The result of each 'extract_features' call is appended to the 'extracted_freatures_' dataframe. Don't ask me how 'tsfresh' works internally, I don't know.
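extracted_freatures_ = pd.DataFrame()
df = df.set_index('time')  # index by time so that .loc selects on time values

# time_window holds (start, end) pairs, so step through it two rows at a time
for i in range(0, len(time_window['time']), 2):
    ind1 = time_window.loc[i, 'time']    # window start
    ind2 = time_window.loc[i+1, 'time']  # window end
    # df.loc[[ind1, ind2]] picks the rows whose time equals ind1 or ind2, for every id
    a = extract_features(df.loc[[ind1, ind2]], column_id="id")
    extracted_freatures_ = pd.concat([extracted_freatures_, a])

print(extracted_freatures_)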
Output
Feature Extraction: 100%|██████████| 6/6 [00:00<00:00, 36.71it/s]
Feature Extraction: 100%|██████████| 6/6 [00:00<00:00, 39.50it/s]
Feature Extraction: 100%|██████████| 6/6 [00:00<00:00, 40.81it/s]
     B2__variance_larger_than_standard_deviation  ...  B3__mean_n_absolute_max__number_of_maxima_7
793                                          1.0  ...                                          NaN
942                                          0.0  ...                                          NaN
793                                          1.0  ...                                          NaN
942                                          1.0  ...                                          NaN
793                                          1.0  ...                                          NaN
942                                          1.0  ...                                          NaN

[6 rows x 2367 columns]
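Note that the output has a row for both participants in every window, because selecting with a list of two labels returns only the rows whose time equals one of the two endpoints, for every id. If what you want is all rows between the start and the end, for the window's own participant only, the selection could be sketched as follows. This is a rough sketch, not a tested tsfresh recipe: it assumes inclusive endpoints, keeps 'time' as an ordinary column instead of the index, and 'window_slices' is a made-up helper name.

```python
import pandas as pd

df = pd.DataFrame({'time': [1,2,3,4,5,6,7,8,9,10,2,3,5,6,8,10,12],
                   'id': [793]*10 + [942]*7,
                   'B1': [10,20,30,40,50,60,70,80,90,100,23,24,25,27,30,44,55],
                   'B2': [10,20,30,40,50,60,70,80,90,100,23,24,25,27,30,44,55],
                   'B3': [10,20,30,40,50,60,70,80,90,100,23,24,25,27,30,44,55]})
time_window = pd.DataFrame({'time': [2, 4, 6, 8, 5, 8],
                            'id': [793, 793, 793, 793, 942, 942]})

def window_slices(df, time_window):
    """Yield (id, start, end, rows) for each consecutive (start, end) pair.

    Hypothetical helper: assumes time_window has a default 0..n-1 index and
    that rows i and i+1 describe one window of the same participant.
    """
    for i in range(0, len(time_window), 2):
        pid = time_window.loc[i, 'id']
        start = time_window.loc[i, 'time']
        end = time_window.loc[i + 1, 'time']
        # Keep only this participant's rows with start <= time <= end.
        mask = (df['id'] == pid) & df['time'].between(start, end)
        yield pid, start, end, df[mask]

slices = list(window_slices(df, time_window))
for pid, start, end, rows in slices:
    print(pid, start, end, rows['time'].tolist())
```

Each 'rows' frame could then be passed to extract_features(rows, column_id="id", column_sort="time"). Since participant 793 has two windows, you would probably also want to record the start and end next to each result so the rows stay distinguishable after concatenation. Keep in mind that some features may still come out as NaN on windows this short.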