
data segmentation time series

How can I separate the sequences which represent important data from the unimportant ones?

Some background and an example: as can be seen in the data plot (see the figure link below), there are 9 segments in this time series, which was recorded with an IMU (it measures acceleration along x, y, z and orientation, i.e. rotation around x, y, z). The figure and the data can be found at these links:

Figure: Snipping gesture acceleration x,y,z

Data on which the plot is based: Data.csv

In this case the segments represent a snipping movement of the right hand. Between consecutive signals there is a delay of 2-3 seconds. This delay can also be extended.

Which approach is the easiest for segmenting the data? Where can I find examples, or could you give me a simple one? What I want to find out: where are the starting points of the relevant signals? Approaches I have considered so far:

  • Anomaly detection: I have already implemented this, but its predictions are very vague (I haven't optimized it yet). I probably need better features than just the raw data. I'm asking this question because there might be simpler methods.
  • K-means clustering: I thought about this as well, but how do I approach it? Are there examples?
  • Frequency domain analysis: segment the raw data into overlapping frames (size 100) and transform these into the frequency domain. Which features could I use? I thought about signal energy.
  • Other approaches?
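Two of the ideas above, per-frame signal energy and clustering, can be combined into a very small baseline. The following is only a sketch under assumptions: the frame size of 100 comes from the question, while the hop of 50, the synthetic test signal, and all function names are made up for illustration. It uses only the Python standard library; NumPy would make it shorter and faster. A one-dimensional 2-means on the frame energies picks the rest/active threshold automatically.

```python
import math
import random


def frame_energy(signal, frame_size=100, hop=50):
    """Mean squared amplitude of each overlapping frame, after removing the frame mean."""
    energies = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = signal[start:start + frame_size]
        mean = sum(frame) / frame_size
        energies.append(sum((x - mean) ** 2 for x in frame) / frame_size)
    return energies


def two_means_threshold(values, iterations=50):
    """1-D k-means with k=2; returns the midpoint between the two centroids."""
    low, high = min(values), max(values)
    for _ in range(iterations):
        split = (low + high) / 2
        quiet = [v for v in values if v <= split]
        active = [v for v in values if v > split]
        if not quiet or not active:
            break
        new_low, new_high = sum(quiet) / len(quiet), sum(active) / len(active)
        if new_low == low and new_high == high:
            break
        low, high = new_low, new_high
    return (low + high) / 2


def segment_starts(signal, frame_size=100, hop=50):
    """Sample indices of frames where the energy rises from 'quiet' to 'active'."""
    energies = frame_energy(signal, frame_size, hop)
    threshold = two_means_threshold(energies)
    active = [e > threshold for e in energies]
    starts = []
    for i, is_active in enumerate(active):
        if is_active and (i == 0 or not active[i - 1]):
            starts.append(i * hop)
    return starts


# Synthetic stand-in for one acceleration axis (an assumption, not the real
# Data.csv): three bursts beginning at samples 300, 800 and 1300.
random.seed(0)
signal = []
for _ in range(3):
    signal += [random.gauss(0, 0.02) for _ in range(300)]            # rest
    signal += [math.sin(2 * math.pi * n / 25) for n in range(200)]   # gesture
signal += [random.gauss(0, 0.02) for _ in range(300)]

starts = segment_starts(signal)
print(starts)
```

The reported start positions are only accurate to within one hop, because a frame that straddles the onset may or may not already count as active. For three-axis data, one option is to sum the per-axis energies before clustering.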

Afterwards these segments will be used as training examples for a gesture classifier.

[Optional additional information: Recording environment: hand hangs loose -> gesture is performed -> hand hangs loose -> wait for 5 s in the loose-arm position -> [next iteration of recording a gesture]. Another important condition is that I need to segment different kinds of gestures (the signals look different): not only a snipping gesture, but also swipe up, swipe down, or thumbs up.]

Thank you very much in advance :)

greets Max

One possible solution is to use DBSCAN (example here) or DTW (example here).
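For readers unfamiliar with DTW (dynamic time warping), here is a minimal sketch of the textbook distance computation. It is not taken from the linked example; the function name and the toy signals are assumptions for illustration. The idea would be to compare a known gesture template against candidate windows: a window containing the same shape at a different speed gets a small distance, while a rest period does not.

```python
import math


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping with absolute-difference cost."""
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # a[i-1] matched again
                                    cost[i][j - 1],      # b[j-1] matched again
                                    cost[i - 1][j - 1])  # both advance
    return cost[len(a)][len(b)]


# Toy data (assumed): one template, the same shape performed more slowly,
# and a flat rest period.
template = [math.sin(2 * math.pi * n / 40) for n in range(40)]
slower = [math.sin(2 * math.pi * n / 60) for n in range(60)]
rest = [0.0] * 60

print(dtw_distance(template, slower) < dtw_distance(template, rest))  # → True
```

This approach needs at least one hand-labelled template per gesture type, so it fits best once a few segments have been cut out by some other means.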

You can also do it manually (for example, with a MATLAB m-file). Here axaR is your accelerometer signal (along the x-axis), k is the window size in samples, and threshold is a manually adjustable value.

figure(100)
lw = 1.5;   % line width (undefined in the original snippet; assumed value)
k = 200;    % window size in samples

for fig = 1:5
    threshold = 20*fig*std(axaR);
    fprintf('window size %d, threshold is %f\n', k, threshold)
    c = zeros(length(axaR)-k, 1);   % 1 where the window counts as active
    for i = 1:(length(axaR)-k)
        % sum of absolute values over the k samples following sample i
        summa = sum(abs(axaR(i+1:i+k)));
        c(i) = summa > threshold;
    end

    subplot(5,1,fig)
    plot(axaR, 'LineWidth', lw), hold on
    plot(1:k, ones(k,1), '*r')      % red markers show the window length
    plot(c, 'LineWidth', lw)
    xlim([0,5000])
    % sprintf instead of strcat, which strips the trailing spaces
    title(sprintf('threshold %g, window size %d', threshold, k))
end

Matlab figure that illustrates the algorithm

This example is from walking analysis. It is a completely intuition-based approach; if you know of any mathematical or physical background for it, please do not hesitate to share it.
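Since the question asks for starting points, the 0/1 mask from the windowed threshold test still has to be turned into index ranges. Below is a rough Python sketch of that step under assumptions: the function names and the toy signal are invented here, and a running total replaces the inner loop of the MATLAB version.

```python
def moving_abs_sum(signal, k):
    """Sum of |x| over each window of k samples, computed with a running total."""
    total = sum(abs(x) for x in signal[:k])
    sums = [total]
    for i in range(k, len(signal)):
        total += abs(signal[i]) - abs(signal[i - k])
        sums.append(total)
    return sums


def active_runs(signal, k, threshold):
    """(start, end) pairs of consecutive window indices whose sum exceeds threshold."""
    runs, start = [], None
    for i, s in enumerate(moving_abs_sum(signal, k)):
        if s > threshold and start is None:
            start = i                      # rising edge: a run begins
        elif s <= threshold and start is not None:
            runs.append((start, i))        # falling edge: the run ends
            start = None
    if start is not None:                  # run still open at the end of the data
        runs.append((start, len(signal) - k + 1))
    return runs


# Toy signal (assumed): rest with two bursts at samples 50-69 and 120-139.
signal = [0.0] * 50 + [1.0, -1.0] * 10 + [0.0] * 50 + [1.0, -1.0] * 10 + [0.0] * 50
print(active_runs(signal, k=10, threshold=5.0))  # → [(46, 65), (116, 135)]
```

Each index refers to the first sample of a window, so a detected start leads the true onset by a few samples (46 versus 50 here). How far it leads depends on k and on the threshold.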

Best

“How can I separate the sequences which represent important data from the unimportant ones?” Your question is ill-defined. What is “important” is subjective and not intrinsic to the data.

However, if you want to build a classifier, you can reframe the question as “what is the best conserved subsequence?”. You can answer that with:

[matrixProfile, profileIndex, motifIndex, discordIndex] = interactiveMatrixProfileVer2(ay, 250);

(The code is free at http://www.cs.ucr.edu/~eamonn/MatrixProfile.html )
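To make the idea of a "best conserved subsequence" concrete, here is a brute-force sketch of a matrix profile in plain Python. It is not the code from the page above. The function names, the exclusion zone of half a window, and the toy series are assumptions, and the O(n²·m) loops are only practical for short series. For every subsequence of length m it records the z-normalized Euclidean distance to its nearest non-overlapping neighbour; the smallest entry marks the most repeated pattern (the motif).

```python
import math
import random


def znorm(window):
    """Zero-mean, unit-variance copy of a window (flat windows map to zeros)."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / len(window))
    if std < 1e-12:
        return [0.0] * len(window)
    return [(x - mean) / std for x in window]


def matrix_profile(series, m):
    """For each subsequence of length m: distance to, and index of, its nearest
    neighbour outside an exclusion zone around the subsequence itself."""
    windows = [znorm(series[i:i + m]) for i in range(len(series) - m + 1)]
    exclusion = m // 2   # skip trivial matches that overlap heavily
    profile, index = [], []
    for i, w in enumerate(windows):
        best, best_j = float("inf"), -1
        for j, v in enumerate(windows):
            if abs(i - j) <= exclusion:
                continue
            d = math.sqrt(sum((p - q) ** 2 for p, q in zip(w, v)))
            if d < best:
                best, best_j = d, j
        profile.append(best)
        index.append(best_j)
    return profile, index


# Toy series (assumed): random background with the same pattern planted at 20 and 70.
random.seed(1)
series = [random.gauss(0, 1) for _ in range(120)]
pattern = [math.sin(2 * math.pi * n / 20) for n in range(20)]
series[20:40] = pattern
series[70:90] = pattern

profile, index = matrix_profile(series, m=20)
motif = min(range(len(profile)), key=profile.__getitem__)
print(motif, index[motif])
```

One simplification to be aware of: this sketch maps constant windows to all zeros, so long, perfectly flat rest periods would match each other with distance 0 and could be reported as the motif.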

If you want to segment the data manually (supervised learning): I ran into the same problem, so I created a simple Python matplotlib library: https://github.com/XavierTolza/python-timeseries-segmenter
