
Data Cleaning with Time Series

I have a data cleaning question. I ran two experiments in a row without turning off the equipment. I want all the data from Experiment 1 to go into one CSV and all the data from Experiment 2 to go into a different CSV. The most obvious demarcation between experiments is a longer gap in time, but unfortunately that gap was never a fixed length. Another possibility is to split the data at the peaks in the tension data and then recombine the pieces... somehow. Does anyone have thoughts on an algorithm that might achieve this? Below is some mock data. The time data is in a pandas DatetimeIndex.

# Experiment 1, Trial 1
DateTimeIndex  Tension
7/25/2020 9:32 0
7/25/2020 9:33 0
7/25/2020 9:34 24
7/25/2020 9:35 100
7/25/2020 9:36 50
7/25/2020 9:37 20
7/25/2020 9:38 0
#Noise
7/25/2020 9:39 -25
7/25/2020 9:40 4
7/25/2020 9:41 11
#Experiment 1: Trial 2
7/25/2020 9:43 2
7/25/2020 9:44 3
7/25/2020 9:45 25
7/25/2020 9:46 150
7/25/2020 9:47 60
7/25/2020 9:48 70
7/25/2020 9:49 2
# Lots and Lots of Noise Between Trials
# Experiment 2: Trial 1
7/25/2020 10:06 0
7/25/2020 10:07 0
7/25/2020 10:08 24
7/25/2020 10:09 100
7/25/2020 10:10 50
7/25/2020 10:11 20
7/25/2020 10:12 -3

You can find the peaks of the signal with SciPy's scipy.signal.find_peaks function. It uses a good heuristic for finding peaks, and you can tune its parameters (for example height, distance, or prominence) to your benefit. After finding the peaks, take their indices and iterate over adjacent pairs to access the individual segments. See the example:

import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import find_peaks

# Example signal: four periods of a sine wave
data = np.sin(np.linspace(0, 8*np.pi))
# Indices of the local maxima
indices = find_peaks(data)[0]
# Add the first and last sample so the end segments are included
indices = np.unique(np.concatenate([[0, data.size-1], indices]))
# Plot each segment between adjacent indices as its own line
for i in range(len(indices) - 1):
  i0, i1 = indices[i: i+2]
  plt.plot(np.arange(i0, i1 + 1), data[i0:i1 + 1])
plt.show()

The output (image not preserved): each segment between adjacent peaks is drawn as its own line. :)
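To connect this back to the question, here is a rough sketch of how the same idea might be applied to the mock tension data to separate the two experiments. It rests on assumptions that are not in the original post: there are exactly two experiments, each trial shows up as one peak with a prominence of at least 50 (a guessed threshold that would need tuning on real data), and the longest pause between consecutive trial peaks marks the boundary between experiments. The names df, split_time, exp1 and exp2 are made up for illustration.

```python
import pandas as pd
from scipy.signal import find_peaks

# Rebuild the mock data from the question (three runs of one-minute samples).
index = pd.date_range("2020-07-25 09:32", periods=10, freq="min").append([
    pd.date_range("2020-07-25 09:43", periods=7, freq="min"),
    pd.date_range("2020-07-25 10:06", periods=7, freq="min"),
])
tension = [0, 0, 24, 100, 50, 20, 0, -25, 4, 11,   # Experiment 1, Trial 1 + noise
           2, 3, 25, 150, 60, 70, 2,               # Experiment 1, Trial 2
           0, 0, 24, 100, 50, 20, -3]              # Experiment 2, Trial 1
df = pd.DataFrame({"Tension": tension}, index=index)

# One peak per trial. prominence=50 is an assumed threshold that ignores
# small bumps in the noise and on the falling edge of a trial.
peaks, _ = find_peaks(df["Tension"].to_numpy(), prominence=50)
peak_times = df.index[peaks]

# Time elapsed since the previous trial peak (the first entry is NaT).
gaps = peak_times.to_series().diff()

# Assumption: the longest pause between trial peaks separates the experiments.
first_peak_exp2 = gaps.idxmax()
split_time = first_peak_exp2 - gaps.max() / 2  # midpoint of that pause

exp1 = df[df.index < split_time]
exp2 = df[df.index >= split_time]

print(len(exp1), "rows in Experiment 1,", len(exp2), "rows in Experiment 2")
```

Each part could then be written out with DataFrame.to_csv, for example exp1.to_csv("experiment_1.csv") (the file name is only a placeholder). With more than two experiments, taking only the largest gap would not be enough. One option would be to split wherever the gap between peaks exceeds some multiple of the typical gap, but that threshold would again have to be chosen by looking at the real data.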
