How to impute the incoming missing value?

I have the following data:

    time_s  rpm_motor_1  rpm_motor_2  vibration
0     0.00       7200.0          0.0       0.56
1     0.02       7469.3          0.0       0.58
2     0.04       7774.8          0.0       0.62
3     0.10       8181.8          0.0       0.63
4     0.12       7948.0          0.0       0.60
5     0.14       7982.9          0.0       0.60
6     0.16       7146.3          0.0       0.54
7     0.18       6693.4          0.0       0.48
8     0.20       6389.0          0.0       0.41
9     0.20       6389.0          0.0       0.41
10    0.22       7144.1          0.0       0.0
11    0.24       7251.4          0.0       0.49
12    0.26       7014.1          0.0       0.49
13    0.28       6500.4          0.0       0.40
14    0.30       6261.6          0.0       0.32
15    0.32       6236.0          0.0       0.0
16    0.34       6391.2          0.0       0.40
17    0.36       6953.2          0.0       0.54
18    0.38       7202.0          0.0       0.54
19    0.40       6582.6          0.0       0.40
20    0.42       6967.0          0.0       0.55
21    0.44       6941.0          0.0       0.53
22    0.46       6288.7          0.0       0.40
23    0.48       6219.8          0.0       0.37
24    0.50       6648.6          0.0       0.41
25    0.52       6846.4          0.0       0.46
26    0.54       6571.8          0.0       0.47
27    0.56       7171.3          0.0       0.58
28    0.58       6779.0          0.0       0.51
29    0.60       7021.8          0.0       0.48
30    0.62       6795.6          0.0       0.42
31    0.64       6358.8          0.0       0.40
32    0.66       6917.0          0.0       0.42
33    0.68       6944.0          0.0       0.50
34    0.70       7149.2          0.0       0.0
35    0.72       7381.6          0.0       0.53
36    0.74       7383.5          0.0       0.49
37    0.76       6120.1          0.0       0.37
38    0.78       6185.4          0.0       0.35
39    0.80       6481.2          0.0       0.38
40    0.82       6390.4          0.0       0.31
41    0.84       7136.9          0.0       0.51
42    0.86       6740.2          0.0       0.51
43    0.88       7179.3          0.0       0.58
44    0.90       6910.7          0.0       0.46
45    0.92       6978.7          0.0       0.47
46    0.94       6625.7          0.0       0.46
47    0.96       6515.2          0.0       0.39
48    0.98       6649.9          0.0       0.45
49    1.00       6638.1          0.0       0.47

Some of the vibration values are 0.0.

When rpm is plotted against vibration, this is what it looks like.

[Image: plot of rpm against vibration]

There is a direct correlation between an increase in rpm and an increase in vibration. The values at the bottom of the chart, along the x-axis, are the 0.0 values you see in the data frame.
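Since vibration appears to rise with rpm, one possible way to use that relationship is a straight-line fit of vibration on rpm_motor_1, with the fitted value filling in each 0.0 reading. The sketch below is only an illustration under that assumption: it uses a handful of rows copied from the data above, and the choice of a degree-1 fit with `np.polyfit` is mine, not something stated in the question.

```python
import numpy as np
import pandas as pd

# A few rows copied from the data above; 0.0 marks a missing vibration reading.
df = pd.DataFrame({
    "rpm_motor_1": [6389.0, 7144.1, 7251.4, 7014.1, 6500.4, 6261.6, 6236.0, 6391.2],
    "vibration":   [0.41,   0.0,    0.49,   0.49,   0.40,   0.32,   0.0,    0.40],
})

missing = df["vibration"] == 0.0

# Least-squares line vibration ~ slope * rpm + intercept,
# fitted only on the rows that have a real reading.
slope, intercept = np.polyfit(
    df.loc[~missing, "rpm_motor_1"], df.loc[~missing, "vibration"], deg=1
)

# Replace only the missing readings with the fitted value.
df.loc[missing, "vibration"] = slope * df.loc[missing, "rpm_motor_1"] + intercept
print(df)
```

This fit uses every available row, including later ones. If only earlier readings may be used, the same fit could be restricted to the rows before each gap.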

My approach is to iterate through the data and, when I come across vibration[i] = 0.0, use the data that came before it to make an informed guess. I think a good way to impute this data would be to use KNN, but I am not able to import scikit-learn.
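The idea described above (walk through the rows and guess each 0.0 from earlier data) can be sketched without scikit-learn, using only NumPy. The function name `impute_from_history`, the default `k=3`, and the use of rpm_motor_1 as the only distance feature are all assumptions made for illustration. It is a KNN-style guess, not a drop-in replacement for scikit-learn's imputer.

```python
import numpy as np

def impute_from_history(rpm, vibration, k=3):
    """Replace each 0.0 in `vibration` with the mean vibration of the k
    earlier rows whose rpm is closest to the current row's rpm."""
    rpm = np.asarray(rpm, dtype=float)
    out = np.asarray(vibration, dtype=float).copy()
    for i in range(len(out)):
        if out[i] != 0.0:
            continue
        # Earlier rows only; values imputed earlier in the loop count as history.
        past_rpm, past_vib = rpm[:i], out[:i]
        valid = past_vib != 0.0
        if not valid.any():
            continue  # no history yet, so the 0.0 is left in place
        dist = np.abs(past_rpm[valid] - rpm[i])
        nearest = np.argsort(dist)[:k]
        out[i] = past_vib[valid][nearest].mean()
    return out

# Rows 5 to 10 of the data above; the last vibration reading is missing.
rpm = [7982.9, 7146.3, 6693.4, 6389.0, 6389.0, 7144.1]
vib = [0.60, 0.54, 0.48, 0.41, 0.41, 0.0]
filled = impute_from_history(rpm, vib, k=3)
print(filled)
```

Because the loop only looks at `rpm[:i]`, it works on incoming data one row at a time. The trade-off is that the first few gaps have very little history to draw on.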

If you have a better approach to replacing the 0.0 values, I would love to hear it.

You could use pandas' interpolate to get a linearly interpolated result:

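import numpy as np
# Treat 0.0 vibration readings as missing, then fill them by linear interpolation
df.replace({'vibration': {0.0: np.nan}}, inplace=True)
df.interpolate(inplace=True)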
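As a self-contained illustration of the interpolation approach, here is a small runnable sketch using four rows copied from the question's data. Restricting the operation to the vibration column is a choice made for this example.

```python
import numpy as np
import pandas as pd

# Rows 9 to 12 of the question's data; the second vibration reading is missing.
df = pd.DataFrame({
    "time_s":      [0.20, 0.22, 0.24, 0.26],
    "rpm_motor_1": [6389.0, 7144.1, 7251.4, 7014.1],
    "vibration":   [0.41, 0.0, 0.49, 0.49],
})

# Mark 0.0 as missing, then fill it from the neighboring readings.
df["vibration"] = df["vibration"].replace(0.0, np.nan).interpolate()
print(df)
```

Linear interpolation needs the next valid reading as well as the previous one. If the value has to be filled as soon as it arrives, `ffill()` in place of `interpolate()` would carry the last valid reading forward and use only earlier rows.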

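Notice: technical posts on this site follow the CC BY-SA 4.0 license. If you repost this content, please cite this site's URL or the original address. For any questions, contact: yoyou2525@163.com.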
