Simple linear regression of two dataframes in Python
I have two dataframes with a datetime index. The first dataframe may contain NaN values, while the second does not.
data1['A']
2019-06-01 00:00:00 NaN
2019-06-01 01:00:00 NaN
2019-06-01 02:00:00 NaN
2019-06-01 03:00:00 NaN
2019-06-01 04:00:00 NaN
...
2019-06-30 19:00:00 14.086600
2019-06-30 20:00:00 14.101033
2019-06-30 21:00:00 14.160733
2019-06-30 22:00:00 13.940633
2019-06-30 23:00:00 13.989567
Freq: H, Name: A, Length: 720, dtype: float64
data2['B']
2019-06-01 00:00:00 243.168989
2019-06-01 01:00:00 243.104673
2019-06-01 02:00:00 242.571222
2019-06-01 03:00:00 240.685214
2019-06-01 04:00:00 242.652392
...
2019-06-30 19:00:00 243.611821
2019-06-30 20:00:00 243.338931
2019-06-30 21:00:00 243.296361
2019-06-30 22:00:00 243.676107
2019-06-30 23:00:00 243.507886
Name: B, Length: 720, dtype: float64
How can I fit a simple linear regression model on these two dataframes, using only the datetimes at which both have a value (no NaN)? Thanks for the help!
You can use LinearRegression from scikit-learn:
https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html
You first have to merge the two dataframes into a single dataframe (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html), then fit scikit-learn's LinearRegression on the merged dataframe.
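The merge-then-fit suggestion could look roughly like the sketch below. It assumes both frames share an hourly datetime index; the index and the sample values are made up for illustration (reusing a few numbers printed in the question), and the choice of 'A' as predictor and 'B' as target is also an assumption.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical hourly index standing in for the real one
idx = pd.date_range("2019-06-01", periods=10, freq=pd.Timedelta(hours=1))

data1 = pd.DataFrame(
    {"A": [np.nan] * 5 + [14.086600, 14.101033, 14.160733, 13.940633, 13.989567]},
    index=idx,
)
data2 = pd.DataFrame(
    {"B": [243.168989, 243.104673, 242.571222, 240.685214, 242.652392,
           243.611821, 243.338931, 243.296361, 243.676107, 243.507886]},
    index=idx,
)

# Align the two frames on their datetime index, then drop rows with any NaN
merged = data1.merge(data2, left_index=True, right_index=True, how="inner").dropna()

# scikit-learn expects a 2-D X (hence the double brackets) and a 1-D y
model = LinearRegression().fit(merged[["A"]], merged["B"])
slope, intercept = model.coef_[0], model.intercept_
```

Merging on the index means rows are matched by timestamp rather than by position, so the approach should still work if the two frames do not cover exactly the same datetimes.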
You can try something like this. You did not specify which variable to regress on which, so in the example below I let x be 'A' and y be 'B':
from sklearn import linear_model
import pandas as pd
import numpy as np

# Sample data: five NaNs followed by five values, so both frames have 10 rows
data1 = pd.DataFrame({'A': [np.nan, np.nan, np.nan, np.nan, np.nan,
                            14.086600, 14.101033, 14.160733, 13.940633, 13.989567]})
data2 = pd.DataFrame({'B': [243.168989, 243.104673, 242.571222, 240.685214, 242.652392,
                            243.611821, 243.338931, 243.296361, 243.676107, 243.507886]})

# Keep only the rows where both columns hold a finite value
is_finite = np.isfinite(data1['A']) & np.isfinite(data2['B'])

mdl = linear_model.LinearRegression()
# X must be 2-D (hence ['A'] in a list); y can be 1-D
mdl.fit(data1.loc[is_finite, ['A']], data2.loc[is_finite, 'B'])
mdl.coef_  # slope of B regressed on A
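As a quick sanity check of the masking approach, here is a small sketch on synthetic data where the true relationship is known (y = 2x + 1 wherever x is present). The series and their values are invented for illustration and are not the asker's data.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic data: x has gaps; y follows 2*x + 1 on the rows where x is present
x = pd.Series([np.nan, 1.0, 2.0, np.nan, 3.0, 4.0], name="A")
y = pd.Series([0.0, 3.0, 5.0, 7.0, 7.0, 9.0], name="B")

# Boolean mask of rows where both series have a value
mask = x.notna() & y.notna()

X = x[mask].to_frame()          # 2-D feature matrix with one column
mdl = LinearRegression().fit(X, y[mask])

slope = mdl.coef_[0]            # fitted slope
intercept = mdl.intercept_      # fitted intercept
r2 = mdl.score(X, y[mask])      # coefficient of determination on the fitted rows
```

Besides coef_, the fitted model exposes intercept_ and a score method, which together describe the regression line and how well it fits the rows that were kept.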