Forecasting on each group in a Pandas dataframe
I have the following dataframe:
Year_Month Country Type Data
2019_01 France IT 20
2019_02 France IT 30
2019_03 France IT 40
2019_01 France AT 10
2019_02 France AT 15
2019_03 France AT 20
I want to forecast Year_Month "2019_04" separately for the France & IT and France & AT combinations.
So, for example, I should get results like the following:
Forecast (France, IT):
Year_Month Country Type Data
2019_04 France IT 50
Forecast (France, AT):
Year_Month Country Type Data
2019_04 France AT 25
How should I design the loop so that a function containing the model runs once for each combination and saves the output?
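Although your question leaves a lot open (which model do you want to use for the forecast? how far into the future do you want to predict? ...), you can start with sklearn.linear_model from scikit-learn and compute a prediction for each type: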
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression

# Generate data from the example
df = pd.DataFrame({
    'Year_Month': {0: '2019_01', 1: '2019_02', 2: '2019_03', 3: '2019_01', 4: '2019_02', 5: '2019_03'},
    'Country': {0: 'France', 1: 'France', 2: 'France', 3: 'France', 4: 'France', 5: 'France'},
    'Type': {0: 'IT', 1: 'IT', 2: 'IT', 3: 'AT', 4: 'AT', 5: 'AT'},
    'Data': {0: 20, 1: 30, 2: 40, 3: 10, 4: 15, 5: 20}})
# Encode '2019_01' as the integer 201901 so the regressor gets numeric input
# (fine within a single year; the spacing jumps at a year boundary)
df['Month_Num'] = df['Year_Month'].str.replace('_', '').astype(int)

# Generate our empty regressor to fit the trend.
regressor = LinearRegression()
result = {}
# loop on every type
for t in df['Type'].unique():
    # slice
    df_slice = df[df['Type'] == t]
    # train the regressor
    regressor.fit(X=df_slice['Month_Num'].to_numpy().reshape(-1, 1), y=df_slice['Data'])
    # predict new values
    result[t] = {'predicted_value': regressor.predict(np.array([201904]).reshape(-1, 1))}
# build dataframe with all your results
final_df = pd.DataFrame(result)
#                      IT      AT
# predicted_value  [50.0]  [25.0]
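The loop above slices on Type only, while the question asks for one forecast per (Country, Type) combination in the original column layout. A minimal sketch of that variant follows. It is not the answerer's code: it swaps in np.polyfit for LinearRegression, uses a 0, 1, 2, ... step index as the time axis so a year boundary does not distort the spacing, and assumes every group has consecutive months with no gaps. The helper name forecast_next_month is hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Year_Month": ["2019_01", "2019_02", "2019_03"] * 2,
    "Country": ["France"] * 6,
    "Type": ["IT"] * 3 + ["AT"] * 3,
    "Data": [20, 30, 40, 10, 15, 20],
})

def forecast_next_month(group):
    """Hypothetical helper: fit a straight line to one group, extrapolate one month."""
    group = group.sort_values("Year_Month")
    # Assumption: months are consecutive, so 0, 1, 2, ... is a valid time axis.
    steps = np.arange(len(group))
    slope, intercept = np.polyfit(steps, group["Data"].to_numpy(dtype=float), deg=1)
    # Let pandas do the calendar arithmetic for the next month label.
    last = pd.Period(group["Year_Month"].iloc[-1].replace("_", "-"), freq="M")
    return (last + 1).strftime("%Y_%m"), slope * len(group) + intercept

rows = []
# One iteration per (Country, Type) pair; sort=False keeps first-seen order.
for (country, type_), group in df.groupby(["Country", "Type"], sort=False):
    year_month, value = forecast_next_month(group)
    rows.append({"Year_Month": year_month, "Country": country,
                 "Type": type_, "Data": value})

forecast = pd.DataFrame(rows)
print(forecast)
```

Collecting one dictionary per group and building the dataframe once at the end keeps the output in the same Year_Month / Country / Type / Data shape as the input, so it can be concatenated with the original data if needed.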
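Thanks, what worked for me is:
from hyperopt import fmin, tpe, Trials

# data(), create_model_hypopt and search_space are helpers defined elsewhere
comboList = list(zip(Map['country'], Map['type']))
all_losses = []
for i, combo in enumerate(comboList):
    print(combo)
    subset = Data[(Data['country'] == combo[0]) & (Data['type'] == combo[1])]
    subset = subset[['Data']]
    x_train_ts, y_train_ts, x_test_ts, y_test_ts = data(subset, 10, 1)
    trials = Trials()
    best = fmin(create_model_hypopt,
                space=search_space,
                algo=tpe.suggest,
                max_evals=1,
                trials=trials)
    # keep the losses of this combination (appending the list to itself would not do that)
    all_losses.append(trials.losses())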
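For saving the output of every combination, one option is a dictionary keyed by the (country, type) pair instead of a single list that is overwritten on each pass. The sketch below illustrates only that bookkeeping pattern: fit_and_score is a hypothetical stand-in for the tuning step (it scores a naive "previous value" forecast), and the dataframe name data_df and its columns are assumptions made for the example.

```python
import pandas as pd

data_df = pd.DataFrame({
    "country": ["France"] * 6,
    "type": ["IT"] * 3 + ["AT"] * 3,
    "Data": [20, 30, 40, 10, 15, 20],
})

def fit_and_score(values):
    """Hypothetical stand-in for model tuning: returns a list of losses."""
    naive = values.shift(1)  # predict each point with the previous one
    return [float((values - naive).abs().mean())]

losses_by_combo = {}
combos = list(zip(data_df["country"], data_df["type"]))
# dict.fromkeys drops duplicate pairs while keeping first-seen order.
for combo in dict.fromkeys(combos):
    mask = (data_df["country"] == combo[0]) & (data_df["type"] == combo[1])
    losses_by_combo[combo] = fit_and_score(data_df.loc[mask, "Data"])

print(losses_by_combo)
```

Keying by the pair makes it straightforward to look up or tabulate the result of any single combination afterwards.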