Is there a way to parallelize a loop for ensemble learning in Python?
I want to train multiple LightGBM models simultaneously. Right now, I'm training them sequentially like below:
for m in range(ensemble_n):
    params = {'seed': m}
    model = lgb.train(params, lgbtrain)
    prediction = model.predict(test_df.drop([target], axis=1))
    test_predictions[:, m] = prediction
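One way to parallelize the loop is with `concurrent.futures.ThreadPoolExecutor`: each worker trains one ensemble member and returns its prediction column. The sketch below is a minimal, hedged illustration; `train_and_predict` is a stand-in whose body you would replace with the real `lgb.train(...)` / `model.predict(...)` calls (LightGBM's core training loop runs in native code, so threads are not fully serialized by the GIL, and they avoid copying the dataset per worker).

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

ensemble_n = 4   # number of ensemble members (placeholder value)
n_rows = 10      # number of test rows (placeholder value)

def train_and_predict(seed):
    # Stand-in for lgb.train({'seed': seed}, lgbtrain) followed by
    # model.predict(...); replace this body with the real LightGBM calls.
    rng = np.random.default_rng(seed)
    return rng.random(n_rows)  # pretend these are test-set predictions

# Each thread handles one seed; results come back in submission order.
with ThreadPoolExecutor(max_workers=ensemble_n) as pool:
    columns = list(pool.map(train_and_predict, range(ensemble_n)))

# Stack the per-model prediction vectors into the (n_rows, ensemble_n) matrix.
test_predictions = np.column_stack(columns)
print(test_predictions.shape)
```

If the real training is dominated by pure-Python work rather than native code, swapping `ThreadPoolExecutor` for `ProcessPoolExecutor` (or `joblib.Parallel`) gives true multiprocessing, at the memory cost discussed in the answer below.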
Is there a way for me to parallelize the loop above?
Training multiple versions of a model in parallel comes at a cost: you need multiple copies of the data loaded into memory, which can get difficult if you have a sizeable dataset.
At the same time, if you're using the scikit-learn API of LightGBM, you can set the parameter n_jobs=-1, which parallelizes the calculations of a single model across all available cores. This is usually a more efficient use of resources, because you'll have to choose either to train multiple models in parallel or to train a single model in parallel, but not both.