
Will one-hot encoded data in H2O affect the model somehow?

I have one-hot encoded my data separately (there are 30 variables, and each main variable has multiple categories). I want to know whether this will affect GBM, GLM, and DRF in H2O. The documentation says that XGBoost internally encodes categorical columns as one-hot. For deep learning models I can perhaps use the use_all_factor_levels parameter, but I cannot find how to stop the implicit one-hot encoding. Or should I leave it as is, since the results will be the same?

I have read the documentation and the tutorial published by amazonaws; maybe I am missing something.

If you have categorical columns, you don't need to encode them. You just need to make sure that each such column is read in as enum and not int. For Deep Learning, if you want to use all factor levels of the categorical columns, set the parameter use_all_factor_levels to True, true, or TRUE in Python, Java, or R respectively.
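As a rough illustration of the advice above, here is a minimal Python sketch. It assumes the one-hot columns are named like color_red and color_blue. The helper collapse_one_hot, the function train_deeplearning, and the csv_path, target, and categorical_cols arguments are hypothetical names made up for this example. The H2O calls used are h2o.import_file with col_types, frame.types, and H2ODeepLearningEstimator with use_all_factor_levels. The first function undoes a manual one-hot encoding in plain Python. The second shows one way to force columns to enum on import and train a Deep Learning model, and it needs a working H2O installation.

```python
def collapse_one_hot(rows, prefix, sep="_"):
    """Merge columns named '<prefix><sep><level>' back into one column '<prefix>'.

    `rows` is a list of dicts. Assumption: exactly one of the one-hot
    columns holds 1 (or "1") in every row.
    """
    out = []
    for row in rows:
        hot_cols = [c for c in row if c.startswith(prefix + sep)]
        active = [c for c in hot_cols if str(row[c]) == "1"]
        if len(active) != 1:
            raise ValueError(f"expected exactly one active level, got {active}")
        # Keep every other column and store the level name as a string.
        new_row = {c: v for c, v in row.items() if c not in hot_cols}
        new_row[prefix] = active[0][len(prefix) + len(sep):]
        out.append(new_row)
    return out


def train_deeplearning(csv_path, target, categorical_cols):
    """Sketch only: read the listed columns as enum and train Deep Learning."""
    # Imported here so the helper above can be used without H2O installed.
    import h2o
    from h2o.estimators import H2ODeepLearningEstimator

    h2o.init()
    # Force the listed columns to be parsed as enum rather than int.
    frame = h2o.import_file(
        csv_path, col_types={col: "enum" for col in categorical_cols}
    )
    for col in categorical_cols:
        if frame.types[col] != "enum":
            raise TypeError(f"column {col} was not parsed as enum")

    predictors = [c for c in frame.columns if c != target]
    # Keep all factor levels instead of dropping one level per column.
    model = H2ODeepLearningEstimator(use_all_factor_levels=True)
    model.train(x=predictors, y=target, training_frame=frame)
    return model


rows = [
    {"id": 1, "color_red": 1, "color_blue": 0},
    {"id": 2, "color_red": 0, "color_blue": 1},
]
collapsed = collapse_one_hot(rows, "color")
```

For a classification problem the target column would also need to be listed in categorical_cols so that it is read as enum.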


 