Why doesn't TF Boosted Trees accept numerical data as input?
For tf.estimator.BoostedTreesClassifier, why are all feature columns required to be of type bucketized or indicator column?
What is the best way to handle both the numerical and the categorical data used by the classifier?
It just seems impossible to work with numerical data. Decision trees are perfect since I don't even need to scale my data.
My code is as follows:
import tensorflow as tf

def _parse_record():
    # do something
    return {'feature_1': array[0], 'feature_2': array[1]}, label

def input_fn():
    # parse record
    return dataset

feature_cols = []
# Numerical features are passed as plain numeric columns
for name in numerical_features:
    feature_cols.append(tf.feature_column.numeric_column(key=name))
# Categorical features are hashed, then one-hot encoded
for name in cat:
    c = tf.feature_column.categorical_column_with_hash_bucket(key=name, hash_bucket_size=100)
    ind = tf.feature_column.indicator_column(c)
    feature_cols.append(ind)

classifier = tf.estimator.BoostedTreesClassifier(
    feature_columns=feature_cols,
    n_batches_per_layer=100,
    n_trees=100,
)
classifier.train(input_fn=input_fn)
However, this gives me:
ValueError: For now, only bucketized_column and indicator column are supported but got: _NumericColumn(key='active_time', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None)
Support for numeric features in tf.estimator.BoostedTreesClassifier was added in TensorFlow v1.13 (source, commit). The first stable release is v1.13.1.
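On versions before v1.13, the error message itself points at a workaround: wrap each numeric column in a bucketized column. Below is a minimal sketch of how the bucket boundaries could be derived, using only the standard library. `quantile_boundaries` and `bucket_index` are hypothetical helper names, not TensorFlow APIs, and choosing boundaries by quantiles is an assumption rather than something the question or the answer prescribes. The resulting list is the kind of value one would pass as `boundaries` to `tf.feature_column.bucketized_column(source_column, boundaries)`.

```python
import bisect
import statistics

def quantile_boundaries(values, n_buckets):
    """Return sorted, de-duplicated cut points that split `values`
    into roughly equal-sized buckets (hypothetical helper)."""
    cuts = statistics.quantiles(values, n=n_buckets)
    # De-duplicate so the boundaries are strictly increasing
    return sorted(set(cuts))

def bucket_index(value, boundaries):
    """Map a value to its bucket: index 0 is (-inf, b0), index i is
    [b(i-1), b(i)), and the last index is [b_last, +inf)."""
    return bisect.bisect_right(boundaries, value)

# Hypothetical sample of one numeric feature
sample = [float(i) for i in range(1, 101)]
boundaries = quantile_boundaries(sample, n_buckets=4)
```

In the question's loop, each `numeric_column` would then be wrapped in a bucketized column with boundaries computed from a sample of that feature. Since trees only split on thresholds, a reasonably fine set of boundaries should lose little information, though the right number of buckets depends on the data.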