Why doesn't TF Boosted Trees accept numerical data as input?

Question

For tf.estimator.BoostedTreesClassifier, why are all feature columns required to be of type bucketized or indicator column?

What is the best way to handle both the numerical and the categorical data used by the classifier?

It just seems impossible to work with numerical data. Decision trees are perfect since I don't even need to scale my data.

My code is as follows:

def _parse_record():
    # do something
    return {'feature_1': array[0], 'feature_2': array[190.98]}, label

def input_fn():
    # parse record
    return dataset

feature_cols = []
# Numerical features are passed as plain numeric columns.
for key in numerical_features:
    feature_cols.append(tf.feature_column.numeric_column(key=key))
# Categorical features are hashed into 100 buckets and one-hot encoded.
for key in cat:
    c = tf.feature_column.categorical_column_with_hash_bucket(key=key, hash_bucket_size=100)
    ind = tf.feature_column.indicator_column(c)
    feature_cols.append(ind)

classifier = tf.estimator.BoostedTreesClassifier(
    feature_columns=feature_cols,
    n_batches_per_layer=100,
    n_trees=100,
)

classifier.train(input_fn=input_fn)

However, this gives me:

ValueError: For now, only bucketized_column and indicator column are supported but got: _NumericColumn(key='active_time', shape=(1,), default_value=None, dtype=tf.float32, normalizer_fn=None)

Answer 1

Support for numeric features in tf.estimator.BoostedTreesClassifier was added in TensorFlow v1.13 (source, commit). The first stable release that includes it is v1.13.1.
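
On versions before 1.13, the error message itself suggests a workaround: wrap each numeric column in a bucketized column. The following is a minimal sketch of what bucketizing a numeric feature means, written in plain Python without TensorFlow. The helper names quantile_boundaries and bucketize and the sample values are hypothetical and only illustrate the idea. The resulting list is the kind of value that could be passed as the boundaries argument of tf.feature_column.bucketized_column, with a numeric column as its source column.

```python
import bisect


def quantile_boundaries(values, num_buckets):
    """Return up to num_buckets - 1 strictly increasing cut points,
    taken at evenly spaced ranks of the sorted sample."""
    ordered = sorted(values)
    cuts = []
    for i in range(1, num_buckets):
        # Position of the i-th cut in the sorted sample.
        idx = min(len(ordered) - 1, (i * len(ordered)) // num_buckets)
        cut = float(ordered[idx])
        # Boundaries must be strictly increasing, so skip repeated values.
        if not cuts or cut > cuts[-1]:
            cuts.append(cut)
    return cuts


def bucketize(value, boundaries):
    """Map a float to a bucket index. A value equal to a boundary
    falls into the higher bucket."""
    return bisect.bisect_right(boundaries, value)


# Hypothetical sample of the 'active_time' feature named in the error message.
active_time = [0.5, 3.2, 12.0, 47.5, 88.1, 190.98, 240.0, 512.3]

boundaries = quantile_boundaries(active_time, num_buckets=4)
bucket_ids = [bucketize(v, boundaries) for v in active_time]

print(boundaries)
print(bucket_ids)
```

Because trees only split on thresholds, choosing boundaries from quantiles of the training data keeps the buckets roughly equally populated without requiring any scaling of the raw values. How many buckets to use is a tuning choice, not something the original answer specifies.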

Answer 1: score 2, ACCEPTED, 2019-02-25 10:12:48
