How can I return a dictionary of multiple features, from a tf.data.Dataset, created from generator?
I have a sample dataset that looks like this:
feature_1 feature_2 label
4 5 1
4 3 1
4 6 2
...
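I created a tf.feature_column.embedding_column for each feature (feature_1 and feature_2), so I have to return a dictionary of features from train_input_fn whose keys have the same names as those features. My input function looks like this: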
import tensorflow as tf

def train_input_fn(features, labels, output_types, output_shapes, batch_size, feature_names):
    """
    Provides the data pipeline for the training process.
    :param features: (numpy.array) A numpy array that holds the training features.
    :param labels: (numpy.array) A numpy array that holds the target variable.
    :param output_types: (tuple(tensorflow.DType)) A tuple containing the data type of each component yielded.
    :param output_shapes: (tuple(tensorflow.TensorShape)) A tuple containing the shape of each component yielded.
    :param batch_size: (int) The size of every batch.
    :return: (dict, int) A dictionary of key -> value for every feature and the target label.
    """
    def gen():
        for f, l in zip(features, labels):
            yield f, l

    ds = tf.data.Dataset.from_generator(gen, output_types, output_shapes)
    # If we call repeat without any argument we actually create an infinite loop.
    # That is preferred, we can now control the iterations via epochs.
    ds = ds.repeat().batch(batch_size)
    feature, label = ds.make_one_shot_iterator().get_next()
    return {'feature': feature}, label
How can I return something like this instead:
{'feature_1': x_1, 'feature_2': x_2}
These few changes should do it:
import tensorflow as tf

def train_input_fn(features, labels, output_types, output_shapes, batch_size, feature_names):
    """
    Provides the data pipeline for the training process.
    :param features: (numpy.array) A numpy array that holds the training features.
    :param labels: (numpy.array) A numpy array that holds the target variable.
    :param output_types: (tuple(tensorflow.DType)) A tuple containing the data type of each component yielded.
    :param output_shapes: (tuple(tensorflow.TensorShape)) A tuple containing the shape of each component yielded.
    :param batch_size: (int) The size of every batch.
    :return: (dict, int) A dictionary of key -> value for every feature and the target label.
    """
    def gen():
        for f, l in zip(features, labels):
            yield f, l

    ds = tf.data.Dataset.from_generator(gen, output_types, output_shapes)
    # If we call repeat without any argument we actually create an infinite loop.
    # That is preferred, we can now control the iterations via epochs.
    ds = ds.repeat().batch(batch_size)
    # Dataset.make_one_shot_iterator() is the TF 1.x API.
    feature, label = ds.make_one_shot_iterator().get_next()
    # After batching, feature has shape (batch_size, num_features),
    # so each column is sliced out under its own key.
    return {'feature_1': feature[:, 0], 'feature_2': feature[:, 1]}, label
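As an alternative to slicing the batched tensor, the generator itself could yield a dictionary per example. The sketch below is not part of the answer above; it only shows the generator side in plain Python so it runs without TensorFlow. The helper name make_dict_generator and the sample rows are assumptions for illustration. The commented-out call at the end relies on from_generator accepting nested structures (such as dicts) for output_types and output_shapes, and assumes int64 scalars for every component.

```python
def make_dict_generator(features, labels, feature_names):
    """Return a generator function yielding ({name: value, ...}, label) pairs.

    make_dict_generator is a hypothetical helper, not a TensorFlow API.
    """
    def gen():
        for row, label in zip(features, labels):
            # Pair each column of the row with its feature name.
            yield {name: value for name, value in zip(feature_names, row)}, label
    return gen


# Sample rows taken from the table in the question.
features = [[4, 5], [4, 3], [4, 6]]
labels = [1, 1, 2]
feature_names = ['feature_1', 'feature_2']

gen = make_dict_generator(features, labels, feature_names)
examples = list(gen())
print(examples[0])

# With TensorFlow installed, the generator could then be wired in like this
# (assumption: every feature and the label are int64 scalars):
#
# ds = tf.data.Dataset.from_generator(
#     gen,
#     output_types=({name: tf.int64 for name in feature_names}, tf.int64),
#     output_shapes=({name: [] for name in feature_names}, []),
# )
```

With this layout the dataset elements are already (dict, label) pairs, so the keys follow feature_names instead of being hard-coded in the return statement.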