Reading binary data in float32

Question

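I want to train a network with TensorFlow on features extracted from a time signal. The data is split into E epochs of 3 seconds each, with F features per epoch. The data therefore has the following form: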

Epoch | Feature 1 | Feature 2 | ... | Feature F |
-------------------------------------------------
1     | ..        | ..        |     | ..        |
      | ..        | ..        |     | ..        |
E     | ..        | ..        |     | ..        |

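To load the data into TensorFlow, I tried to follow the cifar example and use tf.FixedLengthRecordReader. So I took my data and saved it to a binary file of float32 values: first the label of the first epoch, then the F features of the first epoch, then the same for the second epoch, and so on.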
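For reference, a file with this layout could be produced roughly as follows. This is only a sketch using the Python standard library, not the asker's actual code: the helper `write_records`, the file name, the sample values, and `NUM_EPOCHS = 4` are hypothetical, and `NUM_FEATURES = 10` is assumed from the dimensions mentioned later in the question.

```python
import os
import struct
import tempfile

NUM_FEATURES = 10  # assumed: the question later expects 10 feature values
NUM_EPOCHS = 4     # hypothetical number of epochs, for illustration only


def write_records(path, labels, features):
    """Write one record per epoch: label, then features, all little-endian float32."""
    with open(path, "wb") as f:
        for label, row in zip(labels, features):
            # "<" = little-endian without padding, "f" = 4-byte float32
            f.write(struct.pack("<%df" % (1 + len(row)), float(label), *row))


# Made-up sample data
labels = [0, 1, 0, 1]
features = [[0.25 * (e + i) for i in range(NUM_FEATURES)] for e in range(NUM_EPOCHS)]

path = os.path.join(tempfile.mkdtemp(), "epochs.bin")
write_records(path, labels, features)

# Every record has the same fixed length in bytes
record_bytes = 4 * (1 + NUM_FEATURES)
```

Each record is then 4 * (1 + NUM_FEATURES) bytes long, which is the value a fixed-length record reader needs as its record size.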

However, reading it into TensorFlow is proving to be a challenge for me. Here is my code:

def read_data_file(file_queue):

    class DataRecord(object):
        pass

    result = DataRecord()

    #1 float32 as label => 4 bytes
    label_bytes = 4

    #NUM_FEATURES as float32 => 4 * NUM_FEATURES
    features_bytes = 4 * NUM_FEATURES

    #Create the read operator with the summed amount of bytes
    reader = tf.FixedLengthRecordReader(record_bytes=label_bytes+features_bytes)

    #Perform the operation
    result.key, value = reader.read(file_queue)

    #Decode the result from bytes to float32
    value_bytes = tf.decode_raw(value, tf.float32, little_endian=True)

    #Cast label to int for later
    result.label = tf.cast(tf.slice(value_bytes, [0], [label_bytes]), tf.int32)

    #Cast features to float32
    result.features = tf.cast(tf.slice(value_bytes, [label_bytes],
        [features_bytes]), tf.float32)

    print ('>>>>>>>>>>>>>>>>>>>>>>>>>>>')
    print ('%s' % result.label)
    print ('%s' % result.features)
    print ('>>>>>>>>>>>>>>>>>>>>>>>>>>>')

The printed output is:

Tensor("Cast:0", shape=TensorShape([Dimension(4)]), dtype=int32)
Tensor("Slice_1:0", shape=TensorShape([Dimension(40)]), dtype=float32)

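This surprises me: since I have cast the values to float32, I expected the dimensions to be 1 and 10 respectively, which are the actual numbers of values, but they are 4 and 40, which correspond to the byte lengths.

How come?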

Answer 1

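I think the problem stems from the fact that tf.decode_raw(value, tf.float32, little_endian=True) returns a vector of type tf.float32 rather than a vector of bytes. The slice size for extracting the features should be specified as a count of floating-point values (i.e. NUM_FEATURES), not as a count of bytes (features_bytes). The same applies to the slice offset and to the label slice: the label is 1 element, not label_bytes elements.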
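The element-versus-byte distinction can be checked outside TensorFlow. The following is a plain-Python sketch using the standard struct module, not the TensorFlow API. It assumes NUM_FEATURES is 10 and that the label was written as float32, as the question describes; the sample values are made up.

```python
import struct

NUM_FEATURES = 10  # assumed, matching the 10 features mentioned in the question

# One record: a label followed by NUM_FEATURES features, all little-endian float32.
sample_features = [0.5 * i for i in range(NUM_FEATURES)]  # made-up values
record = struct.pack("<%df" % (1 + NUM_FEATURES), 3.0, *sample_features)

# Decoding as float32 yields one element per 4 bytes, not one element per byte.
values = struct.unpack("<%df" % (len(record) // 4), record)

# After decoding, offsets and sizes are counted in elements.
label = int(values[0])                  # element 0 is the label
features = values[1:1 + NUM_FEATURES]   # the next NUM_FEATURES elements
```

The record is 44 bytes long but decodes to 11 values, so slicing it with byte counts (4 and 40) asks for more elements than it should.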

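However, there is a slight wrinkle: your label is an integer, while the rest of the vector contains floating-point values. TensorFlow doesn't have many facilities for converting between binary representations (apart from tf.decode_raw()), so you will have to decode the string twice, into different types: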

# Decode the result from bytes to int32
value_as_ints = tf.decode_raw(value, tf.int32, little_endian=True)
result.label = value_as_ints[0]

# Decode the result from bytes to float32
value_as_floats = tf.decode_raw(value, tf.float32, little_endian=True)
result.features = value_as_floats[1:1+NUM_FEATURES]

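Note that this only works because sizeof(tf.int32) == sizeof(tf.float32), which is not true in general. In the more general case, some more string manipulation tools would be useful for slicing out the appropriate substrings of the raw value. Hopefully this is enough to get you going.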
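To illustrate the more general case, here is a plain-Python sketch (standard struct module, not TensorFlow) in which the label and the features have different widths, so the raw bytes are sliced first and each piece is decoded with its own type. The layout (a 1-byte unsigned label followed by float32 features), NUM_FEATURES = 3, and the sample values are all hypothetical.

```python
import struct

NUM_FEATURES = 3                  # hypothetical, kept small for readability
LABEL_BYTES = 1                   # hypothetical layout: uint8 label
FEATURE_BYTES = 4 * NUM_FEATURES  # float32 features

# Build one record with made-up values; "<" means little-endian, no padding.
record = struct.pack("<B%df" % NUM_FEATURES, 7, 1.5, 2.5, 3.5)

# Slice the raw bytes first: here offsets and sizes ARE byte counts.
label_raw = record[:LABEL_BYTES]
features_raw = record[LABEL_BYTES:LABEL_BYTES + FEATURE_BYTES]

# Then decode each substring with its own type.
label = struct.unpack("<B", label_raw)[0]
features = struct.unpack("<%df" % NUM_FEATURES, features_raw)
```

Byte counts are the right unit before decoding and element counts are the right unit after it. Mixing the two is what produced the 4 and 40 in the question.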

Answer 1
2, accepted, 2016-02-06 02:26:13
