
How can I get around Keras pad_sequences() rounding float values to zero?

So I have a text classification model built with Keras. I've been trying to pad my varying-length sequences, but the Keras function pad_sequences() just returns zeros.

I've figured out that if you pass a list of integer sequences like the one below, it works just fine. But once the elements are floats or decimals, as in the second example, the output turns to all zeros.

from keras.preprocessing.sequence import pad_sequences

x = [[1, 2], [3, 4, 5], [4], [7, 8, 9, 10]]
print(pad_sequences(x, padding='post'))

outputs:

[[ 1  2  0  0]
 [ 3  4  5  0]
 [ 4  0  0  0]
 [ 7  8  9 10]]

But

x = [[.1, .2], [.3, .4, .5], [.4], [.7, .8, .9, .010]]
print(pad_sequences(x, padding='post'))

outputs:

[[ 0  0  0  0]
 [ 0  0  0  0]
 [ 0  0  0  0]
 [ 0  0  0  0]]

And this:

x = [[.1, .2], [.3, .4, .5], [.4], [.7, .8, .9, .010]]
print(pad_sequences(x, padding='post', value=99))

outputs:

[[ 0  0 99 99]
 [ 0  0  0 99]
 [ 0 99 99 99]
 [ 0  0  0  0]]

So I guess this function just ignores floats/decimals. Is there a way I can get around this?

This is caused by the fact that the default data type of the pad_sequences function is int32. Therefore, all values are cast to integers (and in this case become zero). To resolve this, pass the dtype='float32' argument:

pad_sequences(x, padding='post', value=99, dtype='float32')
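
For reference, here is a minimal, self-contained sketch of the fix. It assumes pad_sequences is importable from keras.preprocessing.sequence; in recent TensorFlow versions the same function is also available as tf.keras.preprocessing.sequence.pad_sequences.

from keras.preprocessing.sequence import pad_sequences

x = [[.1, .2], [.3, .4, .5], [.4], [.7, .8, .9, .010]]

# Default dtype is int32, so the float values are truncated to 0.
print(pad_sequences(x, padding='post'))

# With dtype='float32' the values are preserved and 99 fills the padding.
print(pad_sequences(x, padding='post', value=99, dtype='float32'))

The second call should print roughly:

[[ 0.1   0.2  99.   99.  ]
 [ 0.3   0.4   0.5  99.  ]
 [ 0.4  99.   99.   99.  ]
 [ 0.7   0.8   0.9   0.01]]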

