
Tensorflow - Sparse embedding lookup that remains sparse

I'm implementing a text classifier in TensorFlow with a CNN similar to Kim (2014). TensorFlow provides tf.nn.embedding_lookup_sparse , which allows you to provide the word IDs as a sparse tensor. This is nice, especially for handling variable-length sequences. However, this function requires a "combiner" step after the lookup, such as "mean" or "sum", which coerces the result back into dense tensor space. I don't want to do any combining; I want to keep my vectors in the sparse representation so I can run other convolutions afterwards. Is this possible in TF?
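For illustration, this is roughly the behaviour I mean (the vocabulary size, embedding size, IDs, and variable names here are made up, and I'm assuming a recent TensorFlow with eager execution):

```python
import tensorflow as tf

vocab_size, embed_dim = 1000, 8
embeddings = tf.Variable(tf.random.normal([vocab_size, embed_dim]))

# Two variable-length "sentences" of word IDs as a SparseTensor:
# row 0 has three words, row 1 has two.
sp_ids = tf.sparse.SparseTensor(
    indices=[[0, 0], [0, 1], [0, 2], [1, 0], [1, 1]],
    values=tf.constant([12, 45, 7, 3, 99], dtype=tf.int64),
    dense_shape=[2, 3])

# combiner="mean" averages the word vectors of each row, so the output is a
# dense [2, embed_dim] tensor -- one vector per sentence, not one per word.
combined = tf.nn.embedding_lookup_sparse(embeddings, sp_ids, None, combiner="mean")
print(combined.shape)  # (2, 8)
```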

EDIT: I want to avoid padding the input prior to the embedding lookup, because TensorFlow's embedding lookup generates vectors for the pad value, and it's a kludge trying to mask them with zeros (see here).
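For reference, this is roughly the padding-and-masking workaround I'm trying to avoid (the pad ID and names are just illustrative):

```python
import tensorflow as tf

vocab_size, embed_dim, pad_id = 1000, 8, 0
embeddings = tf.Variable(tf.random.normal([vocab_size, embed_dim]))

padded_ids = tf.constant([[12, 45, 7],
                          [ 3, 99, pad_id]])  # second sentence padded to length 3

looked_up = tf.nn.embedding_lookup(embeddings, padded_ids)    # [2, 3, embed_dim]; pad position gets a real vector
mask = tf.cast(tf.not_equal(padded_ids, pad_id), tf.float32)  # [2, 3]
masked = looked_up * tf.expand_dims(mask, -1)                 # zero out the vectors at pad positions
```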

I think there are two points of confusion in the question. First, the combiner operation happens across the set of embedding IDs in each row of the sparse input sp_ids . So if sp_ids has shape N x 1, you are "combining" just one embedding vector per row of sp_ids , which simply retrieves that embedding vector (which is, I think, what you are saying you want).
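As a quick sanity check (the IDs and shapes here are just illustrative), with one ID per row the sparse lookup should match a plain tf.nn.embedding_lookup:

```python
import tensorflow as tf

vocab_size, embed_dim = 1000, 8
embeddings = tf.Variable(tf.random.normal([vocab_size, embed_dim]))

ids = tf.constant([12, 45, 7])  # N = 3 IDs, one per row
sp_ids = tf.sparse.SparseTensor(indices=[[0, 0], [1, 0], [2, 0]],
                                values=tf.cast(ids, tf.int64),
                                dense_shape=[3, 1])

# With a single ID per row, the combiner has nothing to combine.
via_sparse = tf.nn.embedding_lookup_sparse(embeddings, sp_ids, None, combiner="sum")
via_dense = tf.nn.embedding_lookup(embeddings, ids)
print(tf.reduce_max(tf.abs(via_sparse - via_dense)))  # ~0.0
```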

Second, though, the return value is the embedding vector for each row of input. The embedding vector itself is a dense vector, by the very definition of what an embedding is and what the TensorFlow embedding operations compute. So this return result will always be dense, and that's what you want. A sparse matrix representation would be horribly inefficient, since the matrix really is dense (full of dense embeddings), regardless of whether any combiner operation happens.

The research paper you linked does not seem to use any special methodology that would produce sparse embedding vectors, so I don't see a reason here to expect or desire sparse outputs.

Maybe I am incorrect; can you provide more details about why you expect the embedding vectors themselves to be sparse? That would be a highly unusual situation if so.
