Extracting ELMo features using TensorFlow and converting them to NumPy

I want to extract sentence embeddings using the ELMo model.

I tried this at first:

import tensorflow as tf
import tensorflow_hub as hub
import numpy as np

elmo_model = hub.Module("https://tfhub.dev/google/elmo/2", trainable=True)

x = ["Hi my friend"]

embeddings = elmo_model(x, signature="default", as_dict=True)["elmo"]


print(embeddings.shape)
print(embeddings.numpy())

It works well until the last line, where I could not convert the result to a NumPy array.

I searched a little and found that putting the following line at the beginning of my code should solve the problem.

tf.enable_eager_execution()

However, after putting this at the beginning of my code, I could no longer run this line:

elmo_model = hub.Module("https://tfhub.dev/google/elmo/2", trainable=True)

I received this error:

 Exporting/importing meta graphs is not supported when eager execution is enabled. No graph exists when eager execution is enabled.

How can I solve my problem? My goal is to obtain sentence features and use them as a NumPy array.

Thanks in advance

TF 2.x

TF2 behaves more like ordinary Python because it defaults to eager execution. However, you should use hub.load to load your model in TF2.

elmo = hub.load("https://tfhub.dev/google/elmo/2").signatures["default"]
x = ["Hi my friend"]
embeddings = elmo(tf.constant(x))["elmo"]

Then you can access the results and convert them to a NumPy array using the numpy method.

>>> embeddings.numpy()
array([[[-0.7205108 , -0.27990735, -0.7735629 , ..., -0.24703965,
         -0.8358178 , -0.1974785 ],
        [ 0.18500198, -0.12270843, -0.35163105, ...,  0.14234722,
          0.08479916, -0.11709933],
        [-0.49985904, -0.88964033, -0.30124515, ...,  0.15846594,
          0.05210422,  0.25386307]]], dtype=float32)

TF 1.x

If using TF 1.x, you should run the operation inside a tf.Session. TF 1.x does not use eager execution by default: you first build the graph and then evaluate the results inside a session.

elmo_model = hub.Module("https://tfhub.dev/google/elmo/2", trainable=True)
x = ["Hi my friend"]
embeddings_op = elmo_model(x, signature="default", as_dict=True)["elmo"]
# required to load the weights into the graph
init_op = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init_op)
    embeddings = sess.run(embeddings_op)

In that case, the result will already be a NumPy array:

>>> embeddings
array([[[-0.72051036, -0.27990723, -0.773563  , ..., -0.24703972,
         -0.83581805, -0.19747877],
        [ 0.18500218, -0.12270836, -0.35163072, ...,  0.14234722,
          0.08479934, -0.11709933],
        [-0.49985906, -0.8896401 , -0.3012453 , ...,  0.15846589,
          0.05210405,  0.2538631 ]]], dtype=float32)
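
Either way, the array holds one 1024-dimensional vector per token, with shape (batch, tokens, 1024). If you want a single fixed-size vector per sentence, one common option is to average the token vectors while ignoring padded positions. Here is a minimal sketch in plain NumPy. The random array is only a stand-in for the real ELMo output, and the function name mean_pool and its lengths argument are my own, not part of TensorFlow Hub.

```python
import numpy as np

def mean_pool(token_embeddings, lengths):
    """Average token vectors per sentence, ignoring padded positions.

    token_embeddings: float array of shape (batch, max_tokens, dim)
    lengths: number of real tokens in each sentence, shape (batch,)
    """
    token_embeddings = np.asarray(token_embeddings)
    lengths = np.asarray(lengths)
    max_tokens = token_embeddings.shape[1]
    # mask[i, j] is True where token j of sentence i is a real token
    mask = np.arange(max_tokens)[None, :] < lengths[:, None]
    # zero out padded positions, then sum over the token axis
    summed = (token_embeddings * mask[:, :, None]).sum(axis=1)
    return summed / lengths[:, None]

# Stand-in for the ELMo output: 2 sentences padded to 3 tokens, 1024 dims
rng = np.random.default_rng(0)
fake_embeddings = rng.normal(size=(2, 3, 1024)).astype(np.float32)

# Assumed token counts: first sentence has 3 tokens, second has 2
sentence_vectors = mean_pool(fake_embeddings, lengths=[3, 2])
print(sentence_vectors.shape)  # (2, 1024)
```

This assumes every sentence has at least one token, otherwise the division fails. As far as I recall, the module's output dictionary also has a "default" entry that the TF Hub page describes as a mean-pooled sentence representation, so check that before rolling your own.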
