How to perform regex operations on a string tensor in TensorFlow?
How can I perform regex operations on a string tensor? Normally, I would just use a Python string, but when using TensorFlow Serving, I need my input to be a string tensor. So I created a string placeholder and am injecting another layer into the graph that takes the placeholder and prepares it for passing to the model.
I have looked at using py_func, but I still cannot perform pattern operations on a bytes-like object.
Is there any way of performing these operations on a tensor? I cannot do an eval() on the placeholder because the data is only fed in when the SavedModel is loaded and run.
Code I have been using for testing:
import re
import tensorflow as tf

def remove_urls(vTEXT):
    vTEXT = re.sub(r'(https|http)?:\/\/(\w|\.|\/|\?|\=|\&|\%)*\b', 'url', vTEXT, flags=re.MULTILINE)
    return vTEXT

input_string_ph = tf.constant("This is string https:www.someurl.com")
input_string_lower = tf.py_func(lambda x: x.lower(), [input_string_ph], tf.string, stateful=False)
# This is the step that fails: remove_urls receives bytes, not str.
input_string_no_url = tf.py_func(lambda x: remove_urls(x), [input_string_lower], tf.string, stateful=False)

sess = tf.InteractiveSession()
print(input_string_no_url.eval())
It seems that py_func passes the string tensor to the function as a bytes value rather than a str, so inside remove_urls you should decode it first:
def remove_urls(vTEXT):
    vTEXT = vTEXT.decode('utf-8')
    vTEXT = re.sub(r'(https|http)?:\/\/(\w|\.|\/|\?|\=|\&|\%)*\b', 'url', vTEXT, flags=re.MULTILINE)
    return vTEXT
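To illustrate the bytes-versus-str issue on its own, here is a minimal sketch in plain Python with no TensorFlow involved. It shows two ways to handle a bytes value: decoding before using a str pattern, or using a bytes pattern directly. The simplified URL pattern and the helper names remove_urls_decoded and remove_urls_bytes are assumptions made for illustration, not the pattern or names from the question.

```python
import re

# Simplified, illustrative URL patterns (an assumption, not the question's regex).
URL_RE_STR = re.compile(r"https?://[\w./?=&%]*")
URL_RE_BYTES = re.compile(rb"https?://[\w./?=&%]*")


def remove_urls_decoded(value):
    """Decode the bytes value first, then apply a str pattern."""
    return URL_RE_STR.sub("url", value.decode("utf-8"))


def remove_urls_bytes(value):
    """Stay in bytes: both the pattern and the replacement are bytes."""
    return URL_RE_BYTES.sub(b"url", value)


raw = b"this is string https://www.someurl.com"
decoded_result = remove_urls_decoded(raw)  # a str
bytes_result = remove_urls_bytes(raw)      # a bytes object
print(decoded_result)
print(bytes_result)
```

Mixing the two (a str pattern applied to a bytes value) is what raises the "bytes-like object" TypeError, so either variant avoids it as long as the pattern and the input have the same type.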
For example, you can remove a substring from a string and check whether the removal succeeded, using the tf.regex_replace() operator like this:
import tensorflow as tf

text = tf.constant("your string")
sub_str = tf.constant("string")

def not_contains(str1, str2):
    # Remove every match of str2 (treated as a regex pattern) from str1.
    cut1 = tf.regex_replace(str1, str2, "")
    # Split both strings into single bytes and compare the counts.
    split1 = tf.string_split([cut1], "")
    split2 = tf.string_split([str1], "")
    size1 = tf.size(split1)
    size2 = tf.size(split2)
    return tf.equal(size1, size2)

is_not_in = not_contains(text, sub_str)
sess = tf.Session()
sess.run(is_not_in)  # False
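The same operator can also handle the URL replacement from the question entirely inside the graph, which matters for serving because the Python body of a py_func is not serialized with the graph. The following is a rough sketch, assuming a TensorFlow version that provides tf.compat.v1, tf.strings.lower, and tf.strings.regex_replace (roughly 1.14 or later). The simplified URL_PATTERN and the placeholder name input_ph are illustrative assumptions; note that the pattern uses RE2 syntax, not Python's re.

```python
import tensorflow as tf

# Illustrative RE2 pattern (an assumption, simpler than the question's regex).
URL_PATTERN = r"https?://[\w./?=&%]*"

with tf.Graph().as_default():
    # String placeholder standing in for the serving input.
    input_ph = tf.compat.v1.placeholder(tf.string, shape=[None])

    # Both steps are graph ops, so no Python function is needed at run time.
    lowered = tf.strings.lower(input_ph)
    no_urls = tf.strings.regex_replace(lowered, URL_PATTERN, "url")

    with tf.compat.v1.Session() as sess:
        out = sess.run(
            no_urls,
            feed_dict={input_ph: ["This is string https://www.someurl.com"]},
        )

# out is a NumPy array of bytes objects, one per input string.
print(out.tolist())
```

Because the preprocessing consists only of built-in string ops, it should travel with the exported graph, whereas a py_func would need the Python function to be available wherever the model is loaded.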