Regular Expression to find <>
I have a string:
"Absolutely<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E64> Friendship goals exceeded here!! Sydney, Melbourne, Connecticut & South Carolina<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E61>\r\n"
I want to extract only the <> parts from the string. I tried
<.*>
but it returned
<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E64> Friendship goals exceeded here!! Sydney, Melbourne, Connecticut & South Carolina<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E61>
I don't want the words between them. I want the output to be:
["<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E64>", "<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E61>"]
Any help would be appreciated. I'm stuck on this in Python.
You need a negative lookahead. This pattern matches lazily from a < up to the first > that is not immediately followed by a <:
import re
text = "Absolutely<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E64> Friendship goals exceeded here!! Sydney, Melbourne, Connecticut & South Carolina<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E61>\r\n"
pattern = r"<.*?>(?!<)"  # lazy match that ends at a > not followed by <
print(re.findall(pattern, text))
#['<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E64>', '<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E61>']
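To see why the lookahead is needed, here is a minimal sketch that runs three variants of the pattern on a shortened sample. The `sample` string is made up for illustration and is not the asker's data.

```python
import re

# Hypothetical shortened sample, same shape as the string in the question
sample = "Hi<U+653C><U+3E64> there<U+623C><U+3E61>!"

# Greedy: .* runs to the LAST > in the string, swallowing the words in between
greedy = re.findall(r"<.*>", sample)

# Lazy only: stops at the first >, so every <...> group comes back separately
lazy = re.findall(r"<.*?>", sample)

# Lazy plus negative lookahead: keeps extending while the > is followed by <
runs = re.findall(r"<.*?>(?!<)", sample)

print(greedy)  # ['<U+653C><U+3E64> there<U+623C><U+3E61>']
print(lazy)    # ['<U+653C>', '<U+3E64>', '<U+623C>', '<U+3E61>']
print(runs)    # ['<U+653C><U+3E64>', '<U+623C><U+3E61>']
```

Only the third variant groups consecutive <...> pieces into one match per run.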
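You can use <.*?>(?!<) instead of <.*>. The greedy .* runs on to the last > in the string, while the lazy .*? combined with the lookahead stops at the end of each run of <...> groups.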
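As a possible alternative (not from the answers above), a pattern that forbids < and > inside each group may behave more predictably when the text contains a stray < of its own. The sketch below assumes that; `find_tag_runs` is a hypothetical helper name.

```python
import re

# One or more adjacent <...> groups, where a group may not contain < or >
TAG_RUN = re.compile(r"(?:<[^<>]*>)+")

def find_tag_runs(text):
    """Return each run of consecutive <...> groups found in text."""
    return TAG_RUN.findall(text)

print(find_tag_runs("Hi<U+653C><U+3E64> there<U+623C><U+3E61>!"))
# ['<U+653C><U+3E64>', '<U+623C><U+3E61>']

# A stray < in ordinary text is skipped here, whereas the lazy lookahead
# pattern would start matching at that stray <
print(find_tag_runs("1 < 2 and <U+1>"))         # ['<U+1>']
print(re.findall(r"<.*?>(?!<)", "1 < 2 and <U+1>"))  # ['< 2 and <U+1>']
```

Which pattern fits better depends on whether the real input can contain a bare < or > outside the <U+....> groups.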
This is what you may do:
import re
s = "Absolutely<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E64> Friendship goals exceeded here!! Sydney, Melbourne, Connecticut & South Carolina<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E61>\r\n"
result = re.findall(r'<.*?>(?!<)', s)
print(result)
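If the goal is also to keep the remaining text without the <> parts, the same pattern can probably be reused with `re.sub`. This is a sketch on a made-up `sample` string, and `split_tags` is a hypothetical helper name.

```python
import re

PATTERN = re.compile(r"<.*?>(?!<)")

def split_tags(text):
    """Return (list of <...> runs, text with those runs removed)."""
    tags = PATTERN.findall(text)
    cleaned = PATTERN.sub("", text)
    return tags, cleaned

# Hypothetical shortened sample, not the asker's full string
sample = "Hi<U+653C><U+3E64> there<U+623C><U+3E61>!"
tags, cleaned = split_tags(sample)
print(tags)     # ['<U+653C><U+3E64>', '<U+623C><U+3E61>']
print(cleaned)  # Hi there!
```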