Regex to find <>

Question

I have a string:

"Absolutely<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E64> Friendship goals exceeded here!! Sydney, Melbourne, Connecticut &amp; South Carolina<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E61>\r\n"

I want to extract only the <...> parts from the string. I tried <.*>, but it returned:

<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E64> Friendship goals exceeded here!! Sydney, Melbourne, Connecticut &amp; South Carolina<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E61>

I don't want the words between them. I want the output to be:

 ["<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E64>", "<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E61>"]

Any help would be appreciated. I'm working in Python.

Answer 1

You need a negative lookahead. This pattern matches lazily until it finds the first > that is not followed by a <:

import re

text = "Absolutely<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E64> Friendship goals exceeded here!! Sydney, Melbourne, Connecticut &amp; South Carolina<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E61>\r\n"

pattern = "<.*?>(?!<)"

print(re.findall(pattern, text))
#['<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E64>', '<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E61>']
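To see why the lookahead matters, here is a small sketch comparing three patterns on a shortened sample. The `sample` string is a hypothetical stand-in with the same shape as the string in the question, not the original data.

```python
import re

# Hypothetical, shortened sample shaped like the string in the question.
sample = "Hi<U+653C><U+3E64> some words<U+623C><U+3E61>\r\n"

# Greedy: runs from the first "<" to the last ">" on the line,
# so the words in between are swallowed.
greedy = re.findall(r"<.*>", sample)

# Lazy: stops at the first ">", so every <...> group is returned separately.
lazy = re.findall(r"<.*?>", sample)

# Lazy plus negative lookahead: a ">" only ends the match when the next
# character is not "<", so adjacent groups stay together.
lazy_lookahead = re.findall(r"<.*?>(?!<)", sample)

print(greedy)          # ['<U+653C><U+3E64> some words<U+623C><U+3E61>']
print(lazy)            # ['<U+653C>', '<U+3E64>', '<U+623C>', '<U+3E61>']
print(lazy_lookahead)  # ['<U+653C><U+3E64>', '<U+623C><U+3E61>']
```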

Answer 2

You can use <.*?>(?!<) instead of <.*>.

Here is what you can do:

import re

s = "Absolutely<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E64> Friendship goals exceeded here!! Sydney, Melbourne, Connecticut &amp; South Carolina<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E61>\r\n"

result = re.findall('<.*?>(?!<)',s)
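As an alternative that is not from the answers above, the same result can probably be obtained without a lookahead by matching one or more adjacent <...> groups directly. The sketch below assumes no group contains a nested "<" or ">"; `find_bracket_runs` and `sample` are hypothetical names used only for illustration.

```python
import re

def find_bracket_runs(text):
    """Return each maximal run of adjacent <...> groups found in text."""
    # [^<>]* keeps a single group from crossing into another one;
    # the non-capturing (?:...)+ makes findall return the whole run.
    return re.findall(r"(?:<[^<>]*>)+", text)

# Hypothetical, shortened sample shaped like the string in the question.
sample = "Hi<U+653C><U+3E64> some words<U+623C><U+3E61>\r\n"

runs = find_bracket_runs(sample)
print(runs)  # ['<U+653C><U+3E64>', '<U+623C><U+3E61>']
```

One reason to consider this form: with unbalanced input such as "<a><b words <c>", the lazy lookahead pattern keeps expanding until it reaches a ">" not followed by "<" and so pulls the words into the match, while the character-class version only returns well-formed groups.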

Answer 1
1 Accepted 2017-02-10 16:15:02

Answer 2
0 2017-02-10 16:07:52
