
Regular Expression to find <>

I have a string:

"Absolutely<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E64> Friendship goals exceeded here!! Sydney, Melbourne, Connecticut &amp; South Carolina<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E61>\r\n"

I want to extract only the <> parts from the string. I tried <.*>, but it returns:

<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E64> Friendship goals exceeded here!! Sydney, Melbourne, Connecticut &amp; South Carolina<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E61>

I don't want the words between them. The output I want is:

 ["<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E64>", "<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E61>"]

Any help is appreciated. I'm stuck on this in Python.

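You need a negative lookahead. With the lazy quantifier, this pattern starts at a < and keeps matching until it finds the first > that is not immediately followed by another <, so each run of adjacent <...> groups comes back as one match.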

import re

text = "Absolutely<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E64> Friendship goals exceeded here!! Sydney, Melbourne, Connecticut &amp; South Carolina<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E61>\r\n"

# Lazy match from "<" up to the first ">" that is not followed by "<"
pattern = r"<.*?>(?!<)"

print(re.findall(pattern, text))
#['<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E64>', '<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E61>']
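To see why the lookahead matters, here is a small sketch comparing the greedy pattern from the question, a plain lazy pattern, and the lazy pattern with the negative lookahead. The `sample` string is a shortened stand-in for the one in the question, not the original data.

```python
import re

# Shortened stand-in for the string in the question (assumed sample data).
sample = "Absolutely<U+653C><U+3E64> Friendship goals<U+383C><U+3E61>\r\n"

# Greedy: runs from the first "<" to the last ">" on the line.
greedy = re.findall(r"<.*>", sample)

# Lazy: stops at the first ">", so every <...> group is a separate match.
lazy = re.findall(r"<.*?>", sample)

# Lazy + negative lookahead: a ">" only ends the match if no "<" follows it,
# so adjacent groups are kept together as one run.
runs = re.findall(r"<.*?>(?!<)", sample)

print(greedy)  # ['<U+653C><U+3E64> Friendship goals<U+383C><U+3E61>']
print(lazy)    # ['<U+653C>', '<U+3E64>', '<U+383C>', '<U+3E61>']
print(runs)    # ['<U+653C><U+3E64>', '<U+383C><U+3E61>']
```

The greedy version reproduces the unwanted result from the question, and the plain lazy version splits the runs apart, which is why neither works on its own here.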

You can use <.*?>(?!<) instead of <.*>.

Here is what you can do:

import re

s = "Absolutely<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E64> Friendship goals exceeded here!! Sydney, Melbourne, Connecticut &amp; South Carolina<U+653C><U+3E64><U+613C><U+3E30><U+623C><U+3E64><U+653C><U+3E64><U+623C><U+3E31><U+383C><U+3E61>\r\n"

# Each run of adjacent <...> groups is returned as one list element
result = re.findall(r'<.*?>(?!<)', s)
print(result)
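One caveat with the lookahead pattern: it starts at any <, so a stray < in the surrounding text can get pulled into a match. If that is a concern, a stricter alternative (not from the answers above, just a sketch) is to match the token shape explicitly. The name `find_token_runs` is hypothetical, and the pattern assumes every token looks like <U+XXXX> with exactly four hex digits, as in the question's string.

```python
import re

# One or more back-to-back <U+XXXX> tokens (four hex digits is an assumption
# based on the tokens shown in the question).
TOKEN_RUN = re.compile(r"(?:<U\+[0-9A-Fa-f]{4}>)+")


def find_token_runs(text):
    """Return every maximal run of consecutive <U+XXXX> tokens in text."""
    # The group is non-capturing, so findall returns the whole match.
    return TOKEN_RUN.findall(text)


# Made-up sample containing a stray "<" before the first run.
sample = "if a < b<U+653C><U+3E64> then<U+383C><U+3E61>\r\n"

strict = find_token_runs(sample)
loose = re.findall(r"<.*?>(?!<)", sample)

print(strict)  # ['<U+653C><U+3E64>', '<U+383C><U+3E61>']
print(loose)   # ['< b<U+653C><U+3E64>', '<U+383C><U+3E61>']
```

For the exact string in the question both approaches should give the same two runs; the stricter pattern only makes a difference when the text itself contains angle brackets.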

