python - regex with quotes inside of the string


Hi everyone,

I need a hand with the following regex. The string is something like:

# Raw string, so the backslashes are literal characters in the data
mstr = r'value=\"20\" />\r\n\t\r\n<\/div>","whatiwant":"<div id=\"whatiwant\">\r\n\t\r\n\t\t<\/div>","idontwanthat":"<div id=\"idontwanthat\">\r\n\t\r\n\t blablalblalblalbla \t\r\n\t\t\t<\/div>"'

I would like the entire div of "whatiwant". I tried the following:

matches=re.findall(r'\"whatiwant\":\"(.+?)\":\"',mstr)

PS: there can be other divs nested inside the div.

Any help is appreciated.

Try using a positive lookahead:

\"whatiwant\":.*(?=,\".*?\"\:)
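As a rough sketch of how the lookahead pattern behaves, here it is applied to the sample string from the question. This assumes the data contains literal backslashes (hence the raw string) and no real newlines. Note that the match includes the "whatiwant": key and the surrounding quotes, and that the greedy .* runs up to the last ,"key": pair in the input, which for this sample is the one for "idontwanthat".

```python
import re

# Sample data from the question; a raw string keeps the backslashes literal (assumption).
mstr = r'value=\"20\" />\r\n\t\r\n<\/div>","whatiwant":"<div id=\"whatiwant\">\r\n\t\r\n\t\t<\/div>","idontwanthat":"<div id=\"idontwanthat\">\r\n\t\r\n\t blablalblalblalbla \t\r\n\t\t\t<\/div>"'

# Match from the key up to (but not including) the next ,"key": pair.
lookahead = re.compile(r'\"whatiwant\":.*(?=,\".*?\"\:)')

m = lookahead.search(mstr)
matched = m.group(0) if m else None
print(matched)
```

If only the value is wanted, the key prefix and the outer quotes still have to be stripped from the matched text.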


"whatiwant":"(.*?[^\\])??"

This matches the literal "whatiwant": and then anything (even an empty string) inside the double quotes that follow. The [^\\] before the closing quote keeps the match from stopping at an escaped quote (\").

If you want to extract the div's HTML code, you can retrieve the first group's value:

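import re
# With a single capture group, re.findall returns a list of strings, not match objects.
matches = re.findall(r'"whatiwant":"(.*?[^\\])??"', mstr)
for html in matches:
    print(html)  # the div's HTML, or '' when the value is empty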
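For a self-contained check, the sketch below runs the escape-aware pattern on the question's sample string using re.finditer, which yields match objects so that group(1) is available. It assumes the backslashes in the data are literal (raw string). The name extract_values is a hypothetical helper introduced only for illustration.

```python
import re

# Sample data from the question; a raw string keeps the backslashes literal (assumption).
mstr = r'value=\"20\" />\r\n\t\r\n<\/div>","whatiwant":"<div id=\"whatiwant\">\r\n\t\r\n\t\t<\/div>","idontwanthat":"<div id=\"idontwanthat\">\r\n\t\r\n\t blablalblalblalbla \t\r\n\t\t\t<\/div>"'

pattern = re.compile(r'"whatiwant":"(.*?[^\\])??"')

def extract_values(text):
    """Hypothetical helper: return every "whatiwant" value found in text."""
    # group(1) is None when the value is empty, so fall back to ''.
    return [m.group(1) or '' for m in pattern.finditer(text)]

htmls = extract_values(mstr)
empty = extract_values('"whatiwant":""')
print(htmls)
print(empty)
```

The extracted value still contains the escape sequences (\" and \/) exactly as they appear in the input. One limitation of the [^\\] approach is that a value ending in an escaped backslash directly before the closing quote would not be matched correctly.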
