
Extract a string between double quotes

I'm reading a response from a source, such as a journal or an essay, and I have the HTML response as a string like:

According to some, dreams express "profound aspects of personality" (Foulkes 184), though others disagree.

My goal is to extract all of the quotes from the given string and save each of them in a list. My approach was:

[match.start() for m in re.Matches(inputString, "\"([^\"]*)\""))]

Somehow it didn't work for me. Any help with my regex? Thanks a lot.

Provided there are no nested quotes:

re.findall(r'"([^"]*)"', inputString)

Demo:

>>> import re
>>> inputString = 'According to some, dreams express "profound aspects of personality" (Foulkes 184), though others disagree.'
>>> re.findall(r'"([^"]*)"', inputString)
['profound aspects of personality']
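The attempt in the question calls match.start(), which suggests the positions of the quotes may also be wanted. A minimal sketch, assuming that is the goal: re.finditer yields match objects, so both the quoted text and its offset can be collected. The names input_string, pattern, quotes and starts are illustrative only.

```python
import re

input_string = ('According to some, dreams express "profound aspects of '
                'personality" (Foulkes 184), though others disagree.')

# Same pattern as above: an opening quote, any run of non-quote
# characters (captured), then a closing quote.
pattern = re.compile(r'"([^"]*)"')

# finditer yields match objects rather than plain strings, so the
# captured text and the offset of each match are both available.
quotes = [m.group(1) for m in pattern.finditer(input_string)]
starts = [m.start() for m in pattern.finditer(input_string)]

print(quotes)
print(starts)
```

Here m.start() is the index of the opening double quote; m.start(1) would give the index of the first character inside the quotes instead.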

Use this one if your input can contain escaped quotes, for example: some "text \" and text" more

import re

s = '''According to some, dreams express "profound aspects of personality" (Foulkes 184), though others disagree.'''
lst = re.findall(r'"(.*?)(?<!\\)"', s)
print(lst)

The negative lookbehind (?<!\\) checks that the closing " is not immediately preceded by a backslash.
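To see the lookbehind at work, here is a small sketch on an input that actually contains an escaped quote. It also shows one limitation: a quoted string that ends in an escaped backslash is misread, because the lookbehind only inspects a single preceding character. The alternative pattern named robust is not from the answer above; it is a commonly used idiom offered here as an assumption, and the sample strings are made up for illustration.

```python
import re

# Hypothetical input with an escaped quote inside the quoted part.
s = r'some "text \" and text" more'

# Pattern from the answer: the closing quote must not follow a backslash.
lookbehind = re.compile(r'"(.*?)(?<!\\)"')
print(lookbehind.findall(s))

# Alternative sketch: inside the quotes, accept either a character that is
# neither a quote nor a backslash, or a backslash followed by any character.
robust = re.compile(r'"((?:[^"\\]|\\.)*)"')
print(robust.findall(s))

# A quoted string ending in an escaped backslash: the lookbehind version
# skips the real closing quote, while the alternative stops at it.
tricky = r'path "C:\\" and "next"'
print(lookbehind.findall(tricky))
print(robust.findall(tricky))
```

Both patterns return the captured text with the backslashes still in place; removing the escapes would be a separate step.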
